Hi Doug,
  In our use of Hypertable, memory usage is too high. We tested it
and found that the major problem lay in the CellCache. The data
below is from the Google heap profiler:

<Test Environment: 16GB RAM, Intel(R) Xeon(R) [EMAIL PROTECTED] * 4, RHEL
AS4U3>

  Function (during execution)                                     Memory Usage
  Hypertable::CellCache::add                                             75.6%
  __gnu_cxx::new_allocator::allocate                                     18.8%
  Hypertable::DynamicBuffer::grow                                         4.1%
  Hypertable::IOHandlerData::handle_event                                 1.0%
  Hypertable::BlockCompressionCodecLzo::BlockCompressionCodecLzo          0.5%
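
  For reference, a profile like the one above can be collected by linking
against tcmalloc and driving the workload with the gperftools heap profiler
enabled. Below is a minimal, self-contained sketch; the workload function is
only a stand-in for the real RangeServer insert path, and in newer gperftools
releases the header lives at <gperftools/heap-profiler.h>:

    #include <google/heap-profiler.h>  // newer gperftools: <gperftools/heap-profiler.h>
    #include <cstddef>
    #include <vector>

    // Stand-in workload: lots of small per-cell allocations, which is
    // what dominates the profile above. Link with -ltcmalloc.
    static void run_insert_workload() {
        std::vector<char *> cells;
        for (size_t i = 0; i < 100000; ++i)
            cells.push_back(new char[64]);
        for (size_t i = 0; i < cells.size(); ++i)
            delete [] cells[i];
    }

    int main() {
        HeapProfilerStart("/tmp/rangeserver");  // writes /tmp/rangeserver.0001.heap, ...
        run_insert_workload();
        HeapProfilerDump("after inserts");      // snapshot at a known point
        HeapProfilerStop();
        return 0;
    }

  The dumps can then be summarized per function with pprof, e.g.
"pprof --text ./rangeserver /tmp/rangeserver.0001.heap", which yields
percentages like those above.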

  We found that the main problem lies in the CellCache (the second entry,
"allocate", is called by the CellMap, which is also part of the CellCache).
After a long period of inserting data, memory usage stays at a very high
level, even though we expected it to be freed after compaction. In our
ten-server cluster, one range (in this case we set only one AccessGroup
per table) used about 32MB, and the memory was never freed.
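
  To make the pattern concrete, here is a much-simplified sketch of this
kind of per-cell allocation (illustrative only, not the actual Hypertable
source): each add() makes one small heap allocation for the cell data plus
one map-node allocation inside the CellMap (the new_allocator::allocate
entry above). After compaction these get freed one by one, and the allocator
rarely returns such small fragmented blocks to the OS.

    #include <cstddef>
    #include <cstring>
    #include <map>
    #include <string>

    // Stand-in for Hypertable's CellMap.
    typedef std::map<std::string, const unsigned char *> CellMap;

    // One small new[] per cell; the map insert below also allocates a
    // node through __gnu_cxx::new_allocator.
    void cell_cache_add(CellMap &cells, const std::string &key,
                        const unsigned char *value, size_t value_len) {
        unsigned char *buf = new unsigned char[value_len];
        std::memcpy(buf, value, value_len);
        cells[key] = buf;
    }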

  After some tests and experiments, we implemented a memory pool for the
CellCache. After about a week of testing, it works well and efficiently.
In the same cluster mentioned above, each range uses only about 1.2MB on
average, very shortly after inserting completes.
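
  The core of the idea is roughly as follows; this is a simplified sketch,
not the patch itself (the block size and most names are illustrative, though
get_memory matches the profile further down). Cells are carved out of large
blocks with a bump pointer, and when a CellCache is dropped after compaction,
the pool hands whole blocks back to the system instead of freeing cells one
by one:

    #include <cstddef>
    #include <vector>

    class CellCachePool {
    public:
        explicit CellCachePool(size_t block_size = 1 << 20)  // 1MB blocks (illustrative)
            : m_block_size(block_size), m_cur(0), m_remaining(0) {}

        ~CellCachePool() { free_all(); }

        // Bump-pointer allocation: one new[] per block instead of per cell.
        // (Alignment is ignored here for brevity; real code must handle it.)
        void *get_memory(size_t len) {
            if (len > m_remaining) {  // current block exhausted; its tail is wasted
                size_t sz = (len > m_block_size) ? len : m_block_size;
                m_blocks.push_back(new char[sz]);
                m_cur = m_blocks.back();
                m_remaining = sz;
            }
            void *p = m_cur;
            m_cur += len;
            m_remaining -= len;
            return p;
        }

        // Called when the CellCache is dropped after compaction:
        // the whole pool goes back to the system at once.
        void free_all() {
            for (size_t i = 0; i < m_blocks.size(); ++i)
                delete [] m_blocks[i];
            m_blocks.clear();
            m_cur = 0;
            m_remaining = 0;
        }

    private:
        size_t m_block_size;
        char *m_cur;
        size_t m_remaining;
        std::vector<char *> m_blocks;
    };

  Individual cells are never freed, which is safe here because a CellCache
is effectively append-only and is discarded as a whole once its data has
been compacted, which is exactly when we want the memory back. In the patch
we also back the CellMap with the same pool; that is the difference between
the two curves in the image below.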

  We compared it with the standard version on a single server. With the
standard version, whether tcmalloc is used or not (tcmalloc helps somewhat;
it reduces usage by about 30%), the memory usage never falls. In contrast,
the pool version's memory usage drops quickly once inserting is done.
  In the comparison, we inserted about 11GB of data into Hypertable
(about 33 ranges after parsing and inserting). The memory usage during
this process can be seen here (the image and patch are uploaded in the
"Files" section of this group):
http://hypertable-dev.googlegroups.com/web/RS%20Mem-Usage%20Comparation.jpg?hl=en&gsc=VfOkFRYAAABddvouDpQ7Of_lV57X48Dk57an5Fe8QJeePd7zpGv9tg
  The purple curve uses our pool for both the <key, value> pairs and the
CellMap; the yellow one uses it only for the <key, value> pairs. As the
image shows, the pool version's memory usage is excellent.
  The patch is available at
http://groups.google.com/group/hypertable-dev/web/mem-pool.patch.tgz?hl=en

  We ran the Google heap profiler against the pool version and got the
following data:

  Function (during execution)                                     Memory Usage
  CellCachePool::get_memory                                              94.3%
  Hypertable::DynamicBuffer::grow                                         3.8%
  Hypertable::BlockCompressionCodecLzo::BlockCompressionCodecLzo          1.1%
  Hypertable::IOHandlerData::handle_event                                 0.5%


  By the way, in our tests the RangeServer crashed when we set
Hypertable.RangeServer.MaintenanceThreads=4. We tested 0.9.0.11 and
0.9.0.12; both have this problem, and this week we plan to run more
tests on it.

  We hope this can help you.

  Best wishes.
