Hi Doug,

The multiple-maintenance-thread crash is easy to reproduce: just set
Hypertable.RangeServer.MaintenanceThreads=2, start all the servers
locally on a single node, and run random_write_test 10000000000. The
range server will crash within a minute, but the cause is hard to
track down.

What we know so far:
1. The bug was introduced in version 0.9.0.11; earlier versions don't
have this problem.
2. According to RangeServer.log, the crash usually happens when two
adjacent ranges are being split by two maintenance threads
concurrently. If we forbid this behavior by modifying the
MaintenanceTaskQueue code, the crash goes away, but we don't know
why. (Phoenix discovered this.)
3. Sometimes the range server fails at HT_EXPECT(m_immutable_cache_ptr,
Error::FAILED_EXPECTATION) in AccessGroup::run_compaction().
m_immutable_cache_ptr is set to 0 in multiple places with m_mutex
held, but it is not always checked with the lock held, which looks
suspicious; see the sketch below.
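
In other words, the suspect pattern looks roughly like this (a
compilable schematic, not the actual AccessGroup code; the stand-in
definitions and the drop_immutable_cache() name are made up for
illustration):

    #include <cassert>
    #include <boost/thread/mutex.hpp>

    struct CellCache { /* stand-in for the real class */ };
    #define HT_EXPECT(cond, code) assert(cond)  // real macro throws `code`

    struct AccessGroupSketch {
      boost::mutex  m_mutex;
      CellCache    *m_immutable_cache_ptr;

      AccessGroupSketch() : m_immutable_cache_ptr(0) {}

      void run_compaction() {
        // Checked WITHOUT m_mutex held: another maintenance thread can
        // zero the pointer between this check and the later use, so the
        // expectation only fires under unlucky timing, e.g. when two
        // adjacent ranges are being split concurrently.
        HT_EXPECT(m_immutable_cache_ptr, Error::FAILED_EXPECTATION);
        // ... merge the immutable cache into a new CellStore ...
      }

      void drop_immutable_cache() {  // one of the clearing paths
        boost::mutex::scoped_lock lock(m_mutex);
        m_immutable_cache_ptr = 0;   // always cleared under lock
      }
    };

Holding the lock across the check (or re-checking under the lock) would
at least tell us whether that is the window.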

Do you have any idea based on these facts?

Donald


On Nov 25, 2:46 am, "Doug Judd" <[EMAIL PROTECTED]> wrote:
> Hello Phoenix,
>
> Thank you!!!  This is fantastic.  Luke wrote a paged memory allocator
> (src/cc/Common/CharArena.cc/h) and it's been on my todo list to integrate
> it.  I will merge his work with yours and try to get it into the next
> release.
>
> BTW, if you could figure out the multiple maintenance thread crash, that
> would be very much appreciated.  Thanks again.
>
> - Doug
>
> 2008/11/24 Phoenix <[EMAIL PROTECTED]>
>
> > Sorry, the patch above has a small problem, so I uploaded an earlier
> > version. This one
>
> >http://hypertable-dev.googlegroups.com/web/mem-pool.patch?hl=en&gsc=-...
> > is OK.
>
> > On Nov 24, 10:00 pm, Phoenix <[EMAIL PROTECTED]> wrote:
> > > Hi Doug,
> > >   In our use of Hypertable, its memory usage is too high. We tested
> > > it and found that the major problem lay in the CellCache. The data
> > > below is from the google heap profiler:
>
> > > <Test Environment: 16GB Mem, Intel(R) Xeon(R) [EMAIL PROTECTED] * 4, rhel
> > > as4u3>
>
> > >   Function (during execution)                                     Memory Usage
> > >   Hypertable::CellCache::add                                             75.6%
> > >   __gnu_cxx::new_allocator::allocate                                     18.8%
> > >   Hypertable::DynamicBuffer::grow                                         4.1%
> > >   Hypertable::IOHandlerData::handle_event                                 1.0%
> > >   Hypertable::BlockCompressionCodecLzo::BlockCompressionCodecLzo          0.5%
>
> > >   We found that the main problem lay in the CellCache (the second
> > > entry, "allocate", is called by the CellMap, which is also part of
> > > the CellCache). After a long period of inserting data, memory usage
> > > stays at a very high level, even though we expected it to be freed
> > > after compaction. In our ten-server cluster, one range (in this case
> > > we set only one AccessGroup per table) used about 32MB, and that
> > > memory was never freed.
>
> > >   After some tests and experiments, we implemented a memory pool for
> > > the CellCache. After about one week of testing, it works well and
> > > efficiently: in the same cluster mentioned above, each range uses
> > > only about 1.2MB on average, very shortly after inserting completes.
> > > The basic idea is sketched below.
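> > >
> > >   Roughly, the pool carves the small <key,value> allocations out of
> > > big malloc'd blocks and frees all the blocks at once when the cache
> > > is dropped after compaction. A greatly simplified sketch (NOT the
> > > actual patch; the 1MB block size and other details are illustrative,
> > > and alignment/oversize requests are ignored here):
> > >
> > >     #include <cstddef>
> > >     #include <cstdlib>
> > >     #include <vector>
> > >
> > >     class CellCachePool {
> > >     public:
> > >       CellCachePool() : m_offset(kBlockSize) {}
> > >       ~CellCachePool() { free_all(); }
> > >
> > >       // Hand out sz bytes from the current block, grabbing a new
> > >       // block when the current one is exhausted.
> > >       void *get_memory(size_t sz) {
> > >         if (m_offset + sz > kBlockSize) {
> > >           m_blocks.push_back((char *)malloc(kBlockSize));
> > >           m_offset = 0;
> > >         }
> > >         void *p = m_blocks.back() + m_offset;
> > >         m_offset += sz;
> > >         return p;
> > >       }
> > >
> > >       // One-shot release: this is what lets memory go back to the
> > >       // system after a compaction drops the cache.
> > >       void free_all() {
> > >         for (size_t i = 0; i < m_blocks.size(); ++i)
> > >           free(m_blocks[i]);
> > >         m_blocks.clear();
> > >         m_offset = kBlockSize;  // force a fresh block on next use
> > >       }
> > >
> > >     private:
> > >       static const size_t kBlockSize = 1024 * 1024;  // 1MB blocks
> > >       std::vector<char *> m_blocks;
> > >       size_t m_offset;
> > >     };
> > >
> > >   Freeing whole blocks instead of individual cells is what makes the
> > > difference: the memory actually goes back instead of leaving a
> > > fragmented heap behind.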
>
> > >   We compared it with the standard version on a single server. In
> > > the standard version, whether we used tcmalloc or not (tcmalloc helps
> > > some; it reduces usage by about 30% compared to the standard
> > > allocator), the memory usage never falls. In contrast, the pool
> > > version's memory usage goes down quickly after the inserting is done.
> > >   In this comparison, we inserted about 11GB of data into Hypertable
> > > (about 33 ranges after parsing and inserting). The memory usage
> > > during this process can be seen here <the image and patch are
> > > uploaded in the "Files" section of this group>:
>
> > >http://hypertable-dev.googlegroups.com/web/RS%20Mem-Usage%20Comparati...
> > >   In the purple one we use our pool for both the <key,value> pairs
> > > and the CellMap; the yellow one uses it only for the <key,value>
> > > pairs. As seen in this image, the pool version's memory usage is
> > > excellent.
> > >   And the patch's link is
> > > http://groups.google.com/group/hypertable-dev/web/mem-pool.patch.tgz?...
>
> > >   We ran the google heap profiler on the pool version and got the
> > > following data:
>
> > >   Function (during execution)                                     Memory Usage
> > >   CellCachePool::get_memory                                              94.3%
> > >   Hypertable::DynamicBuffer::grow                                         3.8%
> > >   Hypertable::BlockCompressionCodecLzo::BlockCompressionCodecLzo          1.1%
> > >   Hypertable::IOHandlerData::handle_event                                 0.5%
>
> > >   BTW, in our tests, the RangeServer crashed when we set
> > > Hypertable.RangeServer.MaintenanceThreads=4. We tested 0.9.0.11 and
> > > 0.9.0.12; both have this problem, and this week we want to test it
> > > further.
>
> > >   We hope this helps.
>
> > >   Best wishes.