[hypertable-dev] Crash when Hypertable.RangeServer.MaintenanceThreads > 1

donald Tue, 02 Dec 2008 19:10:32 -0800

Hi Doug,

I think it's better to open a new thread on this topic :)


The multiple maintenance thread crash is easy to reproduce: just set
Hypertable.RangeServer.MaintenanceThreads=2, start all servers locally
on a single node, and run random_write_test 10000000000. The range
server will crash within a minute. The cause, however, is hard to
track down.

What we know so far:
1. The bug was introduced in version 0.9.0.11. Earlier versions don't
have this problem.
2. According to RangeServer.log, the crash usually happens when two
adjacent ranges are being split concurrently in two maintenance
threads. If we forbid this by modifying the MaintenanceTaskQueue
code, the crash goes away, but the underlying cause is still
unknown. (Pheonix discovered this.)
3. Sometimes the Range Server fails at HT_EXPECT
(m_immutable_cache_ptr, Error::FAILED_EXPECTATION); in
AccessGroup::run_compaction(). m_immutable_cache_ptr is set to 0 in
multiple places with m_mutex locked, but it is not always checked
while the lock is held, which looks suspicious.
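
To make point 3 concrete, here is a minimal sketch of checking and
taking the pointer inside the same critical section that other threads
use to clear it. It is not the actual Hypertable code: the names
AccessGroupSketch, CellCacheSketch, stage() and
count_successful_compactions() are made up for illustration, and
std::shared_ptr and std::mutex stand in for whatever pointer and mutex
types the real AccessGroup uses. It also returns false where the real
code asserts with HT_EXPECT, which is a design choice of the sketch
and not a proposed fix.

```cpp
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the immutable cell cache (not Hypertable's class).
struct CellCacheSketch {
  int entries = 0;
};

// Hypothetical stand-in for AccessGroup, reduced to the one shared pointer.
class AccessGroupSketch {
public:
  // Freeze a cache so that a later compaction has something to work on.
  void stage(int entries) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_immutable_cache = std::make_shared<CellCacheSketch>();
    m_immutable_cache->entries = entries;
  }

  // Take the cache and clear the shared pointer in ONE critical section,
  // then test the private copy. No other thread can null the pointer
  // between the check and the use, because the check is on a local.
  bool run_compaction(int &entries_out) {
    std::shared_ptr<CellCacheSketch> cache;
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      cache.swap(m_immutable_cache);  // shared pointer is now empty
    }
    if (!cache)
      return false;  // another thread already took it
    entries_out = cache->entries;
    return true;
  }

private:
  std::mutex m_mutex;
  std::shared_ptr<CellCacheSketch> m_immutable_cache;
};

// Run `threads` concurrent compactions against one staged cache and
// return how many of them actually obtained the cache.
int count_successful_compactions(int threads) {
  AccessGroupSketch ag;
  ag.stage(42);
  std::vector<int> got(threads, 0);  // one slot per thread, no sharing
  std::vector<std::thread> pool;
  for (int i = 0; i < threads; ++i) {
    pool.emplace_back([&ag, &got, i] {
      int entries = 0;
      if (ag.run_compaction(entries) && entries == 42)
        got[i] = 1;
    });
  }
  for (auto &t : pool)
    t.join();
  int winners = 0;
  for (int g : got)
    winners += g;
  return winners;
}
```

With this pattern exactly one thread gets the cache no matter how many
maintenance threads race for it; the others see an empty pointer and
back off instead of dereferencing a pointer that was cleared under
them.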

Do you have any idea based on these facts?

Donald
