I'm working on this today.

- Doug

On Tue, Dec 2, 2008 at 7:10 PM, donald <[EMAIL PROTECTED]> wrote:
>
> Hi Doug,
>
> I think it's better to open a new thread on this topic :)
>
> The multiple maintenance thread crash is easy to reproduce: just set
> Hypertable.RangeServer.MaintenanceThreads=2, start all servers locally
> on a single node and run random_write_test 10000000000. The range
> server will crash in a minute. But the cause is rather hard to
> track down.
>
> What we know so far:
> 1. The bug was introduced in version 0.9.0.11. Earlier versions don't
> have this problem.
> 2. According to RangeServer.log, the crash usually happens when two
> adjacent ranges are both splitting in two maintenance threads
> concurrently. If we forbid this behavior by modifying the
> MaintenanceTaskQueue code, the crash goes away, but the reason
> is unknown. (Pheonix discovered this)
> 3. Sometimes the Range Server fails at
> HT_EXPECT(m_immutable_cache_ptr, Error::FAILED_EXPECTATION); in
> AccessGroup::run_compaction(). m_immutable_cache_ptr is set to 0 in
> multiple places with m_mutex locked, but it is not always checked
> while the lock is held, which looks suspicious.
>
> Do you have any idea based on these facts?
>
> Donald
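
For reference, point 3 of the quoted message describes a check-then-use pattern: a shared pointer is reset under a mutex but tested without it. Below is a minimal sketch of one way to close that window, by testing and taking the pointer inside the same critical section. This is not Hypertable's code: AccessGroupSketch, Cache, freeze() and compacted_entries() are hypothetical names made up for illustration, and whether this pattern is the actual cause of the crash is unconfirmed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a frozen in-memory cache.
struct Cache {
  int entries = 0;
};

// Hypothetical stand-in for an access group; only the locking pattern matters.
class AccessGroupSketch {
public:
  // Install an immutable cache for a later compaction to consume.
  void freeze(int entries) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_immutable_cache_ptr = std::make_shared<Cache>();
    m_immutable_cache_ptr->entries = entries;
  }

  // Returns false when there is nothing to compact, instead of asserting
  // on a pointer that another thread may already have reset.
  bool run_compaction() {
    std::shared_ptr<Cache> cache;
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      if (!m_immutable_cache_ptr)
        return false;
      // Check and take ownership in one critical section; the member
      // becomes null, so a second caller sees "nothing to do".
      cache.swap(m_immutable_cache_ptr);
    }
    // The slow work runs outside the lock on the private copy.
    int done = cache->entries;
    std::lock_guard<std::mutex> lock(m_mutex);
    m_compacted_entries += done;
    return true;
  }

  int compacted_entries() {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_compacted_entries;
  }

private:
  std::mutex m_mutex;
  std::shared_ptr<Cache> m_immutable_cache_ptr;
  int m_compacted_entries = 0;
};

// Usage: a second compaction on the same frozen cache is a no-op.
inline void usage_example() {
  AccessGroupSketch ag;
  ag.freeze(100);
  assert(ag.run_compaction());
  assert(!ag.run_compaction());
}
```

With this shape, two maintenance threads that both reach run_compaction() for the same frozen cache cannot both pass the null check; the loser returns false rather than tripping an expectation.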
