I'm working on this today.  - Doug

On Tue, Dec 2, 2008 at 7:10 PM, donald <[EMAIL PROTECTED]> wrote:

>
> Hi Doug,
>
> I think it's better to open a new thread on this topic :)
>
> The multiple maintenance thread crash is easy to reproduce: just set
> Hypertable.RangeServer.MaintenanceThreads=2, start all servers locally
> on a single node and run random_write_test 10000000000. The range
> server crashes within a minute, but the cause is hard to track
> down.
>
> What we know so far:
> 1. The bug was introduced in version 0.9.0.11. Earlier versions don't
> have this problem.
> 2. According to RangeServer.log, the crash usually happens when two
> adjacent ranges are being split concurrently by two maintenance
> threads. If we forbid this by modifying the MaintenanceTaskQueue
> code, the crash goes away, but we don't know why. (Pheonix
> discovered this)
> 3. Sometimes the Range Server fails at HT_EXPECT
> (m_immutable_cache_ptr, Error::FAILED_EXPECTATION); in
> AccessGroup::run_compaction(). m_immutable_cache_ptr is set to 0 in
> multiple places with m_mutex locked, but it is not always checked
> while the lock is held, which looks suspicious.
>
> Do you have any idea based on these facts?
>
> Donald
>
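
To make point 3 of the quoted message concrete, here is a minimal sketch of checking and claiming the pointer under the same mutex that guards its writers. It is not Hypertable's code: AccessGroupSketch, CellCacheSketch, stage_compaction() and drop_immutable_cache() are hypothetical names, and std::mutex / std::shared_ptr stand in for whatever the real classes use. Whether this is the actual cause of the crash is not established in this thread.

```cpp
// Sketch only: AccessGroupSketch, CellCacheSketch and their members are
// hypothetical stand-ins, not Hypertable's actual classes.
#include <cstddef>
#include <memory>
#include <mutex>

struct CellCacheSketch {
  std::size_t entries = 0;
};

class AccessGroupSketch {
public:
  // Freeze a cache so that a later compaction can pick it up.
  void stage_compaction(std::size_t entries) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_immutable_cache = std::make_shared<CellCacheSketch>();
    m_immutable_cache->entries = entries;
  }

  // Another code path that clears the pointer, also under m_mutex.
  void drop_immutable_cache() {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_immutable_cache.reset();
  }

  // Returns false when there is nothing to compact, rather than asserting
  // on a pointer value that was read without holding the lock.
  bool run_compaction(std::size_t &compacted_entries) {
    std::shared_ptr<CellCacheSketch> cache;
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      cache = m_immutable_cache;  // read the pointer while holding the lock
      m_immutable_cache.reset();  // claim it so no other thread compacts it
    }
    if (!cache)
      return false;
    // The lock is released here; the work runs on the private copy.
    compacted_entries = cache->entries;
    return true;
  }

private:
  std::mutex m_mutex;
  std::shared_ptr<CellCacheSketch> m_immutable_cache;
};
```

The point of the local copy is that the check and the later use refer to the same object, so another maintenance thread that clears the member in between cannot invalidate it.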
