[ https://issues.apache.org/jira/browse/HBASE-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701808#comment-14701808 ]
Nick Dimiduk commented on HBASE-14237:
--------------------------------------
Hi [~liushaohui]. I have to ask the question -- any chance you can upgrade to a
0.98+ release? If I remember the history correctly, our Chaos Monkey test
harness was introduced after 0.94. Many bugs in AM have been found and fixed
via this testing on newer branches. We've backported what's been possible, but
at this point the AM in 0.94 is very different from more recent versions.
> Meta region may be onlined on multi regionservers for bugs of assigning meta
> ----------------------------------------------------------------------------
>
> Key: HBASE-14237
> URL: https://issues.apache.org/jira/browse/HBASE-14237
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.11
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Critical
> Attachments: meta.log
>
>
> When a regionserver fails to open the meta region and crashes after setting
> the RS_ZK_REGION_FAILED_OPEN state of the meta region in ZooKeeper, the master
> handles the RS_ZK_REGION_FAILED_OPEN event and tries to assign the meta
> region again in AssignmentManager#handleRegion. At the same time, the
> master handles the regionserver expired event and starts a
> MetaServerShutdownHandler for that regionserver, because the servername of
> the regionserver is the same as the servername of the unassigned node of the
> meta region. In the MetaServerShutdownHandler, the meta region may then be
> assigned a second time.
> [~heliangliang]
> We have encountered this problem in our production cluster, where it resulted
> in an inconsistency of the region location in the meta table. You can see the
> log in the attachment.
> The AssignmentManager code is very complex, and I have not found a solution
> to this problem. Could someone kindly offer some suggestions? Thanks.