[
https://issues.apache.org/jira/browse/HBASE-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702401#comment-14702401
]
Liu Shaohui commented on HBASE-14237:
-------------------------------------
[~ndimiduk]
{quote}
Any chance you can upgrade to a 0.98+ release?
{quote}
Yes, we are trying to upgrade our online HBase clusters to 0.98+. But because
the clusters are serving online applications, we must be very careful, and the
upgrade will take a long time.
Thanks for the useful information about the AM.
By hacking the code a bit, I will verify whether this issue exists in 0.98+
versions.
> Meta region may be onlined on multi regionservers for bugs of assigning meta
> ----------------------------------------------------------------------------
>
> Key: HBASE-14237
> URL: https://issues.apache.org/jira/browse/HBASE-14237
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.11
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Critical
> Attachments: meta.log
>
>
> When a regionserver fails to open the meta region and crashes after setting
> the RS_ZK_REGION_FAILED_OPEN state of the meta region in ZooKeeper, the master
> handles the RS_ZK_REGION_FAILED_OPEN event and tries to assign the meta
> region again in AssignmentManager#handleRegion. But at the same time, the
> master handles the regionserver expired event and starts a
> MetaServerShutdownHandler for the regionserver, because the servername of
> the regionserver is the same as the servername in the unassigned node of the
> meta region. In the MetaServerShutdownHandler, the meta region may be
> assigned a second time.
> [~heliangliang]
> We have encountered this problem in our production cluster, where it resulted
> in inconsistent region locations in the meta table. You can see the log in
> the attachment.
> The AssignmentManager code is very complex and I have not yet found a way to
> fix this problem. Could someone kindly offer some suggestions? Thanks.
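
For illustration only, the double assignment described above can be reduced to
a small Java sketch. Nothing here is HBase code: MetaAssigner, onFailedOpen,
onServerExpired and the AtomicBoolean guard are hypothetical names, the two
events are delivered sequentially rather than on real handler threads, and the
guard is just one possible idea, not a verified fix for AssignmentManager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class MetaAssignRaceSketch {

    /** Hypothetical stand-in for the master's meta assignment logic. */
    static class MetaAssigner {
        private final boolean guarded;
        private final AtomicBoolean assignInProgress = new AtomicBoolean(false);
        final List<String> assignments = new ArrayList<>();

        MetaAssigner(boolean guarded) {
            this.guarded = guarded;
        }

        /** Models the master reacting to RS_ZK_REGION_FAILED_OPEN. */
        void onFailedOpen() {
            assign("FAILED_OPEN handler");
        }

        /** Models the shutdown handler started for the expired regionserver. */
        void onServerExpired() {
            assign("server shutdown handler");
        }

        private boolean assign(String source) {
            // Assumption: with the guard on, only the first caller may assign meta.
            if (guarded && !assignInProgress.compareAndSet(false, true)) {
                return false;
            }
            assignments.add(source);
            return true;
        }
    }

    /** Delivers both events for the same crash and counts the assignments. */
    static int simulate(boolean guarded) {
        MetaAssigner assigner = new MetaAssigner(guarded);
        assigner.onFailedOpen();
        assigner.onServerExpired();
        return assigner.assignments.size();
    }

    public static void main(String[] args) {
        System.out.println("unguarded assignments: " + simulate(false));
        System.out.println("guarded assignments: " + simulate(true));
    }
}
```

Without the guard, both events lead to an assignment of meta, which matches
the symptom in the report. With the guard, the second request is dropped. A
real fix would also need to reset the guard once the assignment completes or
fails, which this sketch leaves out.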