[
https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165898#comment-13165898
]
chunhui shen commented on HBASE-4988:
-------------------------------------
@stack
This happened in our test environment, which runs the 0.90 version.
If the .META. server is killed and not restarted immediately,
the step MetaEditor.offlineParentInMeta fails and throws an exception,
and the JournalEntry.PONR entry causes the server to abort when rolling back.
In trunk, MetaEditor.offlineParentInMeta will retry, but the parent
region is then out of service for a long time, which I think is unacceptable.
The retry can also fail, which finally causes the server to abort anyway.
{code}metaServer.put(CatalogTracker.META_REGION_NAME, put);{code}
If the .META. server dies between the verification and the put above, the
regionserver will abort, because we can't be sure whether .META. was updated
successfully. However, if we can detect beforehand that the .META. server is
down, we needn't abort the server that is doing the split.
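
To make the idea concrete, here is a minimal, self-contained sketch of the ordering I have in mind. It is not the HBase code and not the attached patch: SplitSketch, MetaServer, FakeMetaServer, Outcome and the CLOSED_PARENT entry are hypothetical names made up for illustration, and only the PONR entry and the put to .META. mirror what is discussed above. The check for the .META. server happens before PONR is journaled, so a failure there can be rolled back normally; a failure of the put after PONR still aborts.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Hypothetical journal steps; only PONR corresponds to a name used in this issue.
    enum JournalEntry { CLOSED_PARENT, PONR }

    enum Outcome { SPLIT_DONE, ROLLED_BACK, SERVER_ABORT }

    /** Hypothetical stand-in for the server hosting .META. (assumption, not an HBase API). */
    interface MetaServer {
        boolean isAlive();
        void put(String row) throws IOException;
    }

    /** Test double: 'alive' drives the pre-check, 'putFails' simulates a crash after the check. */
    static class FakeMetaServer implements MetaServer {
        private final boolean alive;
        private final boolean putFails;

        FakeMetaServer(boolean alive, boolean putFails) {
            this.alive = alive;
            this.putFails = putFails;
        }

        @Override
        public boolean isAlive() {
            return alive;
        }

        @Override
        public void put(String row) throws IOException {
            if (putFails) {
                throw new IOException(".META. server died during put of " + row);
            }
        }
    }

    static Outcome split(MetaServer meta, String parentRegion) {
        List<JournalEntry> journal = new ArrayList<>();
        journal.add(JournalEntry.CLOSED_PARENT);
        try {
            // Pre-check: fail while the journal has no PONR entry yet,
            // so the rollback below can undo the split safely.
            if (!meta.isAlive()) {
                throw new IOException(".META. server is not available");
            }
            journal.add(JournalEntry.PONR);
            // From here on a failure leaves .META. in an unknown state.
            meta.put(parentRegion);
            return Outcome.SPLIT_DONE;
        } catch (IOException e) {
            return rollback(journal);
        }
    }

    static Outcome rollback(List<JournalEntry> journal) {
        // Past the point of no return we cannot tell whether the .META. update landed,
        // so the only safe reaction is to abort; before it, a plain rollback is enough.
        return journal.contains(JournalEntry.PONR) ? Outcome.SERVER_ABORT : Outcome.ROLLED_BACK;
    }

    public static void main(String[] args) {
        System.out.println(split(new FakeMetaServer(true, false), "parent"));  // healthy .META.
        System.out.println(split(new FakeMetaServer(false, false), "parent")); // down before the check
        System.out.println(split(new FakeMetaServer(true, true), "parent"));   // dies after the check
    }
}
```

The third case is the remaining window between the verification and the put: the check only narrows the window in which a .META. crash forces an abort, it does not close it.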
> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
> Key: HBASE-4988
> URL: https://issues.apache.org/jira/browse/HBASE-4988
> Project: HBase
> Issue Type: Bug
> Reporter: chunhui shen
> Attachments: hbase-4988v1.patch
>
>
> If the meta server crashes now,
> all the splitting regionservers will abort themselves,
> because of this code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
> this.parent.getRegionInfo(), a.getRegionInfo(),
> b.getRegionInfo());
> {code}
> If the journal contains JournalEntry.PONR, the split's rollback aborts the regionserver.
> This is terrible in a heavy-put environment when the meta server crashes.