[
https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167018#comment-13167018
]
chunhui shen commented on HBASE-4988:
-------------------------------------
@stack
You could try the following scenario:
1. Kill the meta server, and don't restart it.
2. Do a manual split on all the other regionservers.
After step 1, .meta. is redeployed after 3 minutes by default,
so in step 2 the split has to keep retrying for at least 3 minutes; otherwise the server will abort.
However, in 0.90 there is no retry, so the server must abort itself.
The PONR (point of no return) is used to prevent data loss when updating meta
fails. But if we are sure that .meta. has not been updated, the PONR is in fact
not needed. That is what the patch does.
We have tried the patch on 0.90: if the meta server crashes before
OfflineParentInMeta completes, the server does not abort when rolling back.
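To make the argument concrete, here is a minimal sketch of the journal and rollback idea. It is a simplified model, not the code in the attached patch: the class SplitJournalSketch, the MetaUpdate interface and both split methods are hypothetical names, and it assumes a failed meta update throws only when nothing was written to .meta.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a split journal; not HBase's SplitTransaction.
class SplitJournalSketch {
    enum JournalEntry { CREATE_SPLIT_DIR, CLOSED_PARENT_REGION, PONR }

    /** Stand-in for the meta edit. Assumed to throw only if nothing was written. */
    interface MetaUpdate {
        void run() throws Exception;
    }

    private final List<JournalEntry> journal = new ArrayList<>();

    /** Ordering quoted in the issue: PONR is journaled before the meta edit. */
    void splitWithPonrBeforeMeta(MetaUpdate update) throws Exception {
        journal.add(JournalEntry.CREATE_SPLIT_DIR);
        journal.add(JournalEntry.CLOSED_PARENT_REGION);
        journal.add(JournalEntry.PONR);
        update.run();
    }

    /** Alternative ordering: PONR is journaled only once the meta edit returned. */
    void splitWithPonrAfterMeta(MetaUpdate update) throws Exception {
        journal.add(JournalEntry.CREATE_SPLIT_DIR);
        journal.add(JournalEntry.CLOSED_PARENT_REGION);
        update.run();
        journal.add(JournalEntry.PONR);
    }

    /**
     * Walks the journal backwards. Returns true if every step could be undone,
     * false if a PONR entry was found, in which case the caller has to abort.
     */
    boolean rollback() {
        for (int i = journal.size() - 1; i >= 0; i--) {
            if (journal.get(i) == JournalEntry.PONR) {
                return false;
            }
            // Undoing the individual step is omitted in this sketch.
        }
        journal.clear();
        return true;
    }

    public static void main(String[] args) {
        // Simulates the meta server being down when the split reaches the meta edit.
        MetaUpdate metaDown = () -> {
            throw new IOException("meta server unreachable");
        };

        SplitJournalSketch early = new SplitJournalSketch();
        try {
            early.splitWithPonrBeforeMeta(metaDown);
        } catch (Exception e) {
            System.out.println("PONR before meta edit, rollback ok = " + early.rollback());
        }

        SplitJournalSketch late = new SplitJournalSketch();
        try {
            late.splitWithPonrAfterMeta(metaDown);
        } catch (Exception e) {
            System.out.println("PONR after meta edit, rollback ok = " + late.rollback());
        }
    }
}
```

With the first ordering the rollback finds PONR in the journal and reports that the server must abort, even though .meta. was never touched. With the second ordering the same failure rolls back cleanly. The sketch only holds under the stated assumption; if a meta edit can fail after partially applying, the early PONR is still the safe choice.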
> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
> Key: HBASE-4988
> URL: https://issues.apache.org/jira/browse/HBASE-4988
> Project: HBase
> Issue Type: Bug
> Reporter: chunhui shen
> Attachments: hbase-4988v1.patch
>
>
> If the meta server crashes now,
> all the splitting regionservers will abort themselves,
> because of this code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
> this.parent.getRegionInfo(), a.getRegionInfo(),
> b.getRegionInfo());
> {code}
> If the journal contains a PONR entry, the split's rollback makes the server abort itself.
> This is terrible in a heavy-put environment when the meta server crashes.