[ 
https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165869#comment-13165869
 ] 

stack commented on HBASE-4988:
------------------------------

Which HBase version, Chunhui?  In 0.92 we should be retrying: putToMetaTable 
should retry and give up only if .META. is really gone.  This change has no 
retry facility, so it would seem to be a regression?

{code}
+    metaServer.put(CatalogTracker.META_REGION_NAME, put);
{code}

The .META. server could die between the verification and the put above.

Maybe have putToMetaTable return true/false?
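
To make the suggestion concrete, here is a rough, standalone sketch of a 
retrying put that reports success or failure as a boolean.  Nothing in it is 
HBase API: MetaPutAttempt, putWithRetries, maxAttempts and pauseMillis are 
made-up names standing in for whatever putToMetaTable would actually wrap.

```java
import java.io.IOException;

public class RetryingMetaPut {

    /** Hypothetical stand-in for one put attempt against .META.; not an HBase API. */
    interface MetaPutAttempt {
        void run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times, pausing between failures.
     * Returns true on the first success and false once every attempt has
     * failed or the thread is interrupted while pausing.
     */
    static boolean putWithRetries(MetaPutAttempt attempt, int maxAttempts, long pauseMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt.run();
                return true;
            } catch (IOException e) {
                if (i == maxAttempts) {
                    break; // out of attempts; let the caller decide what to do
                }
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulate a .META. server that is unreachable for the first two attempts.
        boolean ok = putWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated .META. outage");
            }
        }, 5, 10L);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```

With a boolean result the caller can choose between rolling back and 
aborting.  Note that a false here only means the client never saw a success; 
it does not prove the edit is absent from .META.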

Also, I don't think this will work:

{code}
+        toPONR = MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
+            this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
{code}

In the past, I've seen a weird case where offlineParentInMeta looked like it 
failed, but all that had happened was that the client timed out trying to do 
the .META. put; the edit actually went in anyway.  Aborting the regionserver 
is pretty radical, but having the PONR before we do the meta edit would at 
least make it so that if the edit went in, we'd likely find out during server 
recovery.

We used to have the PONR after the offlineParentInMeta call, but the patch in 
HBASE-4562 changed that.
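
A toy model of the ordering question, as I understand it.  This is not the 
SplitTransaction code: ToyMeta, Step, Outcome and split() are invented for 
illustration, and the "timeout" is simulated by applying the edit and then 
throwing.  It only shows what the rollback decision sees in each ordering.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PonrOrdering {

    enum Step { CREATED_DAUGHTERS, PONR }

    enum Outcome { COMPLETED, ROLLED_BACK, ABORT_SERVER }

    /** Toy .META.: the edit always lands, but the client may still see a timeout. */
    static class ToyMeta {
        boolean parentOffline = false;

        void offlineParent(boolean clientTimesOut) throws IOException {
            parentOffline = true; // the edit is applied on the server side
            if (clientTimesOut) {
                throw new IOException("simulated client timeout");
            }
        }
    }

    /** Runs a toy split; on failure, rollback aborts only if PONR was journaled. */
    static Outcome split(ToyMeta meta, boolean ponrBeforeEdit, boolean clientTimesOut) {
        List<Step> journal = new ArrayList<>();
        journal.add(Step.CREATED_DAUGHTERS);
        try {
            if (ponrBeforeEdit) {
                journal.add(Step.PONR);
            }
            meta.offlineParent(clientTimesOut);
            if (!ponrBeforeEdit) {
                journal.add(Step.PONR);
            }
            return Outcome.COMPLETED;
        } catch (IOException e) {
            // Past the point of no return we cannot undo safely, so abort.
            return journal.contains(Step.PONR) ? Outcome.ABORT_SERVER : Outcome.ROLLED_BACK;
        }
    }

    public static void main(String[] args) {
        ToyMeta late = new ToyMeta();
        System.out.println("PONR after edit:  " + split(late, false, true)
            + ", parent offline in meta = " + late.parentOffline);

        ToyMeta early = new ToyMeta();
        System.out.println("PONR before edit: " + split(early, true, true)
            + ", parent offline in meta = " + early.parentOffline);
    }
}
```

In the PONR-after-edit run the split rolls back while the toy meta already 
shows the parent offline, which is the silent inconsistency I'm worried about.  
In the PONR-before-edit run the same timeout leads to an abort instead, so 
recovery gets a chance to notice the edit.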

You have a point, though, that currently if .META. goes away for long enough, 
we could abort all servers (this is 0.92?).  This is a pretty serious issue.
                
> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
>                 Key: HBASE-4988
>                 URL: https://issues.apache.org/jira/browse/HBASE-4988
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>         Attachments: hbase-4988v1.patch
>
>
> If the metaserver crashes now,
> all the splitting regionservers will abort themselves,
> because of this code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>             this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
> {code}
> If the JournalEntry is PONR, the split's rollback will abort the regionserver.
> This is terrible under heavy put load when the metaserver crashes.


        
