[ 
https://issues.apache.org/jira/browse/HBASE-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973471#action_12973471
 ] 

stack commented on HBASE-3343:
------------------------------

@jdcryans +1 on your patch, though it's addressing something other than what's 
in this log; the log is about an aborted server not going down.  The code looks 
like this in the abort processing:

{code}
....
 629     if (this.killed) {
 630       // Just skip out w/o closing regions.
 631     } else if (abortRequested) {
 632       if (this.fsOk) {
 633         closeAllRegions(abortRequested); // Don't leave any open file handles
 634         closeWAL(false);
 635       }
 636       LOG.info("aborting server at: " + this.serverInfo.getServerName());
 637     } else {
 638       closeAllRegions(abortRequested);
 639       closeWAL(true);
 640       closeAllScanners();
 641       LOG.info("stopping server at: " + this.serverInfo.getServerName());
 642     }
 643     // Interrupt catalog tracker here in case any regions being opened out in
 644     // handlers are stuck waiting on meta or root.
 645     if (this.catalogTracker != null) this.catalogTracker.stop();
 646     waitOnAllRegionsToClose();
....
{code}

... so if an abort is requested AND the fs is NOT OK, we don't close regions; 
we just skip out.  The problem is that we then fall into 
waitOnAllRegionsToClose on line #646 above, which never completes because we 
never closed the regions.

I think the simplest fix is changing line 646 to this:

{code}
 646     if (this.fsOk) waitOnAllRegionsToClose();
{code}
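
To make the reasoning concrete, here is a minimal standalone sketch of the abort and normal-stop branches, with and without the guard on the wait. It is a toy model, not the real HRegionServer: the class AbortPathSketch, the Outcome type, and the guardWait parameter are hypothetical names made up for illustration, and the killed branch is left out.

```java
/** Toy model of the shutdown branch quoted above; NOT the real HRegionServer code. */
public class AbortPathSketch {

    /** Result of one simulated pass through the shutdown branch (hypothetical type). */
    static final class Outcome {
        final boolean regionsClosed;    // was closeAllRegions(...) reached?
        final boolean waitedOnRegions;  // was waitOnAllRegionsToClose() reached?

        Outcome(boolean regionsClosed, boolean waitedOnRegions) {
            this.regionsClosed = regionsClosed;
            this.waitedOnRegions = waitedOnRegions;
        }

        /** Waiting for regions that were never asked to close never finishes. */
        boolean hangs() {
            return waitedOnRegions && !regionsClosed;
        }
    }

    /**
     * Simulates the abort / normal-stop branches.
     * guardWait is an assumption of this sketch: false models the wait as
     * quoted (unconditional), true models the proposed "if (fsOk)" guard.
     */
    static Outcome simulate(boolean abortRequested, boolean fsOk, boolean guardWait) {
        boolean regionsClosed;
        if (abortRequested) {
            // On abort, regions are only closed when the filesystem is OK.
            regionsClosed = fsOk;
        } else {
            // A normal stop always closes regions.
            regionsClosed = true;
        }
        // Unguarded: always wait. Guarded: wait only when the filesystem is OK.
        boolean waited = !guardWait || fsOk;
        return new Outcome(regionsClosed, waited);
    }

    public static void main(String[] args) {
        // Abort requested with a bad filesystem, wait left unguarded.
        System.out.println("unguarded hangs: " + simulate(true, false, false).hangs());
        // Same scenario with the guard on the wait.
        System.out.println("guarded hangs:   " + simulate(true, false, true).hangs());
    }
}
```

In this model the abort-with-bad-fs case waits on regions that were never closed when the wait is unguarded, and skips the wait entirely once it is guarded on fsOk.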


Let me commit your patch and mine together.

> Server not shutting down after losing log lease
> -----------------------------------------------
>
>                 Key: HBASE-3343
>                 URL: https://issues.apache.org/jira/browse/HBASE-3343
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3343.patch, shutdown-logs.txt.bz2, stuck-server.txt
>
>
> Ran into this bug testing 0.90rc2. I kill -STOPed a server, and then -CONT it 
> after its logs had been split. It correctly decided it should abort, but got 
> stuck during the shutdown process.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
