[ 
https://issues.apache.org/jira/browse/HBASE-13627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173501#comment-15173501
 ] 

Sreeram Venkatasubramanian commented on HBASE-13627:
----------------------------------------------------

Hi [~ndimiduk], I was able to reproduce the issue by testing on a slower 
machine. The problem occurs because, in the loop below (HRegionServer.java), the 
region server calls closeUserRegions() repeatedly until all user regions are 
offline. closeUserRegions() closes the regions asynchronously, so a given 
region may still be closing when the loop comes around again. For such a 
region, closeRegion() is called again on the next iteration of the while loop, 
which triggers the INFO message that we see.

{code}
      while (!isStopped() && isHealthy()) {
        if (!isClusterUp()) {
          if (isOnlineRegionsEmpty()) {
            stop("Exiting; cluster shutdown set and not carrying any regions");
          } else if (!this.stopping) {
            this.stopping = true;
            LOG.info("Closing user regions");
            closeUserRegions(this.abortRequested);
          } else if (this.stopping) {
            boolean allUserRegionsOffline = areAllUserRegionsOffline();
            if (allUserRegionsOffline) {
              // Set stopped if no more write requests to meta tables
              // since last time we went around the loop. Any open
              // meta regions will be closed on our way out.
              if (oldRequestCount == getWriteRequestCount()) {
                stop("Stopped; only catalog regions remaining online");
                break;
              }
              oldRequestCount = getWriteRequestCount();
            } else {
              // Make sure all regions have been closed -- some regions may
              // have not got it because we were splitting at the time of
              // the call to closeUserRegions.

              closeUserRegions(this.abortRequested);
            }
            LOG.debug("Waiting on " + getOnlineRegionsAsPrintableString());
          }
        }
{code}

Please let me know whether we should leave the code as it is, or whether we 
should track the regions for which closeRegion() has already been called so 
that the next iteration does not issue a duplicate closeRegion() call. I can 
work on that if we want to go that way.
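
To make the tracking option concrete, here is a minimal sketch of what I have 
in mind. It is not a patch: CloseRequestTracker and its method names are 
hypothetical and do not exist in HBase, and regions are represented by their 
encoded names as plain strings. The idea is only that the shutdown loop would 
ask the tracker which online regions have not yet been sent a close request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of HBase: remembers which regions already
// have a close request outstanding so the shutdown loop can skip them.
class CloseRequestTracker {
  // Encoded names of regions for which a close has already been requested.
  private final Set<String> closeRequested = ConcurrentHashMap.newKeySet();

  /** Returns true only the first time a given encoded region name is seen. */
  boolean markCloseRequested(String encodedRegionName) {
    return closeRequested.add(encodedRegionName);
  }

  /** Forgets a region once it is actually offline, so entries do not pile up. */
  void markClosed(String encodedRegionName) {
    closeRequested.remove(encodedRegionName);
  }

  /** Returns the online regions that have not yet been asked to close. */
  List<String> regionsNeedingClose(Iterable<String> onlineRegions) {
    List<String> toClose = new ArrayList<>();
    for (String region : onlineRegions) {
      if (markCloseRequested(region)) {
        toClose.add(region);
      }
    }
    return toClose;
  }
}
```

With something like this, the second and later passes of the loop would only 
call closeRegion() for regions that missed the first pass (for example because 
they were splitting at the time), and regions that are merely slow to close 
would be left alone.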


> Terminating RS results in redundant CLOSE RPC
> ---------------------------------------------
>
>                 Key: HBASE-13627
>                 URL: https://issues.apache.org/jira/browse/HBASE-13627
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.1.0
>            Reporter: Nick Dimiduk
>            Assignee: Sreeram Venkatasubramanian
>            Priority: Minor
>              Labels: beginner, beginners
>             Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4
>
>
> Noticed while testing the 1.1.0RC0 bits. It seems we're issuing a redundant 
> close RPC during shutdown. This results in a logging warning for each region.
> {noformat}
> 2015-05-06 00:07:19,214 INFO  [RS:0;ndimiduk-apache-1-1-dist-6:56371] 
> regionserver.HRegionServer: Received CLOSE for the region: 
> 19cbe4fe2fe5335e7aace05e10e36ede, which we are already trying to CLOSE, but 
> not completed yet
> 2015-05-06 00:07:19,214 WARN  [RS:0;ndimiduk-apache-1-1-dist-6:56371] 
> regionserver.HRegionServer: Failed to close 
> cluster_test,66666666,1430869443384.19cbe4fe2fe5335e7aace05e10e36ede. - 
> ignoring and continuing
> org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: The 
> region 19cbe4fe2fe5335e7aace05e10e36ede was already closing. New CLOSE 
> request is ignored.
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2769)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegionIgnoreErrors(HRegionServer.java:2695)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeUserRegions(HRegionServer.java:2327)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:937)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
> 1. launch a standalone cluster from tgz (./bin/start-hbase.sh)
> 2. load some data (e.g., run bin/hbase ltt)
> 3. terminate cluster (./bin/stop-hbase.sh)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
