Viraj Jasani created HBASE-26596:
------------------------------------

             Summary: region_mover should gracefully ignore null response from 
RSGroupAdmin#getRSGroupOfServer
                 Key: HBASE-26596
                 URL: https://issues.apache.org/jira/browse/HBASE-26596
             Project: HBase
          Issue Type: Bug
          Components: mover, rsgroup
    Affects Versions: 1.7.1
            Reporter: Viraj Jasani


If a regionserver still has a non-daemon thread running after its own shutdown, 
that thread can prevent a clean JVM exit and leave the regionserver stuck in a 
zombie state. HBASE-26468 recently added a workaround for this: the 
regionserver exit hook waits 30s for all non-daemon threads to stop before 
terminating the JVM abnormally.

However, if a regionserver is stuck in such a state, region_mover unload fails with:
{code:java}
NoMethodError: undefined method `getName` for nil:NilClass
  getSameRSGroupServers at /bin/region_mover.rb:503
             __ensure__ at /bin/region_mover.rb:313
          unloadRegions at /bin/region_mover.rb:310
                 (root) at /bin/region_mover.rb:572
{code}
This happens when the cluster has RSGroup enabled and the given server is 
already stopped: RSGroupAdmin#getRSGroupOfServer returns null because the 
server is no longer running and so is not part of any RSGroup. region_mover 
should tolerate this null response and exit gracefully from the 
unloadRegions() call.
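
A minimal sketch of the kind of nil guard this calls for, in plain Ruby. It is 
not the actual patch: GroupInfo, FakeRSGroupAdmin, same_rsgroup_servers and 
unload_regions are hypothetical stand-ins for the Java objects and script 
functions that region_mover.rb uses through JRuby, and returning an empty list 
is only one possible way to let the caller exit cleanly.

```ruby
# Hypothetical stand-in for the group info object returned by the admin API.
GroupInfo = Struct.new(:name, :servers)

# Hypothetical stand-in for RSGroupAdmin; the real one is a Java class.
class FakeRSGroupAdmin
  def initialize(groups)
    @groups = groups
  end

  # Returns nil when the server belongs to no group, which is the case
  # described above for a regionserver that is already stopped.
  def group_of_server(server)
    @groups.find { |g| g.servers.include?(server) }
  end
end

# Returns the other servers in the same group as `server`.
# A nil group yields an empty list instead of raising NoMethodError.
def same_rsgroup_servers(admin, server)
  group = admin.group_of_server(server)
  if group.nil?
    warn "No RSGroup found for #{server}; it may already be stopped"
    return []
  end
  group.servers.reject { |s| s == server }
end

# Assumed caller behavior: with no target servers there is nothing to do,
# so return early rather than fail.
def unload_regions(admin, server)
  targets = same_rsgroup_servers(admin, server)
  return :skipped if targets.empty?
  :unloaded
end
```

With an admin that knows only a group containing "rs1:16020" and "rs2:16020", 
unload_regions(admin, "rs3:16020") returns :skipped rather than raising.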



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
