[ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Javier Akira Luca de Tena updated HBASE-25594:
----------------------------------------------
    Description: 
We usually use graceful_stop.sh from the Master to restart RegionServers. 
However, in some scenarios we may not have privileges to restart remote 
RegionServers (it uses ssh).
 But we can still use graceful_stop.sh on the same host we want to restart.

To detect that it is running on the local host, graceful_stop.sh uses 
/bin/hostname.
 
[https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/bin/graceful_stop.sh#L106-L110]

To keep the host being unloaded out of the list of target hosts, RegionMover 
filters it out by checking it against every RegionServer host in the cluster:
 
[https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L382-L384]
 
[https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L692]

But the RegionServer hosts returned by Admin#getRegionServers are FQDNs, 
while the hostname provided by graceful_stop.sh is not an FQDN, so the 
comparison fails.
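
A minimal sketch of the mismatch, assuming the filter compares host strings exactly. The server names and the strip_server helper are hypothetical and only illustrate the comparison; this is not the RegionMover code:

```shell
#!/bin/sh
# Hypothetical server list in host,port,startcode form, using FQDNs.
servers="rs1.example.com,16020,1613000000000
rs2.example.com,16020,1613000000001"

# Print every server whose host part differs from the host given as $1.
strip_server() {
  printf '%s\n' "$servers" | while IFS= read -r server; do
    host=${server%%,*}   # text before the first comma
    if [ "$host" != "$1" ]; then
      printf '%s\n' "$server"
    fi
  done
}

# A short name never equals an FQDN, so nothing is filtered out.
short_targets=$(strip_server "rs1")
# The FQDN matches, so the host is removed from the target list.
fqdn_targets=$(strip_server "rs1.example.com")

printf 'targets with short name:\n%s\n' "$short_targets"
printf 'targets with FQDN:\n%s\n' "$fqdn_targets"
```

With the short name, the host being unloaded stays in the target list, which is the failure described above.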

The same happens with branch-1 region_mover.rb, which is where I reproduced 
it in my environment: 
 
[https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305]
 
[https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L175]
 
[https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L186-L192]

 

This can be fixed just by using "/bin/hostname -f" in the graceful_stop.sh 
script.
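
A rough sketch of the idea, assuming the script decides whether it is running locally by comparing its host argument to the local hostname. The is_local_host function and the default target are hypothetical names for illustration, not the actual graceful_stop.sh code:

```shell
#!/bin/sh
# Succeed if the host given as $1 equals this machine's fully qualified name.
is_local_host() {
  # -f asks hostname for the FQDN; fall back to the plain name if that fails.
  local_name=$(hostname -f 2>/dev/null) || local_name=$(hostname)
  [ "$1" = "$local_name" ]
}

# Hypothetical default target, used when no argument is given.
target="${1:-rs1.example.com}"

if is_local_host "$target"; then
  echo "$target is the local host"
else
  echo "$target is a remote host"
fi
```

Because the local name is then an FQDN, the same string can be matched against the FQDNs reported for the cluster's RegionServers.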

I will provide a patch soon.

  was:
We usually use graceful_stop.sh from the Master to restart RegionServers. 
However, in some scenarios we may not have privileges to restart remote 
RegionServers (it uses ssh).
 But we can still use graceful_stop.sh on the same host we want to restart.

In order to detect the execution at localhost, graceful_stop.sh uses 
/bin/hostname.
 
[https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/bin/graceful_stop.sh#L106-L110]

When RegionMover strips the host to not include it in the list of target hosts, 
we filter it out by checking all RegionServer hosts in the cluster:
 
[https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L382-L384]
 
[https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L692]

But the RegionServer hosts returned by Admin#getRegionServers are FQDNs, 
while the hostname provided by graceful_stop.sh is not an FQDN, so the 
comparison fails.

Same happens for branch-1 region_mover.rb, which is the place I reproduced in 
my environment: 
 
[https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305
https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L175|https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305]
 
[https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L186-L192]

 

This can be fixed just by using "/bin/hostname -f" in the graceful_stop.sh 
script.

Will provide patch soon.


> graceful_stop.sh fails to unload regions when ran at localhost
> --------------------------------------------------------------
>
>                 Key: HBASE-25594
>                 URL: https://issues.apache.org/jira/browse/HBASE-25594
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha-1, 1.4.13
>            Reporter: Javier Akira Luca de Tena
>            Assignee: Javier Akira Luca de Tena
>            Priority: Minor
>
> We usually use graceful_stop.sh from the Master to restart RegionServers. 
> However, in some scenarios we may not have privileges to restart remote 
> RegionServers (it uses ssh).
>  But we can still use graceful_stop.sh on the same host we want to restart.
> To detect that it is running on the local host, graceful_stop.sh uses 
> /bin/hostname.
>  
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/bin/graceful_stop.sh#L106-L110]
> To keep the host being unloaded out of the list of target hosts, RegionMover 
> filters it out by checking it against every RegionServer host in the cluster:
>  
> [https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L382-L384]
>  
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L692]
> But the RegionServer hosts returned by Admin#getRegionServers are FQDNs, 
> while the hostname provided by graceful_stop.sh is not an FQDN, so the 
> comparison fails.
> The same happens with branch-1 region_mover.rb, which is where I reproduced 
> it in my environment: 
>  
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305]
>  
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L175]
>  
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L186-L192]
>  
> This can be fixed just by using "/bin/hostname -f" in the graceful_stop.sh 
> script.
> I will provide a patch soon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
