[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212550#comment-16212550 ]

Hudson commented on HBASE-10367:

FAILURE: Integrated in Jenkins build HBase-2.0 #720 (See [https://builds.apache.org/job/HBase-2.0/720/])
HBASE-10367 RegionServer graceful stop / decommissioning (jerryjch: rev 75d2bba73969d84834f5cf15560ad0341af31d48)
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncDecommissionAdminApi.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/Admin.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/DrainingServerTracker.java
* (edit) bin/draining_servers.rb
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncHBaseAdmin.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
* (edit) hbase-protocol-shaded/src/main/protobuf/Master.proto
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/RawAsyncHBaseAdmin.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperACL.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncAdmin.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (delete) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncDrainAdminApi.java

> RegionServer graceful stop / decommissioning
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
> Fix For: 3.0.0, 2.0.0-alpha-4
> Attachments: HBASE-10367-master-2.patch, HBASE-10367-master.patch, HBASE-10367-master.patch
>
> Right now, we have a weird way of node decommissioning / graceful stop, which
> is a graceful_stop.sh bash script, and a region_mover ruby script, and some
> draining server support which you have to manually write to a znode
> (really!). Also draining servers is only partially supported in LB operations
> (LB does take that into account for roundRobin assignment, but not for normal
> balance).
> See http://hbase.apache.org/book/node.management.html and HBASE-3071
> I think we should support graceful stop as a first class citizen. Thinking
> about it, it seems that the difference between regionserver stop and graceful
> stop is that regionserver stop will close the regions, but the master will
> only assign them after the znode is deleted.
> In the new master design (or even before), if we allow RS to be able to close
> regions on its own (without master initiating it), then graceful stop becomes
> regular stop. The RS already closes the regions cleanly, and will reject new
> region assignments, so that we don't need much of the balancer or draining
> server trickery.
> This ties into the new master/AM redesign (HBASE-5487), but still deserves
> its own jira. Let's use this to brainstorm on the design.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
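The issue description says the load balancer takes draining servers into account for round-robin assignment only. As a rough picture of what that means for round-robin placement, here is a minimal Java sketch. The class DrainingAwareRoundRobin, its assign method, and the plain-string server and region names are all hypothetical; this is not HBase's balancer code, just an illustration of skipping draining servers when handing out regions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not the HBase load balancer. */
final class DrainingAwareRoundRobin {

  /**
   * Hands out regions round-robin over the servers that are not draining.
   * Draining servers never appear in the returned plan.
   */
  static Map<String, List<String>> assign(
      List<String> regions, List<String> servers, Set<String> draining) {
    // Keep only servers that are allowed to receive new regions.
    List<String> targets = new ArrayList<>();
    for (String server : servers) {
      if (!draining.contains(server)) {
        targets.add(server);
      }
    }
    if (targets.isEmpty()) {
      throw new IllegalStateException("no non-draining servers available");
    }

    // LinkedHashMap keeps the server order stable for readability.
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : targets) {
      plan.put(server, new ArrayList<>());
    }

    // Classic round-robin: region i goes to target (i mod number of targets).
    for (int i = 0; i < regions.size(); i++) {
      String server = targets.get(i % targets.size());
      plan.get(server).add(regions.get(i));
    }
    return plan;
  }
}
```

In this sketch a draining server simply never becomes a target. The description's complaint is that a normal balance run did not apply the same filter.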
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212412#comment-16212412 ]

Hudson commented on HBASE-10367:

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3920 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3920/])
HBASE-10367 RegionServer graceful stop / decommissioning (jerryjch: rev a43a00e89c5c99968a205208ab9a5307c89730b3)
* (edit) bin/draining_servers.rb
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/RawAsyncHBaseAdmin.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncAdmin.java
* (edit) hbase-protocol-shaded/src/main/protobuf/Master.proto
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncDecommissionAdminApi.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncHBaseAdmin.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java
* (delete) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncDrainAdminApi.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/Admin.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/DrainingServerTracker.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperACL.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212270#comment-16212270 ]

Anoop Sam John commented on HBASE-10367:

Yes, I just saw another JIRA where the AC hooks are added. Thanks for the explanation. Makes sense.
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212231#comment-16212231 ]

Jerry He commented on HBASE-10367:

Yes, the AC hooks are added.
For the recommission, the region list is optional. (I tried to use Optional, but in the other JIRA we try to avoid using Optional as a parameter.) It is up to the user/caller. If it is not provided, then no regions are moved.
decommission and recommission need to pair up. A normal graceful stop sequence would be:
0. Get the regions for the region server.
1. decommission call with offloading of the regions.
2. Stop the region server.
3. Start the region server (after patching, e.g.).
4. recommission call with the last list of regions.
The current graceful-stop script does similar steps, but with the manual region mover. We can get it to use the new APIs.
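The graceful stop sequence in Jerry He's comment (steps 0 to 4) can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the patch itself: DecommissionAdmin and ServerProcess are hypothetical stand-ins that use plain strings, whereas the real Admin methods discussed in this thread take ServerName and a list of encoded region names. RecordingCluster is an in-memory fake that only records the order of calls and moves no regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the decommission/recommission Admin calls. */
interface DecommissionAdmin {
  List<String> getRegions(String server);
  void decommissionRegionServers(List<String> servers, boolean offload);
  void recommissionRegionServer(String server, List<String> encodedRegionNames);
}

/** Hypothetical hook for stopping and starting the region server process. */
interface ServerProcess {
  void stop(String server);
  void start(String server);
}

/** In-memory fake: records the order of calls, moves no real regions. */
final class RecordingCluster implements DecommissionAdmin, ServerProcess {
  final List<String> calls = new ArrayList<>();
  private final List<String> hosted;

  RecordingCluster(List<String> hosted) {
    this.hosted = new ArrayList<>(hosted);
  }

  @Override public List<String> getRegions(String server) {
    calls.add("getRegions");
    return new ArrayList<>(hosted);
  }

  @Override public void decommissionRegionServers(List<String> servers, boolean offload) {
    calls.add("decommission offload=" + offload);
  }

  @Override public void recommissionRegionServer(String server, List<String> encodedRegionNames) {
    calls.add("recommission regions=" + encodedRegionNames.size());
  }

  @Override public void stop(String server) { calls.add("stop"); }

  @Override public void start(String server) { calls.add("start"); }
}

final class GracefulStopSketch {

  /** Runs steps 0 to 4 and returns the region list remembered in step 0. */
  static List<String> gracefulRestart(
      DecommissionAdmin admin, ServerProcess process, String server) {
    List<String> regions = admin.getRegions(server);                           // step 0
    admin.decommissionRegionServers(Collections.singletonList(server), true);  // step 1
    process.stop(server);                                                      // step 2
    process.start(server);                                                     // step 3
    admin.recommissionRegionServer(server, regions);                           // step 4
    return regions;
  }

  public static void main(String[] args) {
    RecordingCluster cluster = new RecordingCluster(Arrays.asList("r1", "r2"));
    gracefulRestart(cluster, cluster, "rs1");
    cluster.calls.forEach(System.out::println);
  }
}
```

Passing the list captured in step 0 to the recommission call is what pairs the two operations. Per the comment, a caller that does not supply a region list gets the server recommissioned with no regions moved back.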
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212181#comment-16212181 ]

Anoop Sam John commented on HBASE-10367:

Sorry, I did not check this patch earlier. I am just seeing the release notes now. What about the access control mechanism for the new APIs? When AC is present, will we check?
bq. void recommissionRegionServer(ServerName server, List<byte[]> encodedRegionNames)
The mentioned regions will get moved to this server eventually. What if the previous decommission did not remove the regions from that server? Will still more regions come in?
Sorry for asking late.
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207030#comment-16207030 ]

stack commented on HBASE-10367:

+1 on patch. Nice. Make a nice release note [~jerryhe] for this nice addition so others find it.

> RegionServer graceful stop / decommissioning
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
> Fix For: 2.0.0-beta-1
> Attachments: HBASE-10367-master-2.patch, HBASE-10367-master.patch, HBASE-10367-master.patch
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206970#comment-16206970 ]

Jerry He commented on HBASE-10367:

[~stack] are you good with the patch? Any more comments? Comments from others?
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203052#comment-16203052 ]

Hadoop QA commented on HBASE-10367:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 9s | master passed |
| +1 | compile | 5m 23s | master passed |
| +1 | checkstyle | 0m 34s | master passed |
| +1 | mvneclipse | 2m 33s | master passed |
| +1 | shadedjars | 4m 3s | branch has no errors when building our shaded downstream artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 5m 29s | master passed |
| +1 | javadoc | 3m 46s | master passed |
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 18s | the patch passed |
| +1 | compile | 5m 18s | the patch passed |
| +1 | cc | 5m 18s | the patch passed |
| +1 | javac | 5m 18s | the patch passed |
| +1 | checkstyle | 0m 32s | the patch passed |
| +1 | mvneclipse | 2m 42s | the patch passed |
| +1 | rubocop | 0m 3s | There were no new rubocop issues. |
| +1 | ruby-lint | 0m 1s | There were no new ruby-lint issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 8s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 37m 59s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 3m 41s | the patch passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 5m 31s | the patch passed |
| +1 | javadoc | 3m 36s | the patch passed |
| +1 | unit | 0m 27s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 31s | hbase-client in the patch passed. |
| +1 | unit | 92m 19s |
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203034#comment-16203034 ]

Hadoop QA commented on HBASE-10367:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 13s | Docker failed to build yetus/hbase:5d60123. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-10367 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891968/HBASE-10367-master-2.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/9090/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202755#comment-16202755 ]

stack commented on HBASE-10367:

Yes. You can change unreleased APIs [~jerryhe].

> RegionServer graceful stop / decommissioning
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
> Fix For: 2.0.0-beta-1
> Attachments: HBASE-10367-master.patch, HBASE-10367-master.patch
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202741#comment-16202741 ]

Hadoop QA commented on HBASE-10367:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 39s | master passed |
| +1 | compile | 5m 46s | master passed |
| +1 | checkstyle | 0m 45s | master passed |
| +1 | mvneclipse | 2m 51s | master passed |
| +1 | shadedjars | 4m 31s | branch has no errors when building our shaded downstream artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 6m 3s | master passed |
| +1 | javadoc | 4m 28s | master passed |
| 0 | mvndep | 0m 23s | Maven dependency ordering for patch |
| +1 | mvninstall | 6m 23s | the patch passed |
| +1 | compile | 5m 44s | the patch passed |
| +1 | cc | 5m 44s | the patch passed |
| +1 | javac | 5m 44s | the patch passed |
| +1 | checkstyle | 0m 36s | the patch passed |
| +1 | mvneclipse | 2m 54s | the patch passed |
| -1 | rubocop | 0m 5s | The patch generated 1 new + 40 unchanged - 0 fixed = 41 total (was 40) |
| +1 | ruby-lint | 0m 1s | There were no new ruby-lint issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 44s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 42m 32s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 3m 25s | the patch passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 5m 43s | the patch passed |
| -1 | javadoc | 0m 30s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | javadoc | 2m 42s | root generated 1 new + 25 unchanged - 0 fixed = 26 total (was 25) |
| +1 | unit | 0m 28s | hbase-protocol-shaded in the patch passed. |
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202369#comment-16202369 ]

Jerry He commented on HBASE-10367:
----------------------------------

These Admin APIs have only shipped in the 2.0 alpha releases so far, so I assume we are still OK to change them?
Good point on passing the servers being decommissioned to the coprocessor. I had not thought about it. It sounds reasonable.

> RegionServer graceful stop / decommissioning
> --------------------------------------------
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-10367-master.patch, HBASE-10367-master.patch
>
>
> Right now, we have a weird way of node decommissioning / graceful stop, which
> is a graceful_stop.sh bash script, and a region_mover ruby script, and some
> draining server support which you have to manually write to a znode
> (really!). Also, draining servers are only partially supported in LB operations
> (LB does take that into account for roundRobin assignment, but not for normal
> balance).
> See
> http://hbase.apache.org/book/node.management.html and HBASE-3071
> I think we should support graceful stop as a first class citizen. Thinking
> about it, it seems that the difference between regionserver stop and graceful
> stop is that regionserver stop will close the regions, but the master will
> only assign them after the znode is deleted.
> In the new master design (or even before), if we allow RS to be able to close
> regions on its own (without master initiating it), then graceful stop becomes
> regular stop. The RS already closes the regions cleanly, and will reject new
> region assignments, so that we don't need much of the balancer or draining
> server trickery.
> This ties into the new master/AM redesign (HBASE-5487), but still deserves
> its own jira. Let's use this to brainstorm on the design.
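The issue description quoted above says the load balancer takes draining servers into account for round-robin assignment. As a rough illustration of that idea only, here is a toy Java model that skips draining servers when handing out regions. It is a sketch under stated assumptions, not HBase's balancer code: the class `DrainingAwareAssigner`, the method `roundRobin`, and the plain-string server and region names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: not HBase's balancer. All names here are hypothetical. */
final class DrainingAwareAssigner {

  /**
   * Assigns regions round-robin across the servers that are not draining.
   * Draining servers get no entry in the returned plan.
   */
  static Map<String, List<String>> roundRobin(
      List<String> regions, List<String> servers, Set<String> draining) {
    // Keep only the servers that are allowed to receive new regions.
    List<String> eligible = new ArrayList<>();
    for (String server : servers) {
      if (!draining.contains(server)) {
        eligible.add(server);
      }
    }
    if (eligible.isEmpty()) {
      throw new IllegalStateException("no non-draining server available");
    }
    // Preserve server order so the plan is deterministic.
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : eligible) {
      plan.put(server, new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      plan.get(eligible.get(i % eligible.size())).add(regions.get(i));
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, List<String>> plan = roundRobin(
        List.of("r1", "r2", "r3", "r4"),
        List.of("rs1", "rs2", "rs3"),
        Set.of("rs2")); // rs2 is draining and must receive nothing
    System.out.println(plan);
  }
}
```

The gap the description points at is that only this assignment path honors the draining set; a normal balance run would need the same filter to avoid moving regions back onto a draining server.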
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202231#comment-16202231 ]

stack commented on HBASE-10367:
-------------------------------

Patch is looking nice, [~jerryhe].
You rename Admin methods without deprecation; what are you thinking? Has the API not shipped in a release?
Is there a reason you do not pass the names of the servers being decommissioned to the coprocessor? I'd think the CP would be interested in which server is being changed.
Otherwise, looking good.

> RegionServer graceful stop / decommissioning
> --------------------------------------------
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
> Fix For: 3.0.0, 2.0.0-alpha-4
>
> Attachments: HBASE-10367-master.patch
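To make the review question about the coprocessor concrete, here is a minimal Java sketch of a pre-decommission hook that receives the names of the affected servers. Everything in it is hypothetical and for illustration only: `DecommissionHost`, the nested `Observer` interface, `preDecommission`, and the `offload` flag are assumptions, not the HBase coprocessor API or the signatures in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the master side that drives the hooks. Not the HBase API. */
final class DecommissionHost {

  /** Hypothetical hook, loosely modeled on a master coprocessor callback. */
  interface Observer {
    /** Called before the given servers are marked as decommissioned. */
    void preDecommission(List<String> serverNames, boolean offload);
  }

  private final List<Observer> observers = new ArrayList<>();
  private final List<String> decommissioned = new ArrayList<>();

  void register(Observer observer) {
    observers.add(observer);
  }

  /** Notifies every observer with the affected server names, then records them. */
  void decommission(List<String> serverNames, boolean offload) {
    List<String> snapshot = List.copyOf(serverNames); // observers get an immutable copy
    for (Observer observer : observers) {
      observer.preDecommission(snapshot, offload);
    }
    decommissioned.addAll(snapshot);
  }

  List<String> listDecommissioned() {
    return List.copyOf(decommissioned);
  }

  public static void main(String[] args) {
    DecommissionHost host = new DecommissionHost();
    host.register((names, offload) ->
        System.out.println("about to decommission " + names + ", offload=" + offload));
    host.decommission(List.of("rs1.example.com", "rs2.example.com"), true);
    System.out.println(host.listDecommissioned());
  }
}
```

Passing the server list lets an observer audit or veto a specific decommission request, which a hook with no arguments could not do.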
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201645#comment-16201645 ]

Hadoop QA commented on HBASE-10367:
-----------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 34s | master passed |
| +1 | compile | 5m 25s | master passed |
| +1 | checkstyle | 0m 42s | master passed |
| +1 | mvneclipse | 2m 51s | master passed |
| +1 | shadedjars | 4m 22s | branch has no errors when building our shaded downstream artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 5m 52s | master passed |
| +1 | javadoc | 4m 6s | master passed |
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 59s | the patch passed |
| +1 | compile | 5m 17s | the patch passed |
| +1 | cc | 5m 17s | the patch passed |
| +1 | javac | 5m 17s | the patch passed |
| +1 | checkstyle | 0m 33s | the patch passed |
| +1 | mvneclipse | 2m 37s | the patch passed |
| -1 | rubocop | 0m 3s | The patch generated 1 new + 40 unchanged - 0 fixed = 41 total (was 40) |
| +1 | ruby-lint | 0m 1s | There were no new ruby-lint issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 21s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 41m 46s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 3m 20s | the patch passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 5m 19s | the patch passed |
| -1 | javadoc | 0m 27s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | javadoc | 2m 33s | root generated 1 new + 25 unchanged - 0 fixed = 26 total (was 25) |
| +1 | unit | 0m 26s | hbase-protocol-shaded in the patch passed. |
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201377#comment-16201377 ]

Jerry He commented on HBASE-10367:
----------------------------------

Continuing the work from HBASE-16010; see the related comments on that issue.
Attached a patch to do 'decommission'.

> RegionServer graceful stop / decommissioning
> --------------------------------------------
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
> Assignee: Jerry He
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784771#comment-15784771 ]

Lars George commented on HBASE-10367:
-------------------------------------

The state has not changed, and the users I talk to struggle with this mess way too often. We need to have a clear story on this. Voting up!

> RegionServer graceful stop / decommissioning
> --------------------------------------------
>
> Key: HBASE-10367
> URL: https://issues.apache.org/jira/browse/HBASE-10367
> Project: HBase
> Issue Type: Improvement
> Reporter: Enis Soztutar
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13874165#comment-13874165 ]

Nick Dimiduk commented on HBASE-10367:
--------------------------------------

The usability of graceful stop just came up in a conversation I had today as well. Thanks for bringing it up, @enis!

RegionServer graceful stop / decommissioning
--------------------------------------------

Key: HBASE-10367
URL: https://issues.apache.org/jira/browse/HBASE-10367
Project: HBase
Issue Type: Improvement
Reporter: Enis Soztutar