[GitHub] [hbase] jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck when a rsgroup has no online servers but AM…

2019-07-08 Thread GitBox
jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck 
when a rsgroup has no online servers but AM…
URL: https://github.com/apache/hbase/pull/354#discussion_r301347251
 
 

 ##
 File path: 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsKillRS.java
 ##
 @@ -131,7 +143,84 @@ public boolean evaluate() throws Exception {
 });
 
 ServerName targetServer1 = getServerName(newServers.iterator().next());
-Assert.assertEquals(1, admin.getRegions(targetServer1).size());
-Assert.assertEquals(tableName, 
admin.getRegions(targetServer1).get(0).getTable());
+assertEquals(1, admin.getRegions(targetServer1).size());
+assertEquals(tableName, admin.getRegions(targetServer1).get(0).getTable());
+  }
+
+  @Test
+  public void testKillAllRSInGroup() throws Exception {
+// create a rsgroup and move one regionserver to it
+String groupName = "my_group";
+int groupRSCount = 2;
+addGroup(groupName, groupRSCount);
+
+// create a table, and move it to my_group
+Table t = TEST_UTIL.createMultiRegionTable(tableName, Bytes.toBytes("f"), 
5);
+TEST_UTIL.loadTable(t, Bytes.toBytes("f"));
+Set toAddTables = new HashSet<>();
+toAddTables.add(tableName);
+rsGroupAdmin.moveTables(toAddTables, groupName);
+
assertTrue(rsGroupAdmin.getRSGroupInfo(groupName).getTables().contains(tableName));
+TEST_UTIL.waitTableAvailable(tableName, 3);
+
+// check my_group servers and table regions
+Set servers = rsGroupAdmin.getRSGroupInfo(groupName).getServers();
+assertEquals(2, servers.size());
+LOG.debug("group servers {}", servers);
+for (RegionInfo tr :
+
master.getAssignmentManager().getRegionStates().getRegionsOfTable(tableName)) {
+  assertTrue(servers.contains(
+  
master.getAssignmentManager().getRegionStates().getRegionAssignments()
+  .get(tr).getAddress()));
+}
+
+// move all table regions on one group server to another
+// these codes are aimed to make 'lastHost' in my_group
+// and check if table regions are online
+List gsn = new ArrayList<>();
+for(Address addr : servers){
+  gsn.add(getServerName(addr));
+}
+assertEquals(2, gsn.size());
+for(Map.Entry entry :
+
master.getAssignmentManager().getRegionStates().getRegionAssignments().entrySet()){
+  if(entry.getKey().getTable().equals(tableName)){
+LOG.debug("move region {}", entry.getKey().getRegionNameAsString());
+TEST_UTIL.moveRegionAndWait(entry.getKey(), gsn.get(1 - 
gsn.indexOf(entry.getValue())));
+  }
+}
+TEST_UTIL.waitTableAvailable(tableName, 3);
+
+// case 1: stop all the regionservers in my_group, and restart a 
regionserver in my_group,
+// and then check if all table regions are online
+for(Address addr : rsGroupAdmin.getRSGroupInfo(groupName).getServers()) {
+  TEST_UTIL.getMiniHBaseCluster().stopRegionServer(getServerName(addr));
+}
+// better wait for a while for region reassign
+sleep(1);
+assertEquals(NUM_SLAVES_BASE - gsn.size(),
+TEST_UTIL.getMiniHBaseCluster().getLiveRegionServerThreads().size());
+TEST_UTIL.getMiniHBaseCluster().startRegionServer(gsn.get(0).getHostname(),
+gsn.get(0).getPort());
+assertEquals(NUM_SLAVES_BASE - gsn.size() + 1,
+TEST_UTIL.getMiniHBaseCluster().getLiveRegionServerThreads().size());
+TEST_UTIL.waitTableAvailable(tableName, 3);
+
+// case 2: stop all the regionservers in my_group, and move another
+// regionserver(in 'default' group) to my_group, and then check if all 
table regions are online
 
 Review comment:
   // regionserver(from the 'default' group)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck when a rsgroup has no online servers but AM…

2019-07-08 Thread GitBox
jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck 
when a rsgroup has no online servers but AM…
URL: https://github.com/apache/hbase/pull/354#discussion_r301347171
 
 

 ##
 File path: 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsKillRS.java
 ##
 @@ -131,7 +143,84 @@ public boolean evaluate() throws Exception {
 });
 
 ServerName targetServer1 = getServerName(newServers.iterator().next());
-Assert.assertEquals(1, admin.getRegions(targetServer1).size());
-Assert.assertEquals(tableName, 
admin.getRegions(targetServer1).get(0).getTable());
+assertEquals(1, admin.getRegions(targetServer1).size());
+assertEquals(tableName, admin.getRegions(targetServer1).get(0).getTable());
+  }
+
+  @Test
+  public void testKillAllRSInGroup() throws Exception {
+// create a rsgroup and move one regionserver to it
+String groupName = "my_group";
+int groupRSCount = 2;
+addGroup(groupName, groupRSCount);
+
+// create a table, and move it to my_group
+Table t = TEST_UTIL.createMultiRegionTable(tableName, Bytes.toBytes("f"), 
5);
+TEST_UTIL.loadTable(t, Bytes.toBytes("f"));
+Set toAddTables = new HashSet<>();
+toAddTables.add(tableName);
+rsGroupAdmin.moveTables(toAddTables, groupName);
+
assertTrue(rsGroupAdmin.getRSGroupInfo(groupName).getTables().contains(tableName));
+TEST_UTIL.waitTableAvailable(tableName, 3);
+
+// check my_group servers and table regions
+Set servers = rsGroupAdmin.getRSGroupInfo(groupName).getServers();
+assertEquals(2, servers.size());
+LOG.debug("group servers {}", servers);
+for (RegionInfo tr :
+
master.getAssignmentManager().getRegionStates().getRegionsOfTable(tableName)) {
+  assertTrue(servers.contains(
+  
master.getAssignmentManager().getRegionStates().getRegionAssignments()
+  .get(tr).getAddress()));
+}
+
+// move all table regions on one group server to another
+// these codes are aimed to make 'lastHost' in my_group
+// and check if table regions are online
+List gsn = new ArrayList<>();
+for(Address addr : servers){
+  gsn.add(getServerName(addr));
+}
+assertEquals(2, gsn.size());
+for(Map.Entry entry :
+
master.getAssignmentManager().getRegionStates().getRegionAssignments().entrySet()){
+  if(entry.getKey().getTable().equals(tableName)){
+LOG.debug("move region {}", entry.getKey().getRegionNameAsString());
 
 Review comment:
   LOG.debug("move region {} from {} to {}", 
entry.getKey().getRegionNameAsString(), fromServer, toServer);
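
The quoted diff does not define `fromServer` or `toServer`. A minimal sketch of how the suggested log line could be built, assuming `fromServer` is the region's current server (`entry.getValue()`) and `toServer` is the other member of the two-server group (`gsn.get(1 - gsn.indexOf(...))`). Plain strings stand in for `RegionInfo` and `ServerName`, and the class and method names are hypothetical, not part of the PR:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; strings replace HBase's RegionInfo and ServerName.
public class SwapLogSketch {

  // Returns the other member of a two-server group,
  // mirroring gsn.get(1 - gsn.indexOf(current)) from the diff.
  static String otherServer(List<String> groupServers, String current) {
    int idx = groupServers.indexOf(current);
    if (groupServers.size() != 2 || idx < 0) {
      throw new IllegalArgumentException(
          "expected a two-server group containing " + current);
    }
    return groupServers.get(1 - idx);
  }

  // Builds one message per region in the shape the review comment suggests.
  static List<String> describeMoves(Map<String, String> assignments,
      List<String> groupServers) {
    List<String> lines = new ArrayList<>();
    for (Map.Entry<String, String> entry : assignments.entrySet()) {
      String fromServer = entry.getValue();                     // assumed meaning
      String toServer = otherServer(groupServers, fromServer);  // assumed meaning
      lines.add(String.format("move region %s from %s to %s",
          entry.getKey(), fromServer, toServer));
    }
    return lines;
  }

  public static void main(String[] args) {
    List<String> gsn = Arrays.asList("rs1", "rs2");
    Map<String, String> assignments = new LinkedHashMap<>();
    assignments.put("regionA", "rs1");
    assignments.put("regionB", "rs2");
    describeMoves(assignments, gsn).forEach(System.out::println);
  }
}
```

The `1 - index` lookup only works for a group of exactly two servers, which is why the sketch rejects any other input rather than returning an arbitrary server.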




[GitHub] [hbase] jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck when a rsgroup has no online servers but AM…

2019-07-08 Thread GitBox
jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck 
when a rsgroup has no online servers but AM…
URL: https://github.com/apache/hbase/pull/354#discussion_r301347039
 
 

 ##
 File path: 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsKillRS.java
 ##
 @@ -131,7 +143,84 @@ public boolean evaluate() throws Exception {
 });
 
 ServerName targetServer1 = getServerName(newServers.iterator().next());
-Assert.assertEquals(1, admin.getRegions(targetServer1).size());
-Assert.assertEquals(tableName, 
admin.getRegions(targetServer1).get(0).getTable());
+assertEquals(1, admin.getRegions(targetServer1).size());
+assertEquals(tableName, admin.getRegions(targetServer1).get(0).getTable());
+  }
+
+  @Test
+  public void testKillAllRSInGroup() throws Exception {
+// create a rsgroup and move one regionserver to it
+String groupName = "my_group";
+int groupRSCount = 2;
+addGroup(groupName, groupRSCount);
+
+// create a table, and move it to my_group
+Table t = TEST_UTIL.createMultiRegionTable(tableName, Bytes.toBytes("f"), 
5);
+TEST_UTIL.loadTable(t, Bytes.toBytes("f"));
+Set toAddTables = new HashSet<>();
+toAddTables.add(tableName);
+rsGroupAdmin.moveTables(toAddTables, groupName);
+
assertTrue(rsGroupAdmin.getRSGroupInfo(groupName).getTables().contains(tableName));
+TEST_UTIL.waitTableAvailable(tableName, 3);
+
+// check my_group servers and table regions
+Set servers = rsGroupAdmin.getRSGroupInfo(groupName).getServers();
+assertEquals(2, servers.size());
+LOG.debug("group servers {}", servers);
+for (RegionInfo tr :
+
master.getAssignmentManager().getRegionStates().getRegionsOfTable(tableName)) {
+  assertTrue(servers.contains(
+  
master.getAssignmentManager().getRegionStates().getRegionAssignments()
+  .get(tr).getAddress()));
+}
+
+// move all table regions on one group server to another
 
 Review comment:
   //Swap the region locations (i.e. rs1 regions to rs2 & vice versa)
   // these codes are ...




[GitHub] [hbase] jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck when a rsgroup has no online servers but AM…

2019-07-08 Thread GitBox
jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck 
when a rsgroup has no online servers but AM…
URL: https://github.com/apache/hbase/pull/354#discussion_r301346868
 
 

 ##
 File path: 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsKillRS.java
 ##
 @@ -131,7 +143,84 @@ public boolean evaluate() throws Exception {
 });
 
 ServerName targetServer1 = getServerName(newServers.iterator().next());
-Assert.assertEquals(1, admin.getRegions(targetServer1).size());
-Assert.assertEquals(tableName, 
admin.getRegions(targetServer1).get(0).getTable());
+assertEquals(1, admin.getRegions(targetServer1).size());
+assertEquals(tableName, admin.getRegions(targetServer1).get(0).getTable());
+  }
+
+  @Test
+  public void testKillAllRSInGroup() throws Exception {
+// create a rsgroup and move one regionserver to it
 
 Review comment:
   *two regionservers ?




[GitHub] [hbase] jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck when a rsgroup has no online servers but AM…

2019-07-03 Thread GitBox
jatsakthi commented on a change in pull request #354: HBASE-20368 Fix RIT stuck 
when a rsgroup has no online servers but AM…
URL: https://github.com/apache/hbase/pull/354#discussion_r300192578
 
 

 ##
 File path: 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsKillRS.java
 ##
 @@ -131,7 +139,68 @@ public boolean evaluate() throws Exception {
 });
 
 ServerName targetServer1 = getServerName(newServers.iterator().next());
-Assert.assertEquals(1, admin.getRegions(targetServer1).size());
-Assert.assertEquals(tableName, 
admin.getRegions(targetServer1).get(0).getTable());
+assertEquals(1, admin.getRegions(targetServer1).size());
+assertEquals(tableName, admin.getRegions(targetServer1).get(0).getTable());
+  }
+
+  @Test
+  public void testKillAllRSInGroupAndThenAddNew() throws Exception {
+// create a rsgroup and move one regionserver to it
+String groupName = "my_group";
+int groupRSCount = 1;
+RSGroupInfo rsGroupInfo = addGroup(groupName, groupRSCount);
+
+// create a multi-region table, and move it to my_group
+TEST_UTIL.createMultiRegionTable(tableName, Bytes.toBytes("f"), 5);
+Set toAddTables = new HashSet<>();
+toAddTables.add(tableName);
+rsGroupAdmin.moveTables(toAddTables, groupName);
+
assertTrue(rsGroupAdmin.getRSGroupInfo(groupName).getTables().contains(tableName));
+
+// check my_group servers and if regions are online
+Set servers = rsGroupInfo.getServers();
+ServerName myGroupRS = null;
+for (int i = 0; i < NUM_SLAVES_BASE; ++i) {
+  ServerName sn = 
TEST_UTIL.getMiniHBaseCluster().getRegionServer(i).getServerName();
+  if (servers.contains(sn.getAddress())) {
+myGroupRS = sn;
+break;
+  }
+}
+assertNotNull(myGroupRS);
+checkRegionsOnline(tableName, true);
+
+// stop the regionserver in my_group, and table regions will be offline
+TEST_UTIL.getMiniHBaseCluster().stopRegionServer(myGroupRS);
+// better wait for a while for region reassign
+sleep(1);
+
assertEquals(TEST_UTIL.getMiniHBaseCluster().getLiveRegionServerThreads().size(),
+NUM_SLAVES_BASE - servers.size());
+checkRegionsOnline(tableName, false);
+
+// move another regionserver to the my_group
+// in this case, moving another region server can be replaced by 
restarting the regionserver
+// mentioned before
+RSGroupInfo defaultInfo = 
rsGroupAdmin.getRSGroupInfo(RSGroupInfo.DEFAULT_GROUP);
+Set set = new HashSet<>();
+for (Address server : defaultInfo.getServers()) {
+  if (set.size() == groupRSCount) {
+break;
+  }
+  set.add(server);
+}
+rsGroupAdmin.moveServers(set, groupName);
 
 Review comment:
   @infraio This is the line where the following exception is thrown when run without the changes:
   
   > Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.constraint.ConstraintException):
   >  org.apache.hadoop.hbase.constraint.ConstraintException: Target RSGroup my_group is same as source Name:my_group,  Servers:[192.168.0.69:62102, 192.168.0.69:62105],  Tables:[Group_testKillAllRSInGroupAndThenAddNew] RSGroup.
   >   at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveServers(RSGroupAdminServer.java:298)
   >   at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.moveServers(RSGroupAdminEndpoint.java:218)
   >   at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13870)
   >   at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:889)
   >   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
   >   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:374)
   >   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
   >   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
   >   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)



