[jira] [Commented] (HBASE-17306) IntegrationTestRSGroup#testRegionMove may fail due to region server not online

Francis Liu (JIRA) Tue, 20 Dec 2016 19:07:34 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765953#comment-15765953
 ]


Francis Liu commented on HBASE-17306:
-------------------------------------

{quote}
Francis Liu:
Can you give us some background on the above requirement?
{quote}
You can't move a regionserver in the default group that's not online. 
Membership in the default group is dynamic (all online regionservers that are 
not members of any other group), so there is no way to determine whether an 
offline RS being moved is a valid RS or not, which would just lead to more 
problems.
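
To make the rule concrete, here is a minimal sketch of how default-group membership can be derived rather than stored. It assumes servers are plain host:port strings; the class and method names are hypothetical and this is not the actual RSGroupAdminServer code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not taken from the HBase source.
final class DefaultGroupSketch {

    // Default group = every online server that is not assigned to a named group.
    static Set<String> defaultGroupMembers(Set<String> onlineServers,
                                           Map<String, Set<String>> namedGroups) {
        Set<String> members = new TreeSet<>(onlineServers);
        for (Set<String> groupServers : namedGroups.values()) {
            members.removeAll(groupServers);
        }
        return members;
    }

    // An offline server is never in the computed set, so it cannot be
    // validated as a member of the default group and the move is refused.
    static boolean canMoveFromDefault(String server,
                                      Set<String> onlineServers,
                                      Map<String, Set<String>> namedGroups) {
        return defaultGroupMembers(onlineServers, namedGroups).contains(server);
    }

    public static void main(String[] args) {
        Set<String> online = new HashSet<>(Arrays.asList("rs1:16020", "rs2:16020"));
        Map<String, Set<String>> groups = new HashMap<>();
        groups.put("g1", new HashSet<>(Arrays.asList("rs2:16020")));

        System.out.println(canMoveFromDefault("rs1:16020", online, groups));
        System.out.println(canMoveFromDefault("rs3:16020", online, groups));
    }
}
```

Because the set is recomputed from whatever is online at the time of the call, a server that has restarted but not yet registered looks the same as a server that never existed.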

In any case, it seems the problem here is more about stabilizing the test 
itself, i.e. avoiding the race of moving an RS that is not yet online. 
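
One way to avoid that race is to poll until the server is reported online before attempting the move. The following is only a sketch under assumptions: `waitFor` is a hypothetical helper written against the standard library, and the condition passed to it stands in for whatever check the test uses to see that the master has registered the server.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of the attached patch.
final class WaitForOnlineSketch {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interrupt.
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (deadline - System.nanoTime() <= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a server that finishes registering about 50 ms from now.
        long onlineAt = System.nanoTime() + 50_000_000L;
        boolean online = waitFor(2_000, 10, () -> System.nanoTime() - onlineAt >= 0);
        System.out.println("server online before move: " + online);
    }
}
```

In the test, the move would be issued only after `waitFor` returns true, and a false return would fail the test with a clear timeout message instead of a ConstraintException.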

> IntegrationTestRSGroup#testRegionMove may fail due to region server not online
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-17306
>                 URL: https://issues.apache.org/jira/browse/HBASE-17306
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Priority: Minor
>         Attachments: 17306.v1.txt
>
>
> {code}
> 2016-12-13 05:26:57,965|INFO|MainThread|machine.py:145 - run()|2) 
> testRegionMove(org.apache.hadoop.hbase.rsgroup.IntegrationTestRSGroup)
> 2016-12-13 05:26:57,965|INFO|MainThread|machine.py:145 - 
> run()|org.apache.hadoop.hbase.constraint.ConstraintException: 
> org.apache.hadoop.hbase.constraint.ConstraintException: 
> Server ctr-e77-1481596162056-0240-01-000005.a.com:16020 is not an online 
> server in default group.
> 2016-12-13 05:26:57,966|INFO|MainThread|machine.py:145 - run()|at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveServers(RSGroupAdminServer.java:135)
> 2016-12-13 05:26:57,966|INFO|MainThread|machine.py:145 - run()|at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.moveServers(RSGroupAdminEndpoint.java:169)
> 2016-12-13 05:26:57,966|INFO|MainThread|machine.py:145 - run()|at 
> org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:11136)
> 2016-12-13 05:26:57,966|INFO|MainThread|machine.py:145 - run()|at 
> org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:679)
> 2016-12-13 05:26:57,966|INFO|MainThread|machine.py:145 - run()|at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2
> {code}
> Shortly before the test failure, the server was shut down:
> {code}
> 2016-12-13 05:21:25,428 INFO  
> [MASTER_SERVER_OPERATIONS-ctr-e77-1481596162056-0240-01-000008:20000-4] 
> handler.ServerShutdownHandler: Finished processing of shutdown of 
> ctr-e77-1481596162056-0240-01-000005.a.com,16020,1481606309159
> ...
> 2016-12-13 05:26:57,935 INFO  
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=20000] 
> master.ServerManager: Registering 
> server=ctr-e77-1481596162056-0240-01-000005.hwx.site,16020,1481606803303
> 2016-12-13 05:27:06,219 DEBUG [main-EventThread] 
> zookeeper.RegionServerTracker: Added tracking of RS 
> /hbase-secure/rs/ctr-e77-1481596162056-0240-01-000005.a.com,16020,1481606803303
> {code}
> The registration of the new server (start code 1481606803303) happened 
> shortly after the test failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
