[ 
https://issues.apache.org/jira/browse/HBASE-17570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-17570.
---------------------------
    Resolution: Duplicate

Fixed by HBASE-17350

> rsgroup server move can get stuck if unassigning fails
> ------------------------------------------------------
>
>                 Key: HBASE-17570
>                 URL: https://issues.apache.org/jira/browse/HBASE-17570
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: stack
>             Fix For: 2.0.0
>
>
> This is pretty easy to reproduce in a standalone setup on the master branch.
> The master branch has the 'fake' Master regionserver, which shows up as a
> regionserver in the rsgroup 'default' group. If I create a new group and then
> try moving servers to the new group, it will usually get stuck in the loop
> below and never break out (I have to kill the master).
> Looking at the code, RSGroupAdminServer#moveServers has a loop in it that
> will go on forever; there is no timeout and no maximum number of tries.
> Maybe we don't see this much in a 'real' cluster. Filing this issue in the
> meantime because the move needs to stop retrying at some point and fail.
> {code}
> 2017-01-30 21:34:46,340 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> rsgroup.RSGroupAdminServer: Unassigning 1 regions from server localhost:50143 
> for move to xx
> 2017-01-30 21:34:46,341 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStates: Transition {8ebaa5bd7a2e906429a7b91bb2bee333 state=OPEN, 
> ts=1485840806167, server=localhost,50143,1485840800161} to 
> {8ebaa5bd7a2e906429a7b91bb2bee333 state=PENDING_CLOSE, ts=1485840886341, 
> server=localhost,50143,1485840800161}
> 2017-01-30 21:34:46,341 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStateStore: Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=PENDING_CLOSE
> 2017-01-30 21:34:46,347 INFO  
> [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=50143] 
> regionserver.RSRpcServices: Close 8ebaa5bd7a2e906429a7b91bb2bee333 without 
> moving
> 2017-01-30 21:34:46,348 INFO  [RS_CLOSE_REGION-localhost:50143-0] 
> regionserver.HRegion: Flushing 1/1 column families, memstore=431 B
> 2017-01-30 21:34:46,406 INFO  [RS_CLOSE_REGION-localhost:50143-0] 
> regionserver.DefaultStoreFlusher: Flushed, sequenceid=7, memsize=431, 
> hasBloomFilter=true, into tmp file 
> file:/var/folders/d8/8lyxycpd129d4fj7lb684dwh0000gp/T/hbase-stack/hbase/data/hbase/rsgroup/8ebaa5bd7a2e906429a7b91bb2bee333/.tmp/m/999d93adf36b4406bb73dc64e0158a05
> 2017-01-30 21:34:46,422 INFO  [RS_CLOSE_REGION-localhost:50143-0] 
> regionserver.HStore: Added 
> file:/var/folders/d8/8lyxycpd129d4fj7lb684dwh0000gp/T/hbase-stack/hbase/data/hbase/rsgroup/8ebaa5bd7a2e906429a7b91bb2bee333/m/999d93adf36b4406bb73dc64e0158a05,
>  entries=2, sequenceid=7, filesize=4.9 K
> 2017-01-30 21:34:46,422 INFO  [RS_CLOSE_REGION-localhost:50143-0] 
> regionserver.HRegion: Finished memstore flush of ~431 B/431, currentsize=0 
> B/0 for region hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. 
> in 74ms, sequenceid=7, compaction requested=false
> 2017-01-30 21:34:46,425 INFO  
> [StoreCloserThread-hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333.-1]
>  regionserver.HStore: Closed m
> 2017-01-30 21:34:46,437 INFO  [RS_CLOSE_REGION-localhost:50143-0] 
> regionserver.HRegion: Closed 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333.
> 2017-01-30 21:34:46,440 INFO  
> [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=50141] 
> master.RegionStates: Transition {8ebaa5bd7a2e906429a7b91bb2bee333 
> state=PENDING_CLOSE, ts=1485840886341, server=localhost,50143,1485840800161} 
> to {8ebaa5bd7a2e906429a7b91bb2bee333 state=CLOSED, ts=1485840886440, 
> server=localhost,50143,1485840800161}
> 2017-01-30 21:34:46,440 INFO  
> [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=50141] 
> master.RegionStateStore: Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=CLOSED
> 2017-01-30 21:34:46,442 WARN  [AM.-pool3-t1] balancer.BaseLoadBalancer: 
> Wanted to do retain assignment but no servers to assign to
> 2017-01-30 21:34:46,442 WARN  [AM.-pool3-t1] master.AssignmentManager: Can't 
> find a destination for 8ebaa5bd7a2e906429a7b91bb2bee333
> 2017-01-30 21:34:46,442 WARN  [AM.-pool3-t1] master.AssignmentManager: Unable 
> to determine a plan to assign {ENCODED => 8ebaa5bd7a2e906429a7b91bb2bee333, 
> NAME => 'hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333.', 
> STARTKEY => '', ENDKEY => ''}
> 2017-01-30 21:34:46,442 WARN  [AM.-pool3-t1] master.RegionStates: Failed to 
> open/close 8ebaa5bd7a2e906429a7b91bb2bee333 on localhost,50143,1485840800161, 
> set to FAILED_OPEN
> 2017-01-30 21:34:46,442 INFO  [AM.-pool3-t1] master.RegionStates: Transition 
> {8ebaa5bd7a2e906429a7b91bb2bee333 state=CLOSED, ts=1485840886440, 
> server=localhost,50143,1485840800161} to {8ebaa5bd7a2e906429a7b91bb2bee333 
> state=FAILED_OPEN, ts=1485840886442, server=localhost,50143,1485840800161}
> 2017-01-30 21:34:46,442 INFO  [AM.-pool3-t1] master.RegionStateStore: 
> Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=FAILED_OPEN
> 2017-01-30 21:34:46,990 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
> server.NIOServerCnxnFactory: Accepted socket connection from 
> /0:0:0:0:0:0:0:1:50272
> 2017-01-30 21:34:46,990 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
> server.ZooKeeperServer: Refusing session request for client 
> /0:0:0:0:0:0:0:1:50272 as it has seen zxid 0x25e our last zxid is 0xae client 
> must try another server
> 2017-01-30 21:34:46,990 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
> server.NIOServerCnxn: Closed socket connection for client 
> /0:0:0:0:0:0:0:1:50272 (no session established for client)
> 2017-01-30 21:34:47,353 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> rsgroup.RSGroupAdminServer: Unassigning 2 regions from server localhost:50143 
> for move to xx
> 2017-01-30 21:34:47,353 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStates: Transition {8ebaa5bd7a2e906429a7b91bb2bee333 
> state=FAILED_OPEN, ts=1485840886442, server=localhost,50143,1485840800161} to 
> {8ebaa5bd7a2e906429a7b91bb2bee333 state=OFFLINE, ts=1485840887353, 
> server=localhost,50143,1485840800161}
> 2017-01-30 21:34:47,353 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStateStore: Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=OFFLINE
> 2017-01-30 21:34:47,355 WARN  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> balancer.BaseLoadBalancer: Wanted to do retain assignment but no servers to 
> assign to
> 2017-01-30 21:34:47,355 WARN  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.AssignmentManager: Can't find a destination for 
> 8ebaa5bd7a2e906429a7b91bb2bee333
> 2017-01-30 21:34:47,355 WARN  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.AssignmentManager: Unable to determine a plan to assign {ENCODED => 
> 8ebaa5bd7a2e906429a7b91bb2bee333, NAME => 
> 'hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333.', STARTKEY => 
> '', ENDKEY => ''}
> 2017-01-30 21:34:47,355 WARN  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStates: Failed to open/close 8ebaa5bd7a2e906429a7b91bb2bee333 on 
> localhost,50143,1485840800161, set to FAILED_OPEN
> 2017-01-30 21:34:47,355 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStates: Transition {8ebaa5bd7a2e906429a7b91bb2bee333 
> state=OFFLINE, ts=1485840887353, server=localhost,50143,1485840800161} to 
> {8ebaa5bd7a2e906429a7b91bb2bee333 state=FAILED_OPEN, ts=1485840887355, 
> server=localhost,50143,1485840800161}
> 2017-01-30 21:34:47,355 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStateStore: Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=FAILED_OPEN
> 2017-01-30 21:34:47,356 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStates: Transition {8ebaa5bd7a2e906429a7b91bb2bee333 
> state=FAILED_OPEN, ts=1485840887355, server=localhost,50143,1485840800161} to 
> {8ebaa5bd7a2e906429a7b91bb2bee333 state=OFFLINE, ts=1485840887356, 
> server=localhost,50143,1485840800161}
> 2017-01-30 21:34:47,356 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=29,queue=2,port=50141] 
> master.RegionStateStore: Updating hbase:meta row 
> hbase:rsgroup,,1485840805941.8ebaa5bd7a2e906429a7b91bb2bee333. with 
> state=OFFLINE
> {code}
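
For illustration only: the report above asks for the move to give up instead of retrying forever. A minimal sketch of a bounded retry loop follows, assuming a hypothetical helper (BoundedMoveRetry, drain, MoveFailedException are invented names, not HBase APIs) and a caller-supplied callback that makes one unassign pass and returns how many regions are still on the source server. It is not the change made in HBASE-17350.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration; not part of HBase. */
final class BoundedMoveRetry {

    /** Raised when regions are still on the source server after all limits are hit. */
    static final class MoveFailedException extends RuntimeException {
        MoveFailedException(String message) {
            super(message);
        }
    }

    /**
     * Runs unassignPass until it reports 0 remaining regions.
     * Gives up after maxAttempts passes or once timeoutMillis has elapsed.
     *
     * @return the number of passes that were needed
     */
    static int drain(IntSupplier unassignPass, int maxAttempts,
                     long timeoutMillis, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int remaining = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            remaining = unassignPass.getAsInt();
            if (remaining == 0) {
                return attempt; // every region has left the source server
            }
            if (System.nanoTime() - deadline >= 0) {
                break; // out of time; fail the move below
            }
            try {
                Thread.sleep(sleepMillis); // back off before the next pass
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new MoveFailedException("Interrupted while moving servers");
            }
        }
        throw new MoveFailedException(
            "Gave up with " + remaining + " region(s) still on the source server");
    }
}
```

With limits like these, a region that keeps bouncing between OFFLINE and FAILED_OPEN, as in the log above, would surface as a failed move to the caller rather than pinning an RPC handler until the master is killed.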



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
