[jira] [Resolved] (HBASE-11282) Load balancer may move a region which is participating in snapshot

Ted Yu (JIRA) Tue, 03 Jun 2014 17:41:31 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ted Yu resolved HBASE-11282.
----------------------------

    Resolution: Later

> Load balancer may move a region which is participating in snapshot
> ------------------------------------------------------------------
>
>                 Key: HBASE-11282
>                 URL: https://issues.apache.org/jira/browse/HBASE-11282
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> The region was tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.
> From master log:
> {code}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] 
> master.AssignmentManager: Found an existing plan for 
> tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. destination 
> server is 
> h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812 
> accepted as a dest server = true
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] 
> master.AssignmentManager: Using pre-existing plan for 
> tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.; 
> plan=hri=tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7., 
> src=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165, 
> dest=h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> 2014-03-10 23:48:09,035 INFO  [AM.ZK.Worker-pool2-t42] master.RegionStates: 
> Transitioned {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=CLOSED, 
> ts=1394495289035, 
> server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165} to 
> {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=OFFLINE, ts=1394495289035, 
> server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] zookeeper.ZKAssign: 
> master:60000-0x244aa9920190b04, 
> quorum=h2-ubuntu12-sec-1394425849-hbase-8.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-1.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal:2181,
>  baseZNode=/hbase Creating (or updating) unassigned node 
> 289ebdee6adf0a3b9c2bbcbe2ff522e7 with OFFLINE state
> 2014-03-10 23:48:09,044 INFO  [AM.ZK.Worker-pool2-t42] 
> master.AssignmentManager: Assigning 
> tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. to 
> h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> {code}
> From hbase-hbase-regionserver-h2-ubuntu12-sec-1394425849-hbase-9.log :
> {code}
> 2014-03-10 23:48:08,487 WARN  [member: 
> 'h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165' 
> subprocedure-pool1-thread-1] snapshot.RegionServerSnapshotManager: Got 
> Exception in SnapshotSubprocedurePool
> java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hbase.NotServingRegionException: 
> tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:325)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
>   at 
> org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
>   at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.hbase.NotServingRegionException: 
> tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5699)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5663)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:65)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
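> The load balancer's move of the underlying region caused 
> FlushSnapshotSubprocedure to fail: the region was already closing on the 
> source server when the snapshot task tried to start its region operation.
> A mechanism that makes the load balancer aware of in-progress region 
> operations is desirable, so that a snapshot doesn't fail in the above scenario.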
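
One possible shape for such a mechanism is sketched below. This is only an illustration under stated assumptions, not HBase code and not a proposed patch: RegionOperationRegistry, MovePlan and filterPlans are hypothetical names invented for this example, and the sketch assumes the balancer can consult a shared view of regions that currently have a snapshot operation in flight before it acts on its move plans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotAwareBalancerSketch {

    /** Hypothetical registry of regions that have an operation (e.g. a snapshot flush) in flight. */
    static final class RegionOperationRegistry {
        private final Set<String> busy = ConcurrentHashMap.newKeySet();

        void begin(String encodedRegionName) { busy.add(encodedRegionName); }

        void end(String encodedRegionName) { busy.remove(encodedRegionName); }

        boolean isBusy(String encodedRegionName) { return busy.contains(encodedRegionName); }
    }

    /** Hypothetical stand-in for a balancer move plan: region, source server, destination server. */
    static final class MovePlan {
        final String region;
        final String src;
        final String dest;

        MovePlan(String region, String src, String dest) {
            this.region = region;
            this.src = src;
            this.dest = dest;
        }
    }

    /** Returns only the plans whose region is not currently marked busy in the registry. */
    static List<MovePlan> filterPlans(List<MovePlan> plans, RegionOperationRegistry registry) {
        List<MovePlan> allowed = new ArrayList<>();
        for (MovePlan plan : plans) {
            if (!registry.isBusy(plan.region)) {
                allowed.add(plan);
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        RegionOperationRegistry registry = new RegionOperationRegistry();
        // The snapshot marks the region before flushing it.
        registry.begin("289ebdee6adf0a3b9c2bbcbe2ff522e7");

        List<MovePlan> plans = Arrays.asList(
            new MovePlan("289ebdee6adf0a3b9c2bbcbe2ff522e7", "hbase-9", "hbase-4"),
            new MovePlan("some-other-region", "hbase-9", "hbase-4"));

        // While the snapshot is running, the plan for the busy region is held back.
        System.out.println("plans allowed during snapshot: " + filterPlans(plans, registry).size());

        // Once the snapshot finishes, the region becomes movable again.
        registry.end("289ebdee6adf0a3b9c2bbcbe2ff522e7");
        System.out.println("plans allowed after snapshot: " + filterPlans(plans, registry).size());
    }
}
```

The sketch defers a move rather than failing the snapshot. A real implementation would also have to close the race between the busy check and the start of the region close, and track more than one concurrent operation per region, neither of which this example attempts.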



--
This message was sent by Atlassian JIRA
(v6.2#6252)
