[ https://issues.apache.org/jira/browse/HBASE-10852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949949#comment-13949949 ]
Nick Dimiduk commented on HBASE-10852:
--------------------------------------
Simple enough, +1. Are 2 retries enough to weather a region in transition (RIT)? Maybe make it 5?
> TestDistributedLogSplitting#testDisallowWritesInRecovering occasionally fails
> -----------------------------------------------------------------------------
>
> Key: HBASE-10852
> URL: https://issues.apache.org/jira/browse/HBASE-10852
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Minor
> Fix For: 0.99.0
>
> Attachments: 10852-v1.txt
>
>
> Here was the failure:
> {code}
> java.lang.AssertionError: No RegionInRecoveryException. Following exceptions returned=[org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on c64-s12.cs1cloud.internal,52861,1395905929889
> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2676)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4095)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2826)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28857)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
> at java.lang.Thread.run(Thread.java:722)
> ]
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testDisallowWritesInRecovering(TestDistributedLogSplitting.java:924)
> {code}
> Here was the cause:
> {code}
> 2014-03-27 00:39:01,398 DEBUG [RS_OPEN_META-c64-s12:44281-0] handler.OpenRegionHandler(179): Opened hbase:meta,,1.1588230740 on c64-s12.cs1cloud.internal,44281,1395905929927
> 2014-03-27 00:39:01,405 DEBUG [Thread-2811-EventThread] zookeeper.ZooKeeperWatcher(310): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/region-in-transition/1588230740
> 2014-03-27 00:39:01,405 DEBUG [Thread-2811-EventThread] zookeeper.ZooKeeperWatcher(310): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/region-in-transition
> 2014-03-27 00:39:01,406 DEBUG [AM.ZK.Worker-pool1213-t19] zookeeper.ZKAssign(480): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Deleted unassigned node 1588230740 in expected state RS_ZK_REGION_OPENED
> 2014-03-27 00:39:01,406 DEBUG [AM.ZK.Worker-pool1213-t19] master.AssignmentManager$4(1186): Znode hbase:meta,,1.1588230740 deleted, state: {1588230740 state=OPEN, ts=1395905941397, server=c64-s12.cs1cloud.internal,44281,1395905929927}
> 2014-03-27 00:39:01,406 INFO [AM.ZK.Worker-pool1213-t19] master.RegionStates(413): Onlined 1588230740 on c64-s12.cs1cloud.internal,44281,1395905929927
> 2014-03-27 00:39:01,406 INFO [AM.ZK.Worker-pool1213-t19] master.RegionStates(417): Offlined 1588230740 from c64-s12.cs1cloud.internal,52861,1395905929889
> 2014-03-27 00:39:01,547 WARN [Thread-2811] client.ConnectionManager$HConnectionImplementation(1221): Encountered problems when prefetch hbase:meta table:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Thu Mar 27 00:39:01 PDT 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@23136717, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on c64-s12.cs1cloud.internal,52861,1395905929889
> {code}
> {code}
> hbase:meta was being moved from one region server to another, but the client did not retry (attempts=1), so it kept asking the old server.
> Thanks to Jeff who helped identify the issue.
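
To illustrate why a single attempt is not enough while a region is moving, here is a minimal, self-contained sketch. It does not use HBase and is not the attached patch: RetrySketch, RegionNotServingException, callWithRetries and movingRegion are hypothetical names made up for this example. In HBase itself the client retry count is normally controlled by the hbase.client.retries.number setting; the values below only mirror the numbers discussed in this thread.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

    /** Hypothetical stand-in for NotServingRegionException (not the HBase class). */
    public static class RegionNotServingException extends Exception {
        public RegionNotServingException(String message) {
            super(message);
        }
    }

    /**
     * Runs op up to maxAttempts times, pausing pauseMs between failed attempts.
     * Rethrows the last RegionNotServingException once attempts are exhausted.
     */
    public static <T> T callWithRetries(Callable<T> op, int maxAttempts, long pauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (RegionNotServingException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs); // give the region time to come online
                }
            }
        }
        throw last;
    }

    /** Simulates a region that is still in transition for the first `failures` calls. */
    public static Callable<String> movingRegion(int failures) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < failures) {
                throw new RegionNotServingException("region is not online");
            }
            return "row";
        };
    }

    public static void main(String[] args) throws Exception {
        // attempts=1: the first NotServingRegion-style error is final, as in the log.
        try {
            callWithRetries(movingRegion(2), 1, 10L);
        } catch (RegionNotServingException e) {
            System.out.println("attempts=1 failed: " + e.getMessage());
        }
        // attempts=5: the call survives two failures while the region moves.
        System.out.println("attempts=5 got: " + callWithRetries(movingRegion(2), 5, 10L));
    }
}
```

Whether 2 or 5 attempts is enough in the real test depends on how long hbase:meta stays in transition relative to the client pause, which this sketch does not model.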