[
https://issues.apache.org/jira/browse/HBASE-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587100#comment-14587100
]
Mikhail Antonov commented on HBASE-13904:
-----------------------------------------
Here's the relevant piece of the log (this thread doesn't fail):
{code}
2015-06-15 15:58:21,291 WARN [RunAmJoinCluster] master.RegionStates(339):
Tried to create a state for a region already in RegionStates, used existing:
{e87bbed599bed2950d75b6f7528c0349 state=FAILED_CLOSE, ts=1434409101284,
server=example.org,1234,5678}, ignored new: OPEN
2015-06-15 15:58:21,292 DEBUG [RunAmJoinCluster] master.AssignmentManager(540):
Found {ENCODED => e87bbed599bed2950d75b6f7528c0349, NAME =>
't,,1434409036078.e87bbed599bed2950d75b6f7528c0349.', STARTKEY => '', ENDKEY =>
''} out on cluster
2015-06-15 15:58:21,292 INFO [RunAmJoinCluster] master.AssignmentManager(623):
Found regions out on cluster or in RIT; presuming failover
2015-06-15 15:58:21,293 INFO [RunAmJoinCluster] master.AssignmentManager(767):
Processing e87bbed599bed2950d75b6f7528c0349 in state: M_ZK_REGION_CLOSING
2015-06-15 15:58:21,293 INFO [RunAmJoinCluster] master.RegionStates(1112):
Transition {e87bbed599bed2950d75b6f7528c0349 state=FAILED_CLOSE,
ts=1434409101284, server=example.org,1234,5678} to
{e87bbed599bed2950d75b6f7528c0349 state=CLOSING, ts=1434409101293,
server=example.org,1234,5678}
2015-06-15 15:58:21,294 INFO [Thread-121] master.RegionStates(1112):
Transition {e87bbed599bed2950d75b6f7528c0349 state=FAILED_CLOSE,
ts=1434409101284, server=example.org,1234,5678} to
{e87bbed599bed2950d75b6f7528c0349 state=PENDING_CLOSE, ts=1434409101294,
server=example.org,1234,5678}
2015-06-15 15:58:21,294 INFO [RunAmJoinCluster] master.AssignmentManager(898):
Processed region e87bbed599bed2950d75b6f7528c0349 in state M_ZK_REGION_CLOSING,
on server: example.org,1234,5678
2015-06-15 15:58:21,294 DEBUG [Thread-121] zookeeper.ZKAssign(805):
mockedServer-0x14df970ebdc0018, quorum=localhost:61627, baseZNode=/hbase
Transitioning e87bbed599bed2950d75b6f7528c0349 from M_ZK_REGION_CLOSING to
RS_ZK_REGION_CLOSED
2015-06-15 15:58:21,295 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,295 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,296 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,296 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,296 DEBUG [main-EventThread]
zookeeper.ZooKeeperWatcher(508): mockedServer-0x14df970ebdc0018,
quorum=localhost:61627, baseZNode=/hbase Received ZooKeeper Event,
type=NodeDataChanged, state=SyncConnected,
path=/hbase/region-in-transition/e87bbed599bed2950d75b6f7528c0349
2015-06-15 15:58:21,297 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,296 DEBUG [Thread-121] zookeeper.ZKAssign(880):
mockedServer-0x14df970ebdc0018, quorum=localhost:61627, baseZNode=/hbase
Transitioned node e87bbed599bed2950d75b6f7528c0349 from M_ZK_REGION_CLOSING to
RS_ZK_REGION_CLOSED
2015-06-15 15:58:21,297 WARN [MASTER_SERVER_OPERATIONS-mockedAMExecutor-0]
master.AssignmentManager(1866): Server example.org,1234,5678 region CLOSE RPC
returned false for t,,1434409036078.e87bbed599bed2950d75b6f7528c0349.
2015-06-15 15:58:21,297 INFO [RunAmJoinCluster] master.AssignmentManager(499):
Joined the cluster in 12ms, failover=true
{code}
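One thing the excerpt above shows is that two threads (RunAmJoinCluster and Thread-121) each log a RegionStates transition out of the same FAILED_CLOSE state within a millisecond of each other. To pull that pattern out of a longer log, a small script along these lines might help. It is only a sketch, not part of HBase: the function names and regular expressions are my own assumptions about the log layout shown here, and a hit only flags something worth inspecting rather than proving a bug.

```python
import re
from collections import defaultdict

# Start of a log record: timestamp, level, [thread], then the rest.
# Assumes the layout seen in the excerpt; other log4j patterns will differ.
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+\[([^\]]+)\]\s*(.*)$"
)
# A RegionStates transition, once wrapped lines have been joined.
TRANSITION = re.compile(
    r"Transition \{(\w+) state=(\w+),.*?\} to \{\w+ state=(\w+),"
)


def join_records(lines):
    """Merge wrapped continuation lines back into one string per record."""
    records = []
    for line in lines:
        line = line.strip()
        if RECORD_START.match(line):
            records.append(line)
        elif records and line:
            records[-1] += " " + line
    return records


def transitions_by_region(lines):
    """Return {region: [(thread, from_state, to_state), ...]} in log order."""
    result = defaultdict(list)
    for record in join_records(lines):
        header = RECORD_START.match(record)
        found = TRANSITION.search(record)
        if header and found:
            region, old, new = found.groups()
            result[region].append((header.group(3), old, new))
    return dict(result)


def conflicting_sources(transitions):
    """Regions where more than one thread transitioned out of the same state."""
    conflicts = {}
    for region, steps in transitions.items():
        threads_by_state = defaultdict(set)
        for thread, old, _new in steps:
            threads_by_state[old].add(thread)
        shared = {s: sorted(t) for s, t in threads_by_state.items() if len(t) > 1}
        if shared:
            conflicts[region] = shared
    return conflicts


# Two records copied from the excerpt above, including the line wrapping.
SAMPLE = """\
2015-06-15 15:58:21,293 INFO [RunAmJoinCluster] master.RegionStates(1112):
Transition {e87bbed599bed2950d75b6f7528c0349 state=FAILED_CLOSE,
ts=1434409101284, server=example.org,1234,5678} to
{e87bbed599bed2950d75b6f7528c0349 state=CLOSING, ts=1434409101293,
server=example.org,1234,5678}
2015-06-15 15:58:21,294 INFO [Thread-121] master.RegionStates(1112):
Transition {e87bbed599bed2950d75b6f7528c0349 state=FAILED_CLOSE,
ts=1434409101284, server=example.org,1234,5678} to
{e87bbed599bed2950d75b6f7528c0349 state=PENDING_CLOSE, ts=1434409101294,
server=example.org,1234,5678}
""".splitlines()

if __name__ == "__main__":
    for region, shared in conflicting_sources(transitions_by_region(SAMPLE)).items():
        print(region, shared)
```

To run it against a full test log, pass the file's lines to `transitions_by_region` instead of `SAMPLE`, for example the lines of the attached TestAssignmentManager output file.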
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode
> failing consistently on branch-1.1
> -----------------------------------------------------------------------------------------------------------
>
> Key: HBASE-13904
> URL: https://issues.apache.org/jira/browse/HBASE-13904
> Project: HBase
> Issue Type: Bug
> Components: master, Region Assignment, test
> Affects Versions: 1.1.1
> Reporter: Nick Dimiduk
> Assignee: Mikhail Antonov
> Priority: Critical
> Fix For: 1.1.1
>
> Attachments: HBASE-13904-mantonov_running_whole_class.txt,
> org.apache.hadoop.hbase.master.TestAssignmentManager-output.txt
>
>
> {noformat}
> $ JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79.x86_64
> ../apache-maven-3.3.3/bin/mvn -PrunAllTests -DreuseForks=false clean install
> -Dmaven.test.redirectTestOutputToFile=true
> -Dsurefire.rerunFailingTestsCount=4 -Dit.test=noItTest
> ...
> Tests in error:
> org.apache.hadoop.hbase.master.TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode(org.apache.hadoop.hbase.master.TestAssignmentManager)
> Run 1:
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:368 »
> Run 2:
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 »
> Run 3:
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 »
> Run 4:
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 »
> Run 5:
> TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 »
> {noformat}
> {noformat}
> -------------------------------------------------------------------------------
> Test set: org.apache.hadoop.hbase.master.TestAssignmentManager
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 393.384 sec
> <<< FAILURE! - in org.apache.hadoop.hbase.master.TestAssignmentManager
> testBalanceOnMasterFailoverScenarioWithOfflineNode(org.apache.hadoop.hbase.master.TestAssignmentManager)
> Time elapsed: 57.873 sec <<< ERROR!
> java.lang.Exception: test timed out after 60000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.hbase.master.TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode(TestAssignmentManager.java:335)
> {noformat}