[ 
https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894153#comment-13894153
 ] 

Hadoop QA commented on HBASE-10482:
-----------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12627520/HBASE-10249.patch
  against trunk revision .
  ATTACHMENT ID: 12627520

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

    {color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

    {color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

    {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8620//console

This message is automatically generated.

> ReplicationSyncUp doesn't clean up its ZK, needed for tests
> -----------------------------------------------------------
>
>                 Key: HBASE-10482
>                 URL: https://issues.apache.org/jira/browse/HBASE-10482
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.96.1, 0.94.16
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17
>
>         Attachments: HBASE-10249.patch
>
>
> TestReplicationSyncUpTool failed again:
> https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/
> It's not super obvious why only one of the two tables is replicated, and the 
> test could use some more logging, but I understand it this way:
> The first ReplicationSyncUp gets started and for some reason it cannot 
> replicate the data:
> {noformat}
> 2014-02-06 21:32:19,811 INFO  [Thread-1372] 
> regionserver.ReplicationSourceManager(203): Current list of replicators: 
> [1391722339091.SyncUpTool.replication.org,1234,1, 
> quirinus.apache.org,37045,1391722237951, 
> quirinus.apache.org,33502,1391722238125] other RSs: []
> 2014-02-06 21:32:19,811 INFO  [Thread-1372.replicationSource,1] 
> regionserver.ReplicationSource(231): Replicating 
> db42e7fc-7f29-4038-9292-d85ea8b9994b -> 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817
> 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1
> 2014-02-06 21:32:19,911 TRACE [Thread-1372.replicationSource,1] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1
> 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2
> ...
> 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8
> 2014-02-06 21:32:23,673 INFO  [ReplicationExecutor-0] 
> replication.ReplicationQueuesZKImpl(169): Moving 
> quirinus.apache.org,37045,1391722237951's hlogs to my queue
> 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] 
> replication.ReplicationQueuesZKImpl(396): Creating 
> quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803
> 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] 
> replication.ReplicationQueuesZKImpl(396): Creating 
> quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803
> 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9
> 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] 
> regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9
> {noformat}
> Finally it gives up:
> {noformat}
> 2014-02-06 21:32:30,873 DEBUG [Thread-1372] 
> replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry 
> = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200
> {noformat}
> The syncUp tool has an ID you can follow: grep for 
> syncupReplication1391722338885 or just the timestamp, and you can see it 
> still doing things after that. The reason is that the tool closes the 
> ReplicationSourceManager but not the ZK connection, so events _still_ come in 
> and NodeFailoverWorker _still_ tries to recover queues, but then there's 
> nothing left to process them.
> Later in the logs you can see:
> {noformat}
> 2014-02-06 21:32:37,381 INFO  [ReplicationExecutor-0] 
> replication.ReplicationQueuesZKImpl(169): Moving 
> quirinus.apache.org,33502,1391722238125's hlogs to my queue
> 2014-02-06 21:32:37,567 INFO  [ReplicationExecutor-0] 
> replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another 
> RS took care of it because of: KeeperErrorCode = NoNode for 
> /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock
> {noformat}
> There shouldn't be any racing, but now someone has already moved 
> "quirinus.apache.org,33502,1391722238125" away.
> FWIW I can't even make the test fail on my machine so I'm not 100% sure 
> closing the ZK connection fixes the issue, but at least it's the right thing 
> to do.
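
The failure mode described above (the manager is closed but the ZK connection stays open, so late events still claim queues that nothing will process) can be illustrated with a minimal sketch. This is not the attached patch and does not use the HBase or ZooKeeper APIs: FakeZkConnection, FakeSourceManager, runTool and the "rs-gone" event are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SyncUpCleanupSketch {

    /** Hypothetical stand-in for a ZK connection that pushes events to listeners. */
    static final class FakeZkConnection implements AutoCloseable {
        private final List<Consumer<String>> listeners = new ArrayList<>();
        private boolean open = true;

        void register(Consumer<String> listener) {
            listeners.add(listener);
        }

        /** Delivers the event only while the connection is open. */
        boolean fire(String event) {
            if (!open) {
                return false;
            }
            for (Consumer<String> listener : listeners) {
                listener.accept(event);
            }
            return true;
        }

        @Override
        public void close() {
            open = false;
            listeners.clear();
        }
    }

    /** Hypothetical stand-in for the component that processes recovered queues. */
    static final class FakeSourceManager implements AutoCloseable {
        private boolean running = true;
        final List<String> orphanedQueues = new ArrayList<>();

        /** Called when a server disappears; a stopped manager can only orphan the queue. */
        void onServerGone(String server) {
            if (!running) {
                orphanedQueues.add(server);
            }
        }

        @Override
        public void close() {
            running = false;
        }
    }

    /**
     * Runs a pretend sync-up and then fires one late event.
     * closeZk=false models the reported behaviour, closeZk=true the proposed cleanup.
     * Returns the number of queues claimed after shutdown with nothing to process them.
     */
    public static int runTool(boolean closeZk) {
        FakeZkConnection zk = new FakeZkConnection();
        FakeSourceManager manager = new FakeSourceManager();
        zk.register(manager::onServerGone);
        try {
            // The actual sync-up work would happen here.
        } finally {
            manager.close();
            if (closeZk) {
                zk.close();
            }
        }
        zk.fire("rs-gone"); // an event arriving after the tool has finished
        return manager.orphanedQueues.size();
    }

    public static void main(String[] args) {
        System.out.println("orphaned without closing ZK: " + runTool(false));
        System.out.println("orphaned when closing ZK:    " + runTool(true));
    }
}
```

In this sketch the late event only reaches the stopped manager when the connection is left open, which matches the reasoning in the description; whether closing the connection fully fixes the real test failure remains unconfirmed, as the reporter notes.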



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
