[ https://issues.apache.org/jira/browse/HBASE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890518#action_12890518 ]
HBase Review Board commented on HBASE-2858:
-------------------------------------------
Message from: "Jean-Daniel Cryans" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/349/
-----------------------------------------------------------
Review request for hbase.
Summary
-------
This patch fixes the ZKW.listZNodes issue and cleans up the path handling in
ReplicationSource a bit by removing a lock and adding logic to figure out
where the log was moved. The test now passes 100% of the time for me (up from
50%).
There's one open issue, as outlined by the two TODOs: what happens if a log is
missing from HDFS? When the queue is recovered, it could mean that HDFS was
cleared but not ZK, but during normal operations it would point to a bug.
Report and continue?
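To make the open question concrete, here is a minimal sketch of the "report and continue" option. It is not the code in the patch. It uses java.nio on a local directory as a stand-in for HDFS, and the class name, method names, and the "logs"/"oldlogs" directory names are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: local files stand in for HDFS, names are illustrative.
public class LogLocatorSketch {

    /** Returns the first candidate directory entry that holds logName, if any. */
    static Optional<Path> locate(String logName, List<Path> candidateDirs) {
        for (Path dir : candidateDirs) {
            Path candidate = dir.resolve(logName);
            if (Files.exists(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** Walks the queue; a missing log is reported and skipped. Returns the missing names. */
    static List<String> drain(List<String> queue, List<Path> candidateDirs) {
        List<String> missing = new ArrayList<>();
        for (String logName : queue) {
            Optional<Path> found = locate(logName, candidateDirs);
            if (found.isPresent()) {
                // A real source would read and ship the edits from found.get() here.
            } else {
                // "Report and continue": log a warning and move to the next entry.
                System.err.println("WARN: log " + logName + " not found, skipping");
                missing.add(logName);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("logsketch");
        Path live = Files.createDirectory(root.resolve("logs"));
        Path old = Files.createDirectory(root.resolve("oldlogs"));
        Files.createFile(live.resolve("hlog.1"));
        Files.createFile(old.resolve("hlog.2")); // pretend this one was already moved
        List<String> missing = drain(Arrays.asList("hlog.1", "hlog.2", "hlog.3"),
                                     Arrays.asList(live, old));
        System.out.println("missing: " + missing);
    }
}
```

The alternative would be to treat a missing log as fatal during normal operations and only skip it on queue recovery; the sketch does not distinguish the two cases.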
This addresses bug HBASE-2858.
http://issues.apache.org/jira/browse/HBASE-2858
Diffs
-----
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e6b365e
src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java a037aae
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 6b9dcb5
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java 2e13a0a
src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java e8dd268
src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java 163671f
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java bb09bc3
Diff: http://review.hbase.org/r/349/diff
Testing
-------
Unit testing.
Thanks,
Jean-Daniel
> TestReplication.queueFailover fails half the time
> -------------------------------------------------
>
> Key: HBASE-2858
> URL: https://issues.apache.org/jira/browse/HBASE-2858
> Project: HBase
> Issue Type: Bug
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.90.0
>
>
> TestReplication.queueFailover fails 50% of the time because
> ZooKeeperWrapper.listZnodes (introduced in HBASE-2694 and missed by
> HBASE-2735) doesn't use the Watcher it's passed, so ReplicationSource
> sometimes misses hlogs to replicate for the region server we kill.
> Fixing that uncovered a second issue: RepSource ignores log files too
> quickly when the master is a bit too slow to split logs.
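The class of bug described above (a list call that accepts a watcher but never registers it) can be illustrated with a toy model. This is only a sketch: ToyStore and ChildWatcher are hypothetical stand-ins, not the ZooKeeper or ZooKeeperWrapper API, and the one-shot watch semantics are an assumption of the model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory store, not the real ZooKeeper client.
public class WatcherSketch {

    /** Stand-in for a watcher callback. */
    interface ChildWatcher {
        void childrenChanged(String path);
    }

    /** Toy store with one-shot child watches. */
    static class ToyStore {
        private final Map<String, List<String>> children = new HashMap<>();
        private final Map<String, List<ChildWatcher>> watches = new HashMap<>();

        /** Buggy variant: takes a watcher but silently drops it. */
        List<String> listChildrenIgnoringWatcher(String path, ChildWatcher watcher) {
            return new ArrayList<>(children.getOrDefault(path, new ArrayList<>()));
        }

        /** Fixed variant: registers the watcher, then returns a snapshot. */
        List<String> listChildren(String path, ChildWatcher watcher) {
            if (watcher != null) {
                watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
            }
            return new ArrayList<>(children.getOrDefault(path, new ArrayList<>()));
        }

        /** Adds a child and fires (then clears) any watches set on the parent. */
        void addChild(String path, String child) {
            children.computeIfAbsent(path, k -> new ArrayList<>()).add(child);
            List<ChildWatcher> toFire = watches.remove(path);
            if (toFire != null) {
                for (ChildWatcher w : toFire) {
                    w.childrenChanged(path);
                }
            }
        }
    }

    /** Counts notifications a caller receives when a child is added after listing. */
    static int notificationsAfterAdd(boolean useFixedVariant) {
        ToyStore store = new ToyStore();
        int[] count = {0};
        ChildWatcher watcher = path -> count[0]++;
        if (useFixedVariant) {
            store.listChildren("/rs/dead", watcher);
        } else {
            store.listChildrenIgnoringWatcher("/rs/dead", watcher);
        }
        store.addChild("/rs/dead", "hlog.1"); // a log shows up after the listing
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("buggy variant notifications: " + notificationsAfterAdd(false));
        System.out.println("fixed variant notifications: " + notificationsAfterAdd(true));
    }
}
```

With the buggy variant the caller only ever sees the snapshot it got back, so whether a later change is noticed depends on timing, which would be consistent with a test that fails intermittently.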