Liu Shaohui created HBASE-12865:
-----------------------------------

             Summary: WALs may be deleted before they are replicated to peers
                 Key: HBASE-12865
                 URL: https://issues.apache.org/jira/browse/HBASE-12865
             Project: HBase
          Issue Type: Bug
            Reporter: Liu Shaohui


By design, the ReplicationLogCleaner guarantees that WALs still in a
replication queue cannot be deleted by the HMaster. The ReplicationLogCleaner
gets the WAL set from ZooKeeper by scanning the replication znode. However, it
may get an incomplete WAL set during replication failover, because the scan
operation is not atomic.

For example, suppose there are three region servers (rs1, rs2, rs3) and one
peer with id 10. The layout of the replication ZooKeeper nodes is:
{code}
/hbase/replication/rs/rs1/10/wals
                     /rs2/10/wals
                     /rs3/10/wals
{code}
- t1: the ReplicationLogCleaner finishes scanning the replication queue of rs1
and starts to scan the queue of rs2.
- t2: region server rs3 goes down, and rs1 takes over rs3's replication queue.
The new layout is:

{code}
/hbase/replication/rs/rs1/10/wals
                     /rs1/10-rs3/wals
                     /rs2/10/wals
                     /rs3
{code}
- t3: the ReplicationLogCleaner finishes scanning the queue of rs2 and starts
to scan the node of rs3. But the queue has already been moved to
"/hbase/replication/rs/rs1/10-rs3/wals".

So the ReplicationLogCleaner will miss the WALs of rs3 for peer 10, and the
HMaster may delete these WALs before they are replicated to the peer clusters.
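
The interleaving above can be reproduced with a small toy model. This is a
sketch in Python, not HBase code: the in-memory dictionary standing in for the
znode tree, the WAL names, and the functions make_tree, fail_over and scan are
all hypothetical and only mirror the t1/t2/t3 steps described here.

```python
# Toy model of the replication znode tree:
#   {region_server: {queue_id: set_of_wal_names}}
# All names below are hypothetical; none of this is an HBase or ZooKeeper API.

def make_tree():
    return {
        "rs1": {"10": {"wal-rs1-a"}},
        "rs2": {"10": {"wal-rs2-a"}},
        "rs3": {"10": {"wal-rs3-a", "wal-rs3-b"}},
    }

def fail_over(tree, dead, survivor):
    """Move every queue of `dead` under `survivor` as '<queue_id>-<dead>'."""
    for queue_id, wals in tree[dead].items():
        tree[survivor][f"{queue_id}-{dead}"] = wals
    tree[dead] = {}  # the node of the dead server remains, but holds no queues

def scan(tree, servers, before_server=None):
    """Non-atomic scan: visit the region servers one at a time."""
    seen = set()
    for rs in servers:
        if before_server is not None:
            before_server(rs)  # lets a concurrent change happen mid-scan
        for wals in tree[rs].values():
            seen |= wals
    return seen

tree = make_tree()
servers = sorted(tree)  # the cleaner lists the servers once: rs1, rs2, rs3

def rs3_dies_while_scanning_rs2(rs):
    # t2: rs1 was already scanned when rs3 goes down and rs1 claims its queue
    if rs == "rs2":
        fail_over(tree, dead="rs3", survivor="rs1")

seen = scan(tree, servers, rs3_dies_while_scanning_rs2)
still_queued = scan(tree, sorted(tree))  # a later scan with no concurrent change
missed = still_queued - seen
print(sorted(missed))  # ['wal-rs3-a', 'wal-rs3-b']
```

In this model the two WALs of rs3 are still queued under rs1/10-rs3, but the
first scan never sees them, so a cleaner relying on that scan would treat them
as deletable.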

We encountered this problem in our cluster and I think it's a serious bug for 
replication.

Suggestions for fixing this bug are welcome. Thanks.



