[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiafu Jiang updated ZOOKEEPER-3231:
-----------------------------------
    Description: 
I read the ZooKeeper source code and found that the purge task uses 
FileTxnSnapLog#findNRecentSnapshots to find snapshots, but that method does not 
check whether the snapshots are valid.

Consider the worst case: a ZooKeeper server may have many invalid snapshots. 
When a purge task begins, it uses the zxid in the last snapshot's name to 
purge old snapshots and transaction logs, so we may lose data. 

I think we should use FileSnap#findNValidSnapshots(int) instead of 
FileSnap#findNRecentSnapshots in FileTxnSnapLog#findNRecentSnapshots, but I am 
not sure.

 

  was:
I read the ZooKeeper source code and found that the purge task uses 
FileTxnSnapLog#findNRecentSnapshots to find snapshots, but that method does not 
check whether the snapshots are valid.

Consider the worst case: a ZooKeeper server may have many invalid snapshots. 
When a purge task begins, it uses the zxid in the last snapshot's name to 
purge old snapshots and transaction logs, so we may lose data. 

I think we should use FileSnap#findNValidSnapshots(int) instead of 
FileSnap#findNRecentSnapshots in FileTxnSnapLog#findNRecentSnapshots. I am not 
sure.

 


>  Purge task may lose data when we have many invalid snapshots.
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3231
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3231
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.4, 3.4.13
>            Reporter: Jiafu Jiang
>            Priority: Major
>
> I read the ZooKeeper source code and found that the purge task uses 
> FileTxnSnapLog#findNRecentSnapshots to find snapshots, but that method does 
> not check whether the snapshots are valid.
> Consider the worst case: a ZooKeeper server may have many invalid snapshots. 
> When a purge task begins, it uses the zxid in the last snapshot's name to 
> purge old snapshots and transaction logs, so we may lose data. 
> I think we should use FileSnap#findNValidSnapshots(int) instead of 
> FileSnap#findNRecentSnapshots in FileTxnSnapLog#findNRecentSnapshots, but I 
> am not sure.
>  
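
To make the concern in the description concrete, here is a minimal sketch of
how the choice of retained snapshots could move the purge boundary. It is not
the ZooKeeper implementation: the Snap class, the validity flag, and the
methods nRecent, nValid, purgeBoundary, and sample are hypothetical stand-ins,
and the real purge logic has more details (for example, how transaction logs
are matched to snapshots) that are left out here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PurgeSketch {
    // Hypothetical stand-in for a snapshot file: the zxid taken from its
    // file name, plus a flag saying whether its contents are valid.
    static final class Snap {
        final long zxid;
        final boolean valid;
        Snap(long zxid, boolean valid) { this.zxid = zxid; this.valid = valid; }
    }

    // Returns a copy of the snapshots ordered from newest to oldest zxid.
    static List<Snap> newestFirst(List<Snap> snaps) {
        List<Snap> sorted = new ArrayList<>(snaps);
        sorted.sort(Comparator.comparingLong((Snap s) -> s.zxid).reversed());
        return sorted;
    }

    // Selection by recency only: the newest n snapshots, valid or not.
    static List<Snap> nRecent(List<Snap> snaps, int n) {
        List<Snap> sorted = newestFirst(snaps);
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Selection with a validity check: the newest n snapshots that are valid.
    static List<Snap> nValid(List<Snap> snaps, int n) {
        List<Snap> kept = new ArrayList<>();
        for (Snap s : newestFirst(snaps)) {
            if (kept.size() == n) break;
            if (s.valid) kept.add(s);
        }
        return kept;
    }

    // Assumed purge rule for this sketch: everything with a zxid below the
    // oldest retained snapshot is eligible for deletion. Returns 0 when
    // nothing is retained, so nothing is purged.
    static long purgeBoundary(List<Snap> retained) {
        return retained.stream().mapToLong(s -> s.zxid).min().orElse(0L);
    }

    // Two older valid snapshots followed by three newer invalid ones.
    static List<Snap> sample() {
        List<Snap> snaps = new ArrayList<>();
        snaps.add(new Snap(0x100L, true));
        snaps.add(new Snap(0x200L, true));
        snaps.add(new Snap(0x300L, false));
        snaps.add(new Snap(0x400L, false));
        snaps.add(new Snap(0x500L, false));
        return snaps;
    }

    public static void main(String[] args) {
        List<Snap> snaps = sample();
        System.out.println("boundary (recent): 0x"
                + Long.toHexString(purgeBoundary(nRecent(snaps, 3))));
        System.out.println("boundary (valid):  0x"
                + Long.toHexString(purgeBoundary(nValid(snaps, 3))));
    }
}
```

In this sketch, keeping the three most recent snapshots retains only invalid
ones and puts the boundary above both valid snapshots, so they would be
eligible for deletion. Selecting only valid snapshots keeps the boundary at
the oldest valid one instead.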



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
