[jira] [Commented] (ZOOKEEPER-2660) acceptedEpoch and currentEpoch data inconsistency, ZK process can not start!

2017-01-20 Thread Arshad Mohammad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832183#comment-15832183
 ] 

Arshad Mohammad commented on ZOOKEEPER-2660:


Thanks [~Yongcheng] for reporting this issue, though it is a duplicate of 
ZOOKEEPER-2660. I am closing it as a duplicate. Please feel free to reopen it if 
you disagree.

> acceptedEpoch and currentEpoch data inconsistency, ZK process can not start!
> 
>
> Key: ZOOKEEPER-2660
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2660
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.6, 3.4.9
> Environment: ZK: 3.4.9
> Reporter: Yongcheng Liu
>
> 1. If currentEpoch is greater than acceptedEpoch, ZK throws an IOException 
> when it starts loadDataBase.
> 2. Function bug: setAcceptedEpoch and setCurrentEpoch modify the in-memory 
> variable first and then write the epoch to file. If the file write fails, 
> the in-memory value has already been modified.
> Proposed solution, for example:
>   public void setAcceptedEpoch(long e) throws IOException {
>       acceptedEpoch = e;
>       writeLongToFile(ACCEPTED_EPOCH_FILENAME, e);
>   }
> needs to be changed to:
>   public void setAcceptedEpoch(long e) throws IOException {
>       writeLongToFile(ACCEPTED_EPOCH_FILENAME, e);
>       acceptedEpoch = e;
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2660) acceptedEpoch and currentEpoch data inconsistency, ZK process can not start!

2017-01-07 Thread Yongcheng Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807507#comment-15807507
 ] 

Yongcheng Liu commented on ZOOKEEPER-2660:
--

For example:
We have 3 ZK servers (zk1, zk2, zk3): zk1 is down, zk2 is a follower, and zk3 
is the leader. The currentEpoch and acceptedEpoch of zk2 and zk3 are 6. The 
currentEpoch and acceptedEpoch of zk1 are 5.
Then zk1 starts. When it becomes a follower, zk1 calls setAcceptedEpoch with 6, 
so acceptedEpoch in memory is already 6. But writeLongToFile fails, so 
acceptedEpoch in the file is still 5. An exception is thrown and zk1 returns to 
LOOKING. This time zk1 does not call setAcceptedEpoch, because its in-memory 
acceptedEpoch is the same as the leader's (zk3); zk1 does not know that 
acceptedEpoch in the file is still 5. If zk1 goes down again, it will never 
start up, because in the file acceptedEpoch is 5 and currentEpoch is 6.
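
The ordering problem can be reproduced in isolation. The following is only a minimal sketch, not the QuorumPeer code: FakeEpochFile, Peer, and consistentAfterFailedWrite are hypothetical names made up for illustration, and the file is simulated in memory so that a write failure can be forced. It compares the two orderings after one failed write of epoch 6 on a peer whose stored epoch is 5.

```java
import java.io.IOException;

public class EpochOrderingSketch {

    /** Hypothetical stand-in for the acceptedEpoch file; can be told to fail once. */
    static class FakeEpochFile {
        long stored;
        boolean failNextWrite;

        FakeEpochFile(long initial) {
            this.stored = initial;
        }

        void write(long value) throws IOException {
            if (failNextWrite) {
                failNextWrite = false;
                throw new IOException("simulated write failure");
            }
            stored = value;
        }
    }

    /** Hypothetical peer holding an in-memory epoch plus its on-disk copy. */
    static class Peer {
        final FakeEpochFile acceptedFile;
        final boolean writeFirst;
        long acceptedEpoch;

        Peer(long epoch, boolean writeFirst) {
            this.acceptedFile = new FakeEpochFile(epoch);
            this.acceptedEpoch = epoch;
            this.writeFirst = writeFirst;
        }

        void setAcceptedEpoch(long e) throws IOException {
            if (writeFirst) {
                // Proposed order: memory changes only after the write succeeded.
                acceptedFile.write(e);
                acceptedEpoch = e;
            } else {
                // Reported order: memory is already changed when the write fails.
                acceptedEpoch = e;
                acceptedFile.write(e);
            }
        }
    }

    /** Returns true if memory and file still agree after one failed write. */
    static boolean consistentAfterFailedWrite(boolean writeFirst) {
        Peer zk1 = new Peer(5, writeFirst);
        zk1.acceptedFile.failNextWrite = true;
        try {
            zk1.setAcceptedEpoch(6);
        } catch (IOException expected) {
            // In the scenario above, zk1 would return to LOOKING here.
        }
        return zk1.acceptedEpoch == zk1.acceptedFile.stored;
    }

    public static void main(String[] args) {
        System.out.println("memory first: consistent = " + consistentAfterFailedWrite(false));
        System.out.println("file first:   consistent = " + consistentAfterFailedWrite(true));
    }
}
```

With the memory-first order the peer ends up with 6 in memory and 5 in the simulated file; with the file-first order both stay at 5, so a later retry would attempt the write again.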



