[jira] [Commented] (ZOOKEEPER-1177) Enabling a large number of watches for a large number of clients

2024-04-26 Thread zhangzhisheng (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841060#comment-17841060
 ] 

zhangzhisheng commented on ZOOKEEPER-1177:
--

(y)

> Enabling a large number of watches for a large number of clients
> 
>
> Key: ZOOKEEPER-1177
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1177
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-1177.patch, ZOOKEEPER-1177.patch, 
> ZooKeeper-with-fix-for-findbugs-warning.patch, ZooKeeper.patch, 
> Zookeeper-after-resolving-merge-conflicts.patch
>
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> In my ZooKeeper deployment, I see the watch manager consuming several GB of 
> memory, so I dug a bit deeper.
> In the scenario I am testing, I have 10K clients connected to an observer. 
> There are about 20K znodes in ZooKeeper, each about 1 KB, so about 20 MB of 
> data in total.
> Each client fetches and puts watches on all the znodes. That is 200 million 
> watches.
> A single watch seems to take about 100 bytes. I am currently at 14528037 
> watches, and according to the YourKit profiler, WatchManager already uses 
> 1.2 GB. This is not going to work, as it could end up needing 20 GB of RAM 
> just for the watches.
> So we need a more compact way of storing watches. Here are the possible 
> solutions.
> 1. Use a bitmap instead of the current hashmap. In this approach, each znode 
> would get a unique id when it gets created. For every session, we keep a 
> bitmap that indicates the set of znodes this session is watching. A bitmap, 
> assuming 100K znodes, would be about 12 KB. For 10K sessions, we could track 
> the watches in roughly 120 MB instead of 20 GB. (A sketch follows below this 
> description.)
> 2. The second idea is based on the observation that clients watch znodes in 
> sets (for example, all znodes under a folder). Multiple clients watch the 
> same set, and the total number of sets is a couple of orders of magnitude 
> smaller than the total number of znodes. In my scenario, there are about 100 
> sets. So instead of tracking watches at the znode level, track them at the 
> set level. This may mean that get also needs to be implemented at the set 
> level. With this, we could store the watches in about 100 MB.
> Are there any other suggestions of solutions?
> Thanks
>  
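For illustration, here is a minimal sketch of idea 1 above (not from the 
ticket; all class and method names are hypothetical): one BitSet per session, 
indexed by a compact per-znode id. At 100K znodes, a session's bitmap is 
roughly 12.5 KB, matching the arithmetic in the description.

{code:java}
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the bitmap idea: instead of a HashMap of watcher
// sets per path, keep one BitSet per session, indexed by a compact znode id.
public class BitmapWatchManager {

    // path -> compact znode id, assigned once when the znode is created
    private final Map<String, Integer> znodeIds = new ConcurrentHashMap<>();

    // session id -> bitmap of watched znode ids
    private final Map<Long, BitSet> sessionWatches = new ConcurrentHashMap<>();

    private int nextId = 0;

    public synchronized int registerZnode(String path) {
        return znodeIds.computeIfAbsent(path, p -> nextId++);
    }

    // Note: BitSet itself is not thread-safe; a real implementation would
    // need per-session locking or a concurrent bitmap structure.
    public void addWatch(long sessionId, String path) {
        Integer id = znodeIds.get(path);
        if (id != null) {
            sessionWatches.computeIfAbsent(sessionId, s -> new BitSet()).set(id);
        }
    }

    public boolean isWatching(long sessionId, String path) {
        Integer id = znodeIds.get(path);
        BitSet bits = sessionWatches.get(sessionId);
        return id != null && bits != null && bits.get(id);
    }
}
{code}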



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4829) Support DatadirCleanup in minutes

2024-04-24 Thread Purshotam Shah (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840507#comment-17840507
 ] 

Purshotam Shah commented on ZOOKEEPER-4829:
---

I'll be contributing a patch.

Indeed, the problem was resolved by switching to a larger disk. The system 
generated many snapshots, and with the deletion interval set at one hour, the 
disk filled up quickly.

However, we only needed to retain a few snapshots; with more frequent cleanup, 
the larger disk would not have been necessary.

> Support DatadirCleanup in minutes
> -
>
> Key: ZOOKEEPER-4829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4829
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Purshotam Shah
>Priority: Major
>
> In the cloud, disk space can be limited. Currently, DatadirCleanup only 
> supports purge intervals in hours; we should also support intervals in 
> minutes.
>  
> {code:java}
> 2024-02-20 20:55:28,862 - WARN  
> [QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
>  - Exception when following the leader
> java.io.IOException: No space left on device
>     at java.base/java.io.FileOutputStream.writeBytes(Native Method)
>     at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
>     at 
> org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
>     at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>     at 
> java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
>     at java.base/sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:316)
>     at java.base/sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:153)
>     at java.base/java.io.OutputStreamWriter.flush(OutputStreamWriter.java:251)
>     at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:257)
>     at 
> org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:72)
>     at 
> org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:54)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:2229)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:2258)
>     at 
> org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:511)
>     at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:91)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1551)
> 2024-02-20 20:55:28,863 - INFO  
> [QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@145]{code}
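For illustration, here is a hedged sketch of what minute-granularity purging 
could look like. The config shape and scheduler wiring are hypothetical, not 
the actual patch; PurgeTxnLog is ZooKeeper's real purge entry point, referenced 
only in a comment here.

{code:java}
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: schedule the purge task from a (value, unit) pair so an
// operator could configure e.g. purgeInterval=30 with purgeIntervalUnit=MINUTES
// instead of being limited to whole hours.
public class MinutePurgeScheduler {

    public static Timer start(long interval, TimeUnit unit) {
        long periodMs = unit.toMillis(interval);
        Timer timer = new Timer("PurgeTask", true); // daemon thread
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                // A real implementation would call
                // PurgeTxnLog.purge(dataDir, snapDir, snapRetainCount) here.
                System.out.println("purging old snapshots and txn logs");
            }
        }, periodMs, periodMs);
        return timer;
    }
}
{code}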



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-3938) Upgrade jline to version 3.x.

2024-04-24 Thread Andor Molnar (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840474#comment-17840474
 ] 

Andor Molnar commented on ZOOKEEPER-3938:
-

[~lkovacs] I assigned the ticket to you and added you to ZK contributors list. 
Thanks for working on this!

> Upgrade jline to version 3.x.
> -
>
> Key: ZOOKEEPER-3938
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3938
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: java client
>Reporter: LiYvbo
>Assignee: Luca Kovacs
>Priority: Major
> Attachments: image-2020-09-17-15-10-26-491.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> JLine 3 has several versions. JLine 3.x is an evolution of JLine 2.x, but 
> some APIs are deprecated.
> JLine should be upgraded.
> !image-2020-09-17-15-10-26-491.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (ZOOKEEPER-3938) Upgrade jline to version 3.x.

2024-04-24 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar reassigned ZOOKEEPER-3938:
---

Assignee: Luca Kovacs

> Upgrade jline to version 3.x.
> -
>
> Key: ZOOKEEPER-3938
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3938
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: java client
>Reporter: LiYvbo
>Assignee: Luca Kovacs
>Priority: Major
> Attachments: image-2020-09-17-15-10-26-491.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> JLine 3 has several versions. JLine 3.x is an evolution of JLine 2.x, but 
> some APIs are deprecated.
> JLine should be upgraded.
> !image-2020-09-17-15-10-26-491.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4830) zk_learners is incorrectly referenced as zk_followers

2024-04-23 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen resolved ZOOKEEPER-4830.
--
  Assignee: Zili Chen
Resolution: Fixed

master via 
https://github.com/apache/zookeeper/commit/ee994fbca51f826e4b26d6a105866975d0007f6e.

> zk_learners is incorrectly referenced as zk_followers
> -
>
> Key: ZOOKEEPER-4830
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4830
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.4, 3.9.2
>Reporter: Nicholas Feinberg
>Assignee: Zili Chen
>Priority: Trivial
>  Labels: pull-request-available
>   Original Estimate: 10m
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3117 renamed the 
> `zk_followers` metric to `zk_learners`, but some references to `zk_followers` 
> remained in the repo, including in the documentation. These should be 
> corrected.
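Illustrative only (this is not the fix itself): a small scanner that lists 
files in a checkout still mentioning the old metric name, so the remaining 
references can be found and corrected.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Walk a source tree and print every file that still contains "zk_followers".
public class FindStaleMetricName {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        try (var paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                try {
                    if (Files.readString(p).contains("zk_followers")) {
                        System.out.println(p);
                    }
                } catch (IOException ignored) {
                    // skip binary or unreadable files
                }
            });
        }
    }
}
{code}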



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4830) zk_learners is incorrectly referenced as zk_followers

2024-04-23 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen updated ZOOKEEPER-4830:
-
Fix Version/s: 3.10.0

> zk_learners is incorrectly referenced as zk_followers
> -
>
> Key: ZOOKEEPER-4830
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4830
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.4, 3.9.2
>Reporter: Nicholas Feinberg
>Assignee: Zili Chen
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.10.0
>
>   Original Estimate: 10m
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3117 renamed the 
> `zk_followers` metric to `zk_learners`, but some references to `zk_followers` 
> remained in the repo, including in the documentation. These should be 
> corrected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4830) zk_learners is incorrectly referenced as zk_followers

2024-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4830:
--
Labels: pull-request-available  (was: )

> zk_learners is incorrectly referenced as zk_followers
> -
>
> Key: ZOOKEEPER-4830
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4830
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.4, 3.9.2
>Reporter: Nicholas Feinberg
>Priority: Trivial
>  Labels: pull-request-available
>   Original Estimate: 10m
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3117 renamed the 
> `zk_followers` metric to `zk_learners`, but some references to `zk_followers` 
> remained in the repo, including in the documentation. These should be 
> corrected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4829) Support DatadirCleanup in minutes

2024-04-23 Thread Enrico Olivelli (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840275#comment-17840275
 ] 

Enrico Olivelli commented on ZOOKEEPER-4829:


Would you like to contribute a patch?

 

 

Side comment: this is a pretty unusual request. I have seen many installations 
of ZK in the cloud on k8s, and I have not heard of such a problem before.

Did you consider using a bigger disk? Running maintenance too often may impact 
latency.

> Support DatadirCleanup in minutes
> -
>
> Key: ZOOKEEPER-4829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4829
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Purshotam Shah
>Priority: Major
>
> In the cloud, disk space can be limited. Currently, DatadirCleanup only 
> supports purge intervals in hours; we should also support intervals in 
> minutes.
>  
> {code:java}
> 2024-02-20 20:55:28,862 - WARN  
> [QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
>  - Exception when following the leader
> java.io.IOException: No space left on device
>     at java.base/java.io.FileOutputStream.writeBytes(Native Method)
>     at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
>     at 
> org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
>     at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>     at 
> java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
>     at java.base/sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:316)
>     at java.base/sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:153)
>     at java.base/java.io.OutputStreamWriter.flush(OutputStreamWriter.java:251)
>     at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:257)
>     at 
> org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:72)
>     at 
> org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:54)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:2229)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:2258)
>     at 
> org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:511)
>     at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:91)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1551)
> 2024-02-20 20:55:28,863 - INFO  
> [QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@145]{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4830) zk_learners is incorrectly referenced as zk_followers

2024-04-23 Thread Nicholas Feinberg (Jira)
Nicholas Feinberg created ZOOKEEPER-4830:


 Summary: zk_learners is incorrectly referenced as zk_followers
 Key: ZOOKEEPER-4830
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4830
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.9.2, 3.8.4
Reporter: Nicholas Feinberg


https://issues.apache.org/jira/browse/ZOOKEEPER-3117 renamed the `zk_followers` 
metric to `zk_learners`, but some references to `zk_followers` remained in the 
repo, including in the documentation. These should be corrected.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4829) Support DatadirCleanup in minutes

2024-04-23 Thread Purshotam Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purshotam Shah updated ZOOKEEPER-4829:
--
Description: 
In the cloud, disk space can be limited. Currently, DatadirCleanup only 
supports purge intervals in hours; we should also support intervals in minutes.
 
{code:java}
2024-02-20 20:55:28,862 - WARN  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
 - Exception when following the leader
java.io.IOException: No space left on device
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at 
org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
    at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
    at 
java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
    at java.base/sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:316)
    at java.base/sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:153)
    at java.base/java.io.OutputStreamWriter.flush(OutputStreamWriter.java:251)
    at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:257)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:72)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:54)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:2229)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:2258)
    at 
org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:511)
    at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:91)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1551)
2024-02-20 20:55:28,863 - INFO  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@145]{code}

  was:
In the cloud, disk space can be limited. Currently, DatadirCleanup only 
supports purge intervals in hours; we should also support intervals in minutes.
 

2024-02-20 20:55:28,862 - WARN  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
 - Exception when following the leader
java.io.IOException: No space left on device
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at 
org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
    at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
    at 
java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
    at java.base/sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:316)
    at java.base/sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:153)
    at java.base/java.io.OutputStreamWriter.flush(OutputStreamWriter.java:251)
    at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:257)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:72)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:54)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:2229)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:2258)
    at 
org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:511)
    at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:91)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1551)
2024-02-20 20:55:28,863 - INFO  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@145]


> Support DatadirCleanup in minutes
> -
>
> Key: ZOOKEEPER-4829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4829
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Purshotam Shah
>Priority: Major
>
> In the cloud, disk space can be limited. Currently, DatadirCleanup only 
> supports purge intervals in hours; we should also support intervals in 
> minutes.
>  
> {code:java}
> 2024-02-20 20:55:28,862 - WARN  
> [QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
>  - Exception when following the leader
> java.io.IOException: No space left on device
>     at java.base/java.io.FileOutputStream.writeBytes(Native Method)
>     at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
>     at 
> org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
>     at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>     at 
> java.base/sun.nio.

[jira] [Created] (ZOOKEEPER-4829) Support DatadirCleanup in minutes

2024-04-23 Thread Purshotam Shah (Jira)
Purshotam Shah created ZOOKEEPER-4829:
-

 Summary: Support DatadirCleanup in minutes
 Key: ZOOKEEPER-4829
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4829
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Purshotam Shah


In the cloud, disk space can be limited. Currently, DatadirCleanup only 
supports purge intervals in hours; we should also support intervals in minutes.
 

2024-02-20 20:55:28,862 - WARN  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@131]
 - Exception when following the leader
java.io.IOException: No space left on device
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at 
org.apache.zookeeper.common.AtomicFileOutputStream.write(AtomicFileOutputStream.java:72)
    at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
    at 
java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
    at java.base/sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:316)
    at java.base/sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:153)
    at java.base/java.io.OutputStreamWriter.flush(OutputStreamWriter.java:251)
    at java.base/java.io.BufferedWriter.flush(BufferedWriter.java:257)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:72)
    at 
org.apache.zookeeper.common.AtomicFileWritingIdiom.(AtomicFileWritingIdiom.java:54)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:2229)
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:2258)
    at 
org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:511)
    at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:91)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1551)
2024-02-20 20:55:28,863 - INFO  
[QuorumPeer[myid=5](plain=disabled)(secure=[0:0:0:0:0:0:0:0]:50512):o.a.z.s.q.Follower@145]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-3938) Upgrade jline to version 3.x.

2024-04-23 Thread Luca Kovacs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840007#comment-17840007
 ] 

Luca Kovacs commented on ZOOKEEPER-3938:


[~LiYvbo] I'm planning to work on this ticket in the near future. 
Are you okay with me assigning it to myself?

> Upgrade jline to version 3.x.
> -
>
> Key: ZOOKEEPER-3938
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3938
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: java client
>Reporter: LiYvbo
>Priority: Major
> Attachments: image-2020-09-17-15-10-26-491.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> JLine 3 has several versions. JLine 3.x is an evolution of JLine 2.x, but 
> some APIs are deprecated.
> JLine should be upgraded.
> !image-2020-09-17-15-10-26-491.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4828) Minor 3.9 broke custom TLS setup with ssl.context.supplier.class

2024-04-22 Thread Jon Marius Venstad (Jira)
Jon Marius Venstad created ZOOKEEPER-4828:
-

 Summary: Minor 3.9 broke custom TLS setup with 
ssl.context.supplier.class
 Key: ZOOKEEPER-4828
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4828
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Jon Marius Venstad


We run embedded ZooKeeper in Vespa and use a custom TLS stack where, for 
example, our TLS trust manager performs additional validation and authorisation 
of client certificates.
The changes in 
https://github.com/apache/zookeeper/commit/4a794276d3d371071c31f86c14da824fdd2e53c0,
done for ZOOKEEPER-4622, broke the `ssl.context.supplier.class` configuration 
parameter documented in the ZK admin guide
(https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_configuration).
Consequently, the current code (3.9.2) enforces file-based key and trust stores 
for _any TLS_, which is not an option for us.

I looked at two ways to fix this:
1. Add new configuration parameters for _key_ and _trust_ store _suppliers_, as 
an alternative to the key and trust store _files_ required by the 
ClientX509Util code that is new in 3.9.0. This adds another pair of config 
options, of which there are already plenty, and the user is stuck with the 
default JDK `Provider` (the optional argument to 
SSLContext.getInstance(protocol, provider)). On the other hand, it lets users 
with a custom key and trust store use Netty's native SSL support. Netty also 
allows specifying a JDK `Provider` in its SslContextBuilder, so that _could_ be 
made configurable as well.
2. Restore the option of specifying a custom SSL context, and prefer it over 
the Netty SslContextBuilder in the new ClientX509Util code when present. This 
lets users specify a JDK `Provider`, but file-based key and trust stores will 
still be required for the native SSL support added in 3.9.0.

I don't have a strong opinion on which option is better, and can contribute a 
code change for either.
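For context, a minimal sketch of the kind of hook involved. The assumption here 
(based on the admin guide's description of ssl.context.supplier.class) is that 
the configured class implements java.util.function.Supplier<SSLContext>; the 
trust manager below is a placeholder for the custom validation described above.

{code:java}
import java.security.cert.X509Certificate;
import java.util.function.Supplier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Sketch of a custom SSL context supplier wrapping a custom trust manager.
public class CustomSslContextSupplier implements Supplier<SSLContext> {

    // Placeholder: a real trust manager would perform the additional
    // validation and authorisation of client certificates described above.
    private static final class ValidatingTrustManager implements X509TrustManager {
        public void checkClientTrusted(X509Certificate[] chain, String authType) { /* custom checks */ }
        public void checkServerTrusted(X509Certificate[] chain, String authType) { /* custom checks */ }
        public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
    }

    @Override
    public SSLContext get() {
        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, new TrustManager[] { new ValidatingTrustManager() }, null);
            return ctx;
        } catch (Exception e) {
            throw new RuntimeException("failed to build custom SSLContext", e);
        }
    }
}
{code}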

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4825) CVE-2023-6378 is present in the current logback version (1.2.13) and hence we need to upgrade to 1.4.12

2024-04-17 Thread Vivek (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838311#comment-17838311
 ] 

Vivek commented on ZOOKEEPER-4825:
--

can this be assigned to me?

> CVE-2023-6378 is present in the current logback version (1.2.13) and hence we 
> need to upgrade to 1.4.12
> ---
>
> Key: ZOOKEEPER-4825
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4825
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: Bhavya hoda
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4825) CVE-2023-6378 is present in the current logback version (1.2.13) and hence we need to upgrade to 1.4.12

2024-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4825:
--
Labels: pull-request-available  (was: )

> CVE-2023-6378 is present in the current logback version (1.2.13) and hence we 
> need to upgrade to 1.4.12
> ---
>
> Key: ZOOKEEPER-4825
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4825
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: Bhavya hoda
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4827) Bump bouncycastle version from 1.75 to 1.78

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4827:
--
Labels: pull-request-available  (was: )

> Bump bouncycastle version from 1.75 to 1.78
> --
>
> Key: ZOOKEEPER-4827
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4827
> Project: ZooKeeper
>  Issue Type: Task
>Reporter: ZhangJian He
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade Bouncy Castle to 1.78 to address CVEs
> https://bouncycastle.org/releasenotes.html#r1rv78
> - https://www.cve.org/CVERecord?id=CVE-2024-29857 (reserved)
>   - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6613079
> - https://www.cve.org/CVERecord?id=CVE-2024-30171 (reserved)
>   - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6613076
> - https://www.cve.org/CVERecord?id=CVE-2024-30172 (reserved)
>   - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6612984



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4827) Bump bouncycastle version from 1.75 to 1.78

2024-04-16 Thread ZhangJian He (Jira)
ZhangJian He created ZOOKEEPER-4827:
---

 Summary: Bump bouncycastle version from 1.75 to 1.78
 Key: ZOOKEEPER-4827
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4827
 Project: ZooKeeper
  Issue Type: Task
Reporter: ZhangJian He


Upgrade Bouncy Castle to 1.78 to address CVEs
https://bouncycastle.org/releasenotes.html#r1rv78

- https://www.cve.org/CVERecord?id=CVE-2024-29857 (reserved)
  - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6613079
- https://www.cve.org/CVERecord?id=CVE-2024-30171 (reserved)
  - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6613076
- https://www.cve.org/CVERecord?id=CVE-2024-30172 (reserved)
  - https://security.snyk.io/vuln/SNYK-JAVA-ORGBOUNCYCASTLE-6612984



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4826) Reduce unnecessary executable permissions on files

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4826:
--
Labels: pull-request-available  (was: )

> Reduce unnecessary executable permissions on files
> --
>
> Key: ZOOKEEPER-4826
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4826
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: ZhangJian He
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ***Summary:*** This patch modifies the permissions of various files within 
> the ZooKeeper repository that currently have executable permissions (755) but 
> do not require them for their operation. Changing these permissions to 644 
> enhances security and keeps file permissions consistent throughout the 
> project.
> ***Details:*** Several non-executable files (not including scripts or 
> executable binaries) are currently set with executable permissions. This is 
> generally unnecessary and can lead to potential security concerns. This patch 
> adjusts these permissions to a more appropriate setting (644), which is 
> sufficient for reading and writing but does not allow execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4826) Reduce unnecessary executable permissions on files

2024-04-16 Thread ZhangJian He (Jira)
ZhangJian He created ZOOKEEPER-4826:
---

 Summary: Reduce unnecessary executable permissions on files
 Key: ZOOKEEPER-4826
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4826
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: ZhangJian He


***Summary:*** This patch modifies the permissions of various files within the 
ZooKeeper repository that currently have executable permissions (755) but do 
not require them for their operation. Changing these permissions to 644 
enhances security and keeps file permissions consistent throughout the project.

***Details:*** Several non-executable files (not including scripts or 
executable binaries) are currently set with executable permissions. This is 
generally unnecessary and can lead to potential security concerns. This patch 
adjusts these permissions to a more appropriate setting (644), which is 
sufficient for reading and writing but does not allow execution.
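For reference, the equivalent operation expressed in Java NIO on a POSIX 
filesystem (illustrative only; the patch itself simply changes file modes in 
the repository):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermissions;

// Clear the execute bits on a file, i.e. change mode 755 to 644.
public class DropExecuteBit {
    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        Files.setPosixFilePermissions(file, PosixFilePermissions.fromString("rw-r--r--"));
    }
}
{code}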



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4712) Follower.shutdown() and Observer.shutdown() do not correctly shutdown the syncProcessor, which may lead to data inconsistency

2024-04-13 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4712:
--
Affects Version/s: 3.9.1
   3.9.0

> Follower.shutdown() and Observer.shutdown() do not correctly shutdown the 
> syncProcessor, which may lead to data inconsistency
> -
>
> Key: ZOOKEEPER-4712
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4712
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.5.10, 3.6.3, 3.7.0, 3.8.0, 3.7.1, 3.6.4, 3.9.0, 3.8.1, 
> 3.9.1
>Reporter: Sirius
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Follower.shutdown() and Observer.shutdown() do not correctly shutdown the 
> syncProcessor. This may lead to data inconsistency (see {*}Potential 
> Risk{*}).
>  
> A follower / observer will invoke syncProcessor.shutdown() in 
> LearnerZooKeeperServer.shutdown() / ObserverZooKeeperServer.shutdown(), 
> respectively.
> However, after the 
> [FIX|https://github.com/apache/zookeeper/commit/efbd660e1c4b90a8f538f2cccb5dcb7094cf9a22]
>  of ZOOKEEPER-3642, Follower.shutdown() / Observer.shutdown() will not invoke 
> LearnerZooKeeperServer.shutdown() / ObserverZooKeeperServer.shutdown() 
> anymore.
>  
> h2. Method Invocation Path
> h5. Version 3.8.1 / 3.8.0 / 3.7.1 / 3.7.0 / 3.6.4 / 3.6.3 / 3.5.10 ...
>  * *(Buggy)* Observer.shutdown() -> Learner.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  * *(Buggy)* Follower.shutdown() -> Learner.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  * (For comparison) Leader.shutdown(String) ->  LeaderZooKeeper.shutdown() -> 
> ZooKeeperServer.shutdown() -> ZooKeeperServer.shutdown(boolean)
>  
> h5. For comparison, in version 3.4.X,
>  * Observer.shutdown() -> Learner.shutdown() -> 
> {*}ObserverZooKeeperServer.shutdown(){*} -> ZooKeeperServer.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  * Follower.shutdown() -> Learner.shutdown() -> 
> {*}FollowerZooKeeperServer.shutdown(){*} -> ZooKeeperServer.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  
> h5. Or, in version 3.6.0,
>  * Observer.shutdown() -> Learner.shutdown() -> 
> {*}LearnerZooKeeperServer.shutdown(){*} -> ZooKeeperServer.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  * Follower.shutdown() -> Learner.shutdown() -> 
> {*}LearnerZooKeeperServer.shutdown(){*} -> ZooKeeperServer.shutdown() -> 
> ZooKeeperServer.shutdown(boolean)
>  
> h2. Code Details
> Take version 3.8.0 as an example.
> In Follower.shutdown() :
> {code:java}
>     public void shutdown() {
>         LOG.info("shutdown Follower");
> +       // invoke Learner.shutdown()
>         super.shutdown();   
>     } {code}
>  
> In Learner.java:
> {code:java}
>     public void shutdown() {
>         ...
>         // shutdown previous zookeeper
>         if (zk != null) {
>             // If we haven't finished SNAP sync, force fully shutdown
>             // to avoid potential inconsistency
> +           // This will invoke ZooKeeperServer.shutdown(boolean), 
> +           // which will not shutdown syncProcessor
> +           // Before the fix of ZOOKEEPER-3642, 
> +           // FollowerZooKeeperServer.shutdown() will be invoked here
>             zk.shutdown(self.getSyncMode().equals(QuorumPeer.SyncMode.SNAP)); 
>          }
>     } {code}
>  
> In ZooKeeperServer.java:
> {code:java}
>     public synchronized void shutdown(boolean fullyShutDown) {
>         ...
>         if (firstProcessor != null) {
> +           // For a follower, this will not shutdown its syncProcessor.
>             firstProcessor.shutdown(); 
>         }
>         ...
>     } {code}
>  
> In expectation, Follower.shutdown() should invoke 
> LearnerZooKeeperServer.shutdown() to shutdown the syncProcessor:
> {code:java}
>     public synchronized void shutdown() {
>         ...
>         try {
> +           // shutdown the syncProcessor here
>             if (syncProcessor != null) {
>                 syncProcessor.shutdown();     
>             }
>         } ...
>     } {code}
> Observer.shutdown() has a similar problem.
>  
> h2. Potential Risk
> When Follower.shutdown() is called, the follower's QuorumPeer thread may 
> update the lastProcessedZxid for the election and recovery phase before its 
> syncThread drains the pending requests and flushes them to disk.
> In consequence, this lastProcessedZxid is not the latest zxid in its log, 
> leading to log inconsistency after the SYNC phase. (Similar to the symptoms 
> of ZOOKEEPER-2845.)
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4685) Unnecessary system unavailability due to Leader shutdown when follower sent ACK of PROPOSAL before sending ACK of NEWLEADER in log recovery

2024-04-13 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4685:
--
Affects Version/s: 3.9.1
   3.9.0

> Unnecessary system unavailability due to Leader shutdown when follower sent 
> ACK of PROPOSAL before sending ACK of NEWLEADER in log recovery
> ---
>
> Key: ZOOKEEPER-4685
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4685
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.6.3, 3.7.0, 3.8.0, 3.7.1, 3.9.0, 3.8.1, 3.9.1
>Reporter: Sirius
>Priority: Major
>
> When a follower is processing the NEWLEADER message in the SYNC phase, its 
> QuorumPeer thread will call {{logRequest(..)}} to submit the txn persistence 
> task to the SyncThread. The SyncThread may persist txns and reply ACKs for 
> them before replying ACK-LD (i.e., the ACK of NEWLEADER) to the leader. As a 
> consequence, the leader may fail to collect enough ACK-LDs, leading to the 
> leader's shutdown and a new round of election. This introduces unnecessary 
> recovery procedures, consumes extra time before servers get into the 
> BROADCAST phase, and significantly reduces the service's availability. 
> The following trace can be generated in the latest version nowadays.
>  
> h2. Trace
> Start the ensemble with three nodes: S{+}0{+}, +S1+ & {+}S2{+}.
>  - +S2+ is elected leader.
>  - +S2+ logs a new txn <1, 1> and makes a broadcast.
>  - +S0+ restarts & +S1+ crashes before receiving the proposal of <1, 1>.
>  - +S2+ is elected leader again. 
>  - +S2+ syncs with +S0+ using DIFF, and sends the proposal of <1, 1> during 
> SYNC.
>  - After +S0+ receives NEWLEADER, {+}S0{+}'s SyncThread may persist the txn 
> <1, 1> and reply the corresponding ACK to the leader +S2+ before {+}S0{+}'s 
> QuorumPeer thread replies ACK-LD to the leader +S2+. (This is possible 
> because txn logging is processed asynchronously by the SyncThread!)
>  - The corresponding LearnerHandler on +S2+ cannot recognize the ACK of some 
> proposal before ACK-LD, and will be blocked at _waitForStartup()_ until the 
> leader turns its state to {_}state.RUNNING{_}.
>  - However, the QuorumPeer of the leader +S2+ will not receive enough number 
> of ACK-LDs before timeout, and then throws _InterruptedException_ during 
> {_}waitForNewLeaderAck(..){_}.
>  - After that, the leader will shutdown and a new round of election is 
> raised, which consumes extra time for establishing the quorum and reduces 
> availability a lot.
>  
> h2. Analysis
> *Root Cause:*
> Similar to ZOOKEEPER-4646, the root cause lies in the asynchronous execution 
> across multiple threads on the follower side. The implementation adopts a 
> multi-threaded style for performance, but it may introduce subtle underlying 
> bugs.
> When the follower receives NEWLEADER, it calls {{logRequest(..)}} to submit 
> the logging requests to SyncRequestProcessor's queue before replying ACK-LD. 
> The SyncThread may be scheduled to persist the txns and reply ACK(s) before 
> the QuorumPeer thread replies ACK-LD.
> On the leader side, the corresponding learnerHandler will not recognize the 
> ACK of PROPOSAL during _waitForNewLeaderAck(..)_ , and then will be blocked 
> at _waitForStartup()_ until the leader turns its state to 
> {_}state.RUNNING{_}. However, the leader may not receive enough ACK-LDs 
> before timeout, and then throws _InterruptedException_ during 
> {_}waitForNewLeaderAck(..){_}. After that, the leader will shutdown and the 
> servers go into a new election and recovery process. 
> This recovery procedure consumes extra recovery time and increases system 
> unavailability. To some degree it is unnecessary: it can be fixed and 
> optimized by guaranteeing the message order on the follower side.
>  
> *Affected Versions:*
> The above trace has been generated in multiple versions such as 3.7.1 & 3.8.1 
> (the latest stable & current version till now) by our testing tools. The 
> affected versions might be more, since the critical partial order between the 
> follower's replying ACK-LD and replying ACK of PROPOSAL during SYNC stay 
> non-deterministic in multiple versions.
>  
> h2. Possible Fix
> Considering this issue and ZOOKEEPER-4646 , one possible fix is to guarantee 
> the following partial orders to be satisfied:
>  * The follower replies ACK of PROPOSAL only after it rep

[jira] [Updated] (ZOOKEEPER-4646) Committed txns may still be lost if followers crash after replying ACK of NEWLEADER but before writing txns to disk

2024-04-13 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4646:
--
Affects Version/s: 3.9.1
   3.9.0

> Committed txns may still be lost if followers crash after replying ACK of 
> NEWLEADER but before writing txns to disk
> ---
>
> Key: ZOOKEEPER-4646
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4646
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.6.3, 3.7.0, 3.8.0, 3.7.1, 3.9.0, 3.8.1, 3.9.1
>Reporter: Sirius
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Trace-ZK-4646.pdf
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a follower is processing the NEWLEADER message in SYNC phase, its 
> QuorumPeer thread will call {{logRequest(..)}} to submit the txn persistence 
> task to the SyncThread. The SyncThread will persist txns asynchronously and 
> does not promise to finish the task before the follower replies ACK-LD (i.e. 
> ACK of NEWLEADER) to the leader, which may lead to committed data loss.
> This problem was first raised in ZOOKEEPER-3911. However, the fix for 
> ZOOKEEPER-3911 does not solve the problem at its root; the following trace 
> can still be generated in the latest version.
>  
> h2. Trace
> [^Trace-ZK-4646.pdf]
> The trace is basically the same as the one in ZOOKEEPER-3911 (See the first 
> comment provided by [~hanm] in that issue).  For convenience we use the zxid 
> to represent a txn here.
> Start the ensemble with three nodes: S{+}0{+}, +S1+ & {+}S2{+}.
>  - +S2+ is elected leader.
>  - All of them have the same log with the last zxid <1, 3>.
>  - +S2+ logs a new txn <1, 4> and makes a broadcast.
>  - +S0+ & +S1+ crash before they receive the proposal of <1, 4>.
>  - +S0+ & +S1+ restart.
>  - +S2+ is elected leader again. 
>  - +S0+ & +S1+ DIFF sync with +S2+ .
>  - +S0+ & +S1+ send ACK-LD to +S2+ *before* their SyncThreads log txns to 
> disk. (This is possible because txn logging is processed asynchronously! )
>  - Verify clients of +S2+ have the view of <1, 4>.
>  - The followers +S0+ & +S1+ crash *before* their SyncThreads persist txns to 
> disk. (This is extremely timing sensitive but possible! )
>  - +S0+ & +S1+ restart, and +S2+ crashes.
>  - Verify clients of +S0+ & +S1+ do NOT have the view of <1, 4>, a violation 
> of ZAB.
>  
> Extra note: The trace can be constructed with quorum nodes alive at any 
> moment with careful time tuning of node shutdown & restart, e.g., let +S0+ & 
> +S1+ shutdown and restart one by one in a short time.
>  
> h2. Analysis
> *Root Cause:*
> The root cause lies in asynchronous execution across multiple threads.
> When a follower replies ACK-LD, it should promise that it has already logged 
> the initial history of the leader (according to ZAB). However, txn logging is 
> executed by the SyncThread asynchronously, so the above promise can be 
> violated. It is possible that, after the leader receives ACK-LD, believing 
> that the responding follower has been in sync, and then gets into the 
> BROADCAST phase, while in fact the history of the follower is not in sync 
> yet. At this time, environment failures might prevent the follower from 
> logging successfully. When that node with stale or incomplete committed 
> history is elected leader later, it might lose txns that have been committed 
> and applied on the former leader node.
> The implementation adopts a multi-threaded style for performance 
> optimization. However, it may introduce subtle bugs that do not occur at the 
> protocol level. The fix for ZOOKEEPER-3911 simply calls {{logRequest(..)}} to 
> submit the logging requests to SyncRequestProcessor's queue before replying 
> ACK-LD inside the NEWLEADER processing logic, without further considering the 
> risk of asynchronous execution across multiple threads. When the follower 
> replies ACK-LD and then crashes before its SyncThread writes txns to disk, 
> the problem is triggered.
>  
> *Property Violation:*
> From the server side, the committed log of the ensemble does not append 
> monotonically; different nodes have inconsistent committed logs. From the 
> client side, clients connected to different nodes may have inconsistent 
> views. A client may read stale data after a newer version is obtained. That 
> newer version can on

[jira] [Updated] (ZOOKEEPER-4643) Committed txns may be improperly truncated if follower crashes right after updating currentEpoch but before persisting txns to disk

2024-04-13 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4643:
--
Affects Version/s: 3.9.0

> Committed txns may be improperly truncated if follower crashes right after 
> updating currentEpoch but before persisting txns to disk
> ---
>
> Key: ZOOKEEPER-4643
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4643
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.6.3, 3.7.0, 3.8.0, 3.7.1, 3.9.0, 3.8.1, 3.9.1
>Reporter: Sirius
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Trace-ZK-4643.pdf
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When a follower is processing the NEWLEADER message in SYNC phase, it will 
> update its {{_currentEpoch_}} to the file *before* writing the txns (from the 
> PROPOSALs sent by leader in SYNC) to the log file. Such execution order may 
> lead to improper truncation of *committed* txns on other servers in later 
> rounds.
> The critical step to trigger this problem is to make a follower node crash 
> right after it updates its {{_currentEpoch_}} file but before writing the 
> txns to the log file. The risk is that this node, with incomplete committed 
> txns, might later be elected leader thanks to its larger {{_currentEpoch_}}, 
> and then improperly use TRUNC to ask other nodes to truncate their committed 
> txns!
>  
> h2. Trace
> [^Trace-ZK-4643.pdf]
> Here is an example to trigger the bug. (Focus on {{_currentEpoch_}} and 
> {{{}_lastLoggedZxid_{}}})
> {*}Round 1 (Running nodes with their acceptedEpoch & currentEpoch set to 1):{*}
>  - Start the ensemble with three nodes: S{+}0{+}, +S1+ & {+}S2{+}.
>  - +S2+ is elected leader.
> - For all of them, _{{currentEpoch}}_ = 1, {{_lastLoggedZxid_}} (the last 
> zxid in the log) = <1, 3>, {{_lastProcessedZxid_}} = <1, 3>.
>  - +S0+ crashes.
>  - A new txn <1, 4> is logged and committed by +S1+ & {+}S2{+}. Then, +S1+ & 
> +S2+ have {{_lastLoggedZxid_}} = <1, 4>, {{_lastProcessedZxid_}} = <1, 4> .
>  - Verify clients can read the datatree with latest zxid <1, 4>. 
> {*}Round 2 (Running nodes with their acceptedEpoch & currentEpoch set to 2):{*}
>  * +S0+ & +S2+ restart, and +S1+ crashes.
>  * Again, +S2+ is elected leader.
>  * Then, during the SYNC phase, the leader +S2+ ({{{}_maxCommittedLog_{}}} = 
> <1, 4>) uses DIFF to sync with the follower +S0+ ({{{}_lastLoggedZxid_{}}} = 
> <1, 3>), and their {{_currentEpoch_}} will be set to 2 (and written to disk).
>  * ( Note that the follower +S0+ updates its currentEpoch file before writing 
> the txns to the log file when receiving NEWLEADER message. )
>  * *Unfortunately, right after the follower +S0+ finishes updating its 
> currentEpoch file, it crashes.*
> {*}Round 3 (Running nodes with their acceptedEpoch & currentEpoch set to 3):{*}
>  * +S0+ & +S1+ restart, and +S2+ crashes.
>  * Since +S0+ has {{_currentEpoch_}} = 2, +S1+ has {{_currentEpoch_}} = 1, 
> +S0+ will be elected leader.
>  * During the SYNC phase, the leader +S0+ ({{{}_maxCommittedLog_{}}} = <1, 
> 3>) will use TRUNC to sync with +S1+ ({{{}_lastLoggedZxid_{}}} = <1, 4>). 
> Then, +S1+ removes txn <1, 4>.  
>  * ( However, <1, 4> was committed and visible by clients before, and is not 
> supposed to be truncated! )
>  * Verify clients of +S0+ & +S1+ do NOT have the view of txn <1, 4>, a 
> violation of ZAB.
>  
> Extra note: The trace can be constructed with quorum nodes alive at any 
> moment with careful time tuning of node crash & restart, e.g., let +S1+ 
> restart before +S0+ crashes at the end of Round 2.
>  
> h2. Analysis
> *Root Cause:*
> When a follower updates its current epoch, it should guarantee that it has 
> already synced the uncommitted txns to disk (or taken a snapshot). Otherwise, 
> if the current epoch has been updated on disk while the follower's history 
> (transaction log) has not, environment failures might prevent the log update 
> from completing. It is dangerous for a node with an updated current epoch but 
> stale history to be elected leader: it might truncate committed txns on other 
> nodes.
>  
> *Property Violation:*
>  * From the server side, the ensemble deletes a committed txn, which is not 

[jira] [Updated] (ZOOKEEPER-4643) Committed txns may be improperly truncated if follower crashes right after updating currentEpoch but before persisting txns to disk

2024-04-13 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4643:
--
Affects Version/s: 3.9.1

> Committed txns may be improperly truncated if follower crashes right after 
> updating currentEpoch but before persisting txns to disk
> ---
>
> Key: ZOOKEEPER-4643
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4643
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.6.3, 3.7.0, 3.8.0, 3.7.1, 3.8.1, 3.9.1
>Reporter: Sirius
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Trace-ZK-4643.pdf
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When a follower is processing the NEWLEADER message in SYNC phase, it will 
> update its {{_currentEpoch_}} to the file *before* writing the txns (from the 
> PROPOSALs sent by leader in SYNC) to the log file. Such execution order may 
> lead to improper truncation of *committed* txns on other servers in later 
> rounds.
> The critical step to trigger this problem is to make a follower node crash 
> right after it updates its {{_currentEpoch_}} file but before writing the 
> txns to the log file. The risk is that this node, with incomplete committed 
> txns, might later be elected leader thanks to its larger {{_currentEpoch_}}, 
> and then improperly use TRUNC to ask other nodes to truncate their committed 
> txns!
>  
> h2. Trace
> [^Trace-ZK-4643.pdf]
> Here is an example to trigger the bug. (Focus on {{_currentEpoch_}} and 
> {{{}_lastLoggedZxid_{}}})
> {*}Round 1 (Running nodes with their acceptedEpoch & currentEpoch set to 1):{*}
>  - Start the ensemble with three nodes: S{+}0{+}, +S1+ & {+}S2{+}.
>  - +S2+ is elected leader.
> - For all of them, _{{currentEpoch}}_ = 1, {{_lastLoggedZxid_}} (the last 
> zxid in the log) = <1, 3>, {{_lastProcessedZxid_}} = <1, 3>.
>  - +S0+ crashes.
>  - A new txn <1, 4> is logged and committed by +S1+ & {+}S2{+}. Then, +S1+ & 
> +S2+ have {{_lastLoggedZxid_}} = <1, 4>, {{_lastProcessedZxid_}} = <1, 4> .
>  - Verify clients can read the datatree with latest zxid <1, 4>. 
> {*}Round 2 (Running nodes with their acceptedEpoch & currentEpoch set to 2):{*}
>  * +S0+ & +S2+ restart, and +S1+ crashes.
>  * Again, +S2+ is elected leader.
>  * Then, during the SYNC phase, the leader +S2+ ({{{}_maxCommittedLog_{}}} = 
> <1, 4>) uses DIFF to sync with the follower +S0+ ({{{}_lastLoggedZxid_{}}} = 
> <1, 3>), and their {{_currentEpoch_}} will be set to 2 (and written to disk).
>  * ( Note that the follower +S0+ updates its currentEpoch file before writing 
> the txns to the log file when receiving NEWLEADER message. )
>  * *Unfortunately, right after the follower +S0+ finishes updating its 
> currentEpoch file, it crashes.*
> {*}Round 3 (Running nodes with their acceptedEpoch & currentEpoch set to 3):{*}
>  * +S0+ & +S1+ restart, and +S2+ crashes.
>  * Since +S0+ has {{_currentEpoch_}} = 2, +S1+ has {{_currentEpoch_}} = 1, 
> +S0+ will be elected leader.
>  * During the SYNC phase, the leader +S0+ ({{{}_maxCommittedLog_{}}} = <1, 
> 3>) will use TRUNC to sync with +S1+ ({{{}_lastLoggedZxid_{}}} = <1, 4>). 
> Then, +S1+ removes txn <1, 4>.  
>  * ( However, <1, 4> was committed and visible by clients before, and is not 
> supposed to be truncated! )
>  * Verify clients of +S0+ & +S1+ do NOT have the view of txn <1, 4>, a 
> violation of ZAB.
>  
> Extra note: The trace can be constructed with quorum nodes alive at any 
> moment with careful time tuning of node crash & restart, e.g., let +S1+ 
> restart before +S0+ crashes at the end of Round 2.
>  
> h2. Analysis
> *Root Cause:*
> When a follower updates its current epoch, it should guarantee that it has 
> already synced the uncommitted txns to disk (or taken a snapshot). Otherwise, 
> if the current epoch has been updated on disk while the follower's history 
> (transaction log) has not, environment failures might prevent the log update 
> from completing. It is dangerous for a node with an updated current epoch but 
> stale history to be elected leader: it might truncate committed txns on other 
> nodes.
>  
> *Property Violation:*
>  * From the server side, the ensemble deletes a committed txn, which is not 
> a

[jira] [Created] (ZOOKEEPER-4825) CVE-2023-6378 is present in the current logback version (1.2.13) and hence we need to upgrade to 1.4.12

2024-04-11 Thread Bhavya hoda (Jira)
Bhavya hoda created ZOOKEEPER-4825:
--

 Summary: CVE-2023-6378 is present in the current logback version 
(1.2.13) and hence we need to upgrade to 1.4.12
 Key: ZOOKEEPER-4825
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4825
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Bhavya hoda






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4824) fix CVE-2024-29025 in netty package

2024-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4824:
--
Labels: pull-request-available  (was: )

> fix CVE-2024-29025 in netty package
> ---
>
> Key: ZOOKEEPER-4824
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4824
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Nikita Pande
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [CVE-2024-29025|https://github.com/advisories/GHSA-5jpm-x58v-624v] is the CVE 
> for all netty-codec-http <  4.1.108.Final



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4824) fix CVE-2024-29025 in netty package

2024-04-10 Thread Nikita Pande (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835762#comment-17835762
 ] 

Nikita Pande commented on ZOOKEEPER-4824:
-

Current version is 4.1.105.Final.
Please assign this Jira to me.

> fix CVE-2024-29025 in netty package
> ---
>
> Key: ZOOKEEPER-4824
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4824
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Nikita Pande
>Priority: Major
>
> [CVE-2024-29025|https://github.com/advisories/GHSA-5jpm-x58v-624v] is the CVE 
> for all netty-codec-http <  4.1.108.Final



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4824) fix CVE-2024-29025 in netty package

2024-04-10 Thread Nikita Pande (Jira)
Nikita Pande created ZOOKEEPER-4824:
---

 Summary: fix CVE-2024-29025 in netty package
 Key: ZOOKEEPER-4824
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4824
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Nikita Pande


[CVE-2024-29025|https://github.com/advisories/GHSA-5jpm-x58v-624v] is the CVE 
for all netty-codec-http <  4.1.108.Final



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-04-09 Thread Piotr Karwasz (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835538#comment-17835538
 ] 

Piotr Karwasz commented on ZOOKEEPER-4820:
--

In this case I find both the {{optional}} attribute and the {{provided}} scope 
inappropriate for the {{zookeeper}} artifact.

If Zookeeper were to have an {{optional}} dependency on Logback, it might 
suggest that some part of Zookeeper uses Logback *directly*, e.g. to manage and 
change log levels. AFAIK that is not the case.

If Zookeeper were to have a dependency on Logback in the {{provided}} scope, I 
would interpret it as a strict requirement to have Logback on the runtime 
classpath. AFAIK any other logging backend works just as well.
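
For illustration, a client that codes against the slf4j-api facade alone is 
backend-neutral: whichever single binding (Logback, Log4j 2, etc.) happens to 
be on the runtime classpath receives the output. A minimal sketch (the class 
name is made up for this example):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Depends only on the slf4j-api facade; no Logback types are referenced.
public class ZkClientLoggingSketch {
    private static final Logger LOG = LoggerFactory.getLogger(ZkClientLoggingSketch.class);

    public static void main(String[] args) {
        // With no binding on the classpath, slf4j prints a one-time warning
        // and falls back to a no-operation logger - nothing requires Logback.
        LOG.info("connected to ensemble {}", "localhost:2181");
    }
}
{code}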

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client-side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as a client-side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client-side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client-side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4820:
--
Labels: pull-request-available  (was: )

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client-side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as a client-side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client-side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client-side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more precise and practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Description: 
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of [Zab 1.0 
(wiki)|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in several 
aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is therefore highly recommended to update the design 
of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
more precise and practical as a guide for the code implementation.
h3. Update Step 2.{*}f{*} by removing the requirement of atomicity

A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity requirement 
removed, is given below.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER({_}e{_}) packet. Once it receives NEWLEADER({_}e{_}), 
-_it atomically applies the new state, and then sets f.currentEpoch = e. It 
then sends ACK(e << 32)._- *it executes the following actions sequentially:*

 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note:
 * To ensure correctness without requiring atomicity, the follower must 
persist and sync the data before it updates its currentEpoch and replies with 
the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the code implementation in the latest code 
version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
{-}ZOOKEEPER-4785{-} (see the code fixes in 
[PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
[PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including the 
[high-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 & the [multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec is implemented by 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2). In the verification, the TLC model checker checks 
whether the new design satisfies the properties given by the Zab paper. No 
violation was found during checking with various configurations.
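
To make the intended ordering concrete, here is a minimal, hypothetical Java 
sketch of Step 2.{*}f{*} without atomicity (class, field, and method names are 
illustrative, not ZooKeeper's actual internals): the synced data is made 
durable first, the epoch is persisted second, and only then is the ACK sent.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified follower; names are illustrative only. */
public class FollowerStep2fSketch {
    private final Queue<byte[]> syncedTxns = new ArrayDeque<>();
    private long currentEpoch;

    /** Step 2.f without atomicity: three ordered, individually durable actions. */
    long onNewLeader(int e) {
        // 2.1 apply the new state: persist and fsync the synced history first,
        //     so a crash can never leave a newer currentEpoch with stale data.
        while (!syncedTxns.isEmpty()) {
            appendAndFsync(syncedTxns.poll());
        }
        // 2.2 only after the data is durable, persist the new epoch.
        currentEpoch = e;        // stands in for rewriting the currentEpoch file
        // 2.3 send ACK(e << 32): the new epoch in the high 32 bits of the zxid.
        return ((long) e) << 32; // stands in for sending the ACK packet
    }

    private void appendAndFsync(byte[] txn) {
        // stand-in for the transaction log append + force-to-disk
    }
}
{code}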

 

We sincerely hope that the above update of the protocol design can be presented 
on the wiki page and better guide the future code implementation!

 

About us:

We are a research team using TLA+ to verify the correctness of distributed 
systems.

Looking forward to receiving feedback from the ZooKeeper community!

  was:
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of [Zab 1.0(wiki 
page)|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in several 
aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is highly recommended to update the design of 
Zab without requir

[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more precise and practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Description: 
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of [Zab 1.0(wiki 
page)|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in several 
aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is therefore highly recommended to update the design 
of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
more precise and practical as a guide for the code implementation.
h3. Update Step 2.{*}f{*} by removing the requirement of atomicity

A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity requirement 
removed, is given below.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER({_}e{_}) packet. Once it receives NEWLEADER({_}e{_}), 
-_it atomically applies the new state, and then sets f.currentEpoch = e. It 
then sends ACK(e << 32)._- *it executes the following actions sequentially:*

 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note:
 * To ensure correctness without requiring atomicity, the follower must 
persist and sync the data before it updates its currentEpoch and replies with 
the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the code implementation in the latest code 
version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
{-}ZOOKEEPER-4785{-} (see the code fixes in 
[PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
[PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including the 
[high-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 & the [multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec is implemented by 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2). In the verification, the TLC model checker checks 
whether the new design satisfies the properties given by the Zab paper. No 
violation was found during checking with various configurations.

 

We sincerely hope that the above update of the protocol design can be presented 
on the wiki page and better guide the future code implementation!

 

About us:

We are a research team using TLA+ to verify the correctness of distributed 
systems.

Looking forward to receiving feedback from the ZooKeeper community!

  was:
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of 
[Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
several aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is highly recommended to update the design of 
Zab without requiring atom

[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more precise and practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it 
more precise and practical for the implementation  (was: Proposal: Update the 
wiki / design of Zab 1.0 (Phase 2) to make it more practical for the 
implementation)

> Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more 
> precise and practical for the implementation
> 
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
> more precise and practical as a guide for the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation is found during the checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it 
more practical for the implementation  (was: Proposal: Update the wiki of Zab 
1.0 (Phase 2) to make it more practical for the implementation)

> Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more 
> practical for the implementation
> 
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
> more precise and practical as a guide for the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Description: 
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of 
[Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
several aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is therefore highly recommended to update the design 
of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
more precise and practical as a guide for the code implementation.
h3. Update Step 2.{*}f{*} by removing the requirement of atomicity

A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity requirement 
removed, is given below.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER({_}e{_}) packet. Once it receives NEWLEADER({_}e{_}), 
-_it atomically applies the new state, and then sets f.currentEpoch = e. It 
then sends ACK(e << 32)._- *it executes the following actions sequentially:*

 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note:
 * To ensure correctness without requiring atomicity, the follower must 
persist and sync the data before it updates its currentEpoch and replies with 
the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the code implementation in the latest code 
version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
{-}ZOOKEEPER-4785{-} (see the code fixes in 
[PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
[PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including the 
[high-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 & the [multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec is implemented by 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2). In the verification, the TLC model checker checks 
whether the new design satisfies the properties given by the Zab paper. No 
violation was found during checking with various configurations.
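
As a side note on the ACK(e << 32) notation: in ZooKeeper's zxid layout the 
high 32 bits carry the epoch and the low 32 bits a per-epoch counter, so 
ACK(e << 32) is an ACK stamped with the first zxid of epoch e. A small 
self-contained Java helper (hypothetical class and method names) makes the 
packing explicit:

{code:java}
/** Hypothetical helper showing the epoch/counter layout of a zxid. */
public final class ZxidSketch {
    private ZxidSketch() {}

    static long makeZxid(int epoch, int counter) {
        // High 32 bits: epoch; low 32 bits: counter within that epoch.
        return ((long) epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    static int epochOf(long zxid)   { return (int) (zxid >>> 32); }
    static int counterOf(long zxid) { return (int) (zxid & 0xFFFFFFFFL); }

    public static void main(String[] args) {
        long ack = makeZxid(5, 0);                  // ACK(5 << 32)
        System.out.println(Long.toHexString(ack));  // prints 500000000
    }
}
{code}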

 

We sincerely hope that the above update of the protocol design can be presented 
on the wiki page and better guide the future code implementation!

 

About us:

We are a research team using TLA+ to verify the correctness of distributed 
systems.

Looking forward to receiving feedback from the ZooKeeper community!

  was:
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of 
[Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
several aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is highly recommended to update the design of 
Zab without requiring atomicity in Ste

[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical for the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more 
practical for the implementation  (was: Proposal: Update the wiki of Zab 1.0 
(Phase 2) to make it more precise and conform to the implementation)

> Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical for 
> the implementation
> ---
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
> more practical as a guide for the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more 
precise and conform to the implementation  (was: Proposal: Update the wiki of 
Zab 1.0 (Phase 2) to make it more practical and conform to the implementation)

> Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and 
> conform to the implementation
> 
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
> more practical as a guide for the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Description: 
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of 
[Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
several aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is therefore highly recommended to update the design 
of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, making it 
more practical as a guide for the code implementation.
h3. Update Step 2.{*}f{*} by removing the requirement of atomicity

A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity requirement 
removed, is given below.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER({_}e{_}) packet. Once it receives NEWLEADER({_}e{_}), 
-_it atomically applies the new state, and then sets f.currentEpoch = e. It 
then sends ACK(e << 32)._- *it executes the following actions sequentially:*

 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note:
 * To ensure correctness without requiring atomicity, the follower must 
persist and sync the data before it updates its currentEpoch and replies with 
the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the code implementation in the latest code 
version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
{-}ZOOKEEPER-4785{-} (see the code fixes in 
[PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
[PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including the 
[high-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 & the [multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec is implemented by 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2). In the verification, the TLC model checker checks 
whether the new design satisfies the properties given by the Zab paper. No 
violation was found during checking with various configurations.

 

We sincerely hope that the above update of the protocol design can be presented 
on the wiki page and better guide the future code implementation!

 

About us:

We are a research team using TLA+ to verify the correctness of distributed 
systems.

Looking forward to receiving feedback from the ZooKeeper community!

  was:
As ZooKeeper has evolved over the years, its code implementation has deviated 
from the design of 
[Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
several aspects.

One critical deviation lies in the _atomic actions_ required when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current code 
implementation. Asynchronous logging and committing by multiple threads, 
combined with a node crash, can interrupt this process and lead to possible 
data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
{-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step. It is highly recommended to update the design of 
Zab without requiring atomicity in Step 2.{*}f

[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more 
practical and conform to the implementation  (was: Proposal: Update the wiki of 
Zab 1.0 (Phase 2) to make it more precise and conform to the implementation)

> Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more practical and 
> conform to the implementation
> --
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, to better 
> guide the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more 
precise and conform to the implementation  (was: Proposal: Update the wiki / 
design of Zab 1.0 (Phase 2) to make it more precise and conform to the 
implementation)

> Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and 
> conform to the implementation
> 
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, to better 
> guide the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Summary: Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it 
more precise and conform to the implementation  (was: Proposal: Update the wiki 
of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation)

> Proposal: Update the wiki / design of Zab 1.0 (Phase 2) to make it more 
> precise and conform to the implementation
> -
>
> Key: ZOOKEEPER-4823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Sirius
>Priority: Major
>
> As ZooKeeper has evolved over the years, its code implementation has deviated 
> from the design of 
> [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] in 
> several aspects.
> One critical deviation lies in the _atomic actions_ required when a follower 
> receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
> follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch 
> = {_}e{_}". However, this atomicity is not guaranteed by the current code 
> implementation. Asynchronous logging and committing by multiple threads, 
> combined with a node crash, can interrupt this process and lead to possible 
> data loss (see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, 
> {-}ZOOKEEPER-4785{-}).
> On the other hand, implementing atomicity is expensive and hurts performance. 
> It is reasonable to adopt an implementation that does not require atomic 
> updates in this step. It is therefore highly recommended to update the design 
> of Zab so that it no longer requires atomicity in Step 2.{*}f{*}, to better 
> guide the code implementation.
> h3. Update Step 2.{*}f{*} by removing the requirement of atomicity
> A possible design of Step 2.{*}f{*} in Phase 2, with the atomicity 
> requirement removed, is given below.
> h4. Phase 2: Sync with followers
>  # *l* ...
>  # *f* The follower syncs with the leader, but doesn't modify its state until 
> it receives the NEWLEADER({_}e{_}) packet. Once it receives 
> NEWLEADER({_}e{_}), -_it atomically applies the new state, and then sets 
> f.currentEpoch = e. It then sends ACK(e << 32)._- *it executes the following 
> actions sequentially:*
>  * *2.1. applies the new state;*
>  * *2.2. sets f.currentEpoch = e;*
>  * *2.3. sends ACK(e << 32).*
>       3. *l* ...
>  
> Note:
>  * To ensure correctness without requiring atomicity, the follower must 
> persist and sync the data before it updates its currentEpoch and replies 
> with the NEWLEADER ACK (see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).
>  * This new design conforms to the code implementation in the latest code 
> version (ZooKeeper v3.9.2), which has fixed the known data-loss issues that 
> long stayed unresolved due to non-atomic executions in Step 2.{*}f{*}, 
> including {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & 
> {-}ZOOKEEPER-4785{-} (see the code fixes in 
> [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152]).
>  * The correctness of this new design has been verified with the TLA+ 
> specifications of Zab at different abstraction levels, including the 
> [high-level protocol 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
>  (developed based on the original [protocol 
> spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
>  & the [multi-threading-level 
> specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
>  (developed based on the original [system 
> spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
>  This spec is implemented by 
> [PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix 
> more known issues in Phase 2). In the verification, the TLC model checker 
> checks whether the new design satisfies the properties given by the Zab 
> paper. No violation was found during checking with various configurations.
>  
> We sincerely hope that the above update of the protocol design can be 
> presented on the wiki page and better guide the future code implementation!
>  
> About us:
> We are a research team using TLA+ to verify the correctness of distributed 
> systems.
> Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation

2024-04-03 Thread Sirius (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sirius updated ZOOKEEPER-4823:
--
Description: 
As ZooKeeper has evolved over the years, its implementation has deviated from 
the design of [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] 
in several respects.

One critical deviation lies in the _atomic actions_ taken when a follower 
receives NEWLEADER (see 2.{*}f{*} in Phase 2). The protocol requires that the 
follower "_*atomically*_ applies the new state and sets {*}f{*}.currentEpoch = 
{_}e{_}". However, this atomicity is not guaranteed by the current 
implementation: asynchronous logging and committing across multiple threads, 
combined with a node crash, can interrupt this process and lead to data loss 
(see {-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646, {-}ZOOKEEPER-4785{-}).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step, and it is highly recommended to update the design of Zab 
so that Step 2.{*}f{*} no longer requires atomicity and better guides the code 
implementation.
h3. Update Step 2.{*}f{*} by removing the requirement of atomicity

Here is a possible design of Step 2.{*}f{*} in Phase 2 with the atomicity 
requirement removed.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER({_}e{_}) packet. Once it receives NEWLEADER({_}e{_}), 
-_it atomically applies the new state, and then sets f.currentEpoch = e. It 
then sends ACK(e << 32)._- *it executes the following actions sequentially:*

 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note:
 * To ensure correctness without atomicity, the follower must persist and sync 
its data before it updates its currentEpoch and replies with the NEWLEADER ACK 
(see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the implementation in the latest code version 
(ZooKeeper v3.9.2), which has fixed the known data loss issues that long 
remained unresolved due to non-atomic execution in Step 2.{*}f{*}, including 
{-}ZOOKEEPER-3911{-}, ZOOKEEPER-4643, ZOOKEEPER-4646 & {-}ZOOKEEPER-4785{-} 
(see the code fixes in [PR-2111|https://github.com/apache/zookeeper/pull/2111] 
& [PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including the 
[high-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 & the [multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec corresponds to 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2). In the verification, the TLC model checker checked 
whether the new design satisfies the properties given by the Zab paper; no 
violation was found in any of the checked configurations.
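
For illustration only, the ordering required by Step 2.{*}f{*} could be 
sketched as below. The class and method names here are hypothetical (the real 
follower logic lives elsewhere in the ZooKeeper code base); the sketch only 
makes the persist-before-epoch ordering concrete:

{code:java}
// A minimal sketch of Step 2.f, assuming hypothetical names.
class FollowerSyncSketch {
    long currentEpoch;

    void onNewLeader(long e) {
        applyNewState();       // 2.1 apply the state synced from the leader...
        persistAndSyncData();  //     ...and make it durable before going further
        currentEpoch = e;      // 2.2 only then advance f.currentEpoch
        sendAck(e << 32);      // 2.3 ACK(e << 32): epoch e in the high 32 bits of the zxid
    }

    void applyNewState()      { /* apply the leader's diff or snapshot locally */ }
    void persistAndSyncData() { /* write and fsync the txn log / snapshot */ }
    void sendAck(long zxid)   { /* send ACK(zxid) back to the leader */ }
}
{code}

Reversing this order is exactly what permits the data loss described above: a 
crash after the currentEpoch update but before the data is durable leaves a 
follower that claims epoch e without the corresponding state.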

 

We sincerely hope that the above update to the protocol design can be presented 
on the wiki page, where it can better guide future code implementation!

 

About us:

We are a research team using TLA+ to verify the correctness of distributed 
systems.

Looking forward to receiving feedback from the ZooKeeper community!


[jira] [Created] (ZOOKEEPER-4823) Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it more precise and conform to the implementation

2024-04-03 Thread Sirius (Jira)
Sirius created ZOOKEEPER-4823:
-

 Summary: Proposal: Update the wiki of Zab 1.0 (Phase 2) to make it 
more precise and conform to the implementation
 Key: ZOOKEEPER-4823
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4823
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Sirius


As ZooKeeper has evolved over the years, its implementation has deviated from 
the design of [Zab 1.0|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0] 
in several respects.

One critical deviation lies in the _atomic actions_ taken when a follower 
receives NEWLEADER (see 2.*f* in Phase 2).

The protocol requires that the follower "_*atomically*_ applies the new state 
and sets *f*.currentEpoch = _e_". However, this atomicity is not guaranteed by 
the current implementation: asynchronous logging and committing across 
multiple threads, combined with a node crash, can interrupt this process and 
lead to data loss (see -ZOOKEEPER-3911-, ZOOKEEPER-4643, ZOOKEEPER-4646, 
-ZOOKEEPER-4785-).

On the other hand, implementing atomicity is expensive and hurts performance. 
It is reasonable to adopt an implementation that does not require atomic 
updates in this step, and it is highly recommended to update the design of Zab 
so that Step 2.*f* no longer requires atomicity and better guides the code 
implementation.
h3. Update Step 2.*f* by removing the requirement of atomicity

Here is a possible design of Step 2.*f* in Phase 2 with the atomicity 
requirement removed.
h4. Phase 2: Sync with followers
 # *l* ...
 # *f* The follower syncs with the leader, but doesn't modify its state until 
it receives the NEWLEADER(_e_) packet. Once it receives NEWLEADER(_e_), -_it 
atomically applies the new state, and then sets f.currentEpoch = e. It then 
sends ACK(e << 32)._- *it executes the following actions sequentially:*
 * *2.1. applies the new state;*
 * *2.2. sets f.currentEpoch = e;*
 * *2.3. sends ACK(e << 32).*

      3. *l* ...

 

Note: 
 * To ensure correctness without atomicity, the follower must persist and sync 
its data before it updates its currentEpoch and replies with the NEWLEADER ACK 
(see the analysis in ZOOKEEPER-4643 & ZOOKEEPER-4785).

 * This new design conforms to the implementation in the latest code version 
(ZooKeeper v3.9.2), which has fixed the known data loss issues that long 
remained unresolved due to non-atomic execution in Step 2.*f*, including 
-ZOOKEEPER-3911-, ZOOKEEPER-4643, ZOOKEEPER-4646 & -ZOOKEEPER-4785- (see the 
code fixes in [PR-2111|https://github.com/apache/zookeeper/pull/2111] & 
[PR-2152|https://github.com/apache/zookeeper/pull/2152]).

 * The correctness of this new design has been verified with the TLA+ 
specifications of Zab at different abstraction levels, including

 ** [High-level protocol 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/Zab_new.tla]
 (developed based on the original [protocol 
spec|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/protocol-spec/Zab.tla])
 

 ** [Multi-threading-level 
specification|https://github.com/AlphaCanisMajoris/zookeeper-tla-spec/blob/main/zk_pr_2152.tla]
 (developed based on the original [system 
spec.|https://github.com/apache/zookeeper/blob/master/zookeeper-specifications/system-spec/zk-3.7/ZkV3_7_0.tla]
 This spec corresponds to 
[PR-2152|https://github.com/apache/zookeeper/pull/2152], an effort to fix more 
known issues in Phase 2.)

In the verification, the TLC model checker checked whether the new design 
satisfies the properties given by the Zab paper. No violation was found in any 
of the checked configurations.

 

We sincerely hope that the above update to the protocol design can be presented 
on the wiki page, where it can better guide future code implementation!

 

About us: 

We are a research team using TLA+ to verify the correctness of distributed 
systems. 

Looking forward to receiving feedback from the ZooKeeper community!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4822) Quorum TLS - Enable member authorization based on certificate CN

2024-03-29 Thread Damien Diederen (Jira)
Damien Diederen created ZOOKEEPER-4822:
--

 Summary: Quorum TLS - Enable member authorization based on 
certificate CN
 Key: ZOOKEEPER-4822
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4822
 Project: ZooKeeper
  Issue Type: New Feature
  Components: server
Reporter: Damien Diederen
Assignee: Damien Diederen


Quorum TLS enables mutual authentication of quorum members.

Member authorization, however, cannot be configured on the basis of the 
presented principal CN; a round of SASL authentication has to be performed on 
top of the secured connection.

This ticket is about enabling authorization based on trusted client 
certificates.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4821) ConnectRequest got NOTREADONLY ReplyHeader

2024-03-28 Thread Kezhu Wang (Jira)
Kezhu Wang created ZOOKEEPER-4821:
-

 Summary: ConnectRequest got NOTREADONLY ReplyHeader
 Key: ZOOKEEPER-4821
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4821
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client, server
Affects Versions: 3.9.2, 3.8.4
Reporter: Kezhu Wang


I would expect {{ConnectRequest}} to have two kinds of response under normal 
conditions: {{ConnectResponse}} and socket close. But if the server is 
configured with {{readonlymode.enabled}} but not {{localSessionsEnabled}}, the 
client can get {{NOTREADONLY}} in reply to {{ConnectRequest}}. I saw, at least, 
no handling of this in the Java client, and I encountered it while writing 
tests for the Rust client.

I guess this is not by design, and we could probably close the socket at an 
early phase. Alternatively, it could be handled on the client side, since 
{{sizeof(ConnectResponse)}} is larger than {{sizeof(ReplyHeader)}}. That way we 
gain the ability to carry an error for {{ConnectRequest}}, which 
{{ConnectResponse}} otherwise cannot.
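
For illustration, the size-based disambiguation could look roughly like the 
sketch below. This is a hypothetical helper, not actual client code, and the 
16-byte figure is an assumption based on a serialized {{ReplyHeader}} being 
xid (4) + zxid (8) + err (4):

{code:java}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class ConnectReplySketch {
    // Assumed jute size of ReplyHeader: int xid + long zxid + int err.
    static final int REPLY_HEADER_LEN = 16;

    /** Returns the error code if the frame is a bare ReplyHeader, else 0. */
    static int errorOf(byte[] frame) throws IOException {
        if (frame.length > REPLY_HEADER_LEN) {
            return 0; // long enough to be a ConnectResponse
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
        in.readInt();        // xid
        in.readLong();       // zxid
        return in.readInt(); // err, e.g. NOTREADONLY
    }
}
{code}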



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread Christopher Tubbs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831487#comment-17831487
 ] 

Christopher Tubbs edited comment on ZOOKEEPER-4820 at 3/27/24 5:57 PM:
---

Provided scope may achieve the same result for dependency resolution in 
dependent projects, but the two scopes mean very different things. Provided 
means the dependency is required but not expected to be bundled, because the 
destination environment is expected to supply it. In this case, we want the 
opposite: it's not required, but it is expected to be bundled.

There may be times when you need to use provided instead of optional because a 
particular Maven plugin must handle the dependency in a specific way during 
the build; different plugins treat these scopes differently. The correct way 
to communicate the intent here is to mark the dependency optional and keep the 
scope as runtime. If provided is needed instead to make a particular plugin 
work, you should understand that this may cause other problems downstream, 
and, at the very least, you should add a comment in the POM explaining why 
provided is used, so the intention is clear.

For what it's worth, I think provided dependencies will be excluded by the 
maven-assembly-plugin, which is not what you want (though it may not matter if 
it is declared explicitly in the assembly descriptor). I'd stick with 
optional if it works.
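
For reference, the optional-plus-runtime shape described above would look 
roughly like this in the POM. This is a sketch, assuming logback-classic is 
the artifact in question; the version property name is a placeholder:

{code:xml}
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <version>${logback.version}</version>
  <!-- bundled in the server distribution, but not forced onto consumers -->
  <scope>runtime</scope>
  <optional>true</optional>
</dependency>
{code}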



> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as a client-side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client-side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client-side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread Christopher Tubbs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831487#comment-17831487
 ] 

Christopher Tubbs edited comment on ZOOKEEPER-4820 at 3/27/24 5:55 PM:
---

Provided scope may achieve the same result for dependency resolution for 
dependent projects. However, they mean very different things. Provided means 
it's required, but not expected to be bundled because it is expected to be 
provided by the destination environment. In this case, we want the opposite: 
it's not required, but is expected to be bundled. However, there may be times 
where you need to use provided instead of using optional, because you need it 
to be handled in a specific way by a Maven plugin during the build, because 
different plugins do different things with these. For this, the correct way of 
doing it that communicates the relevant intent is to mark it optional and keep 
the scope as runtime. However, if provided is needed instead, to make a 
particular plugin work, you should understand that this may cause other 
problems downstream, and at the very least, you should add a comment in the POM 
explaining why provided is used instead, so the intention is clear.


was (Author: ctubbsii):
Provided scope may achieve the same result for dependency resolution in 
dependent projects. However, the two scopes mean very different things. 
Provided means the dependency is required, but is not expected to be bundled, 
because the destination environment is expected to provide it. In this case, 
we want the opposite: it's not required, but it is expected to be bundled. 
There may be times when you need to use provided instead of optional, because 
you need the dependency to be handled in a specific way by a Maven plugin 
during the build; different plugins do different things with these scopes. 
The correct way of doing it that communicates the relevant intent is to mark 
the dependency optional and keep the scope as runtime. However, if provided is 
needed instead to make a particular plugin work, you should be prepared for it 
to cause other problems downstream, and at the very least you should add a 
comment in the POM explaining why provided is used, so the intention is clear.

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as client side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread Christopher Tubbs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831487#comment-17831487
 ] 

Christopher Tubbs commented on ZOOKEEPER-4820:
--

Provided scope may achieve the same result for dependency resolution in 
dependent projects. However, the two scopes mean very different things. 
Provided means the dependency is required, but is not expected to be bundled, 
because the destination environment is expected to provide it. In this case, 
we want the opposite: it's not required, but it is expected to be bundled. 
There may be times when you need to use provided instead of optional, because 
you need the dependency to be handled in a specific way by a Maven plugin 
during the build; different plugins do different things with these scopes. 
The correct way of doing it that communicates the relevant intent is to mark 
the dependency optional and keep the scope as runtime. However, if provided is 
needed instead to make a particular plugin work, you should be prepared for it 
to cause other problems downstream, and at the very least you should add a 
comment in the POM explaining why provided is used, so the intention is clear.

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as client side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread Christopher Tubbs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831484#comment-17831484
 ] 

Christopher Tubbs edited comment on ZOOKEEPER-4820 at 3/27/24 5:45 PM:
---

In case anybody wants to work on this, the right way to do this in Maven is to 
mark the dependency optional in addition to making it runtime:
{code:xml}
<optional>true</optional>
<scope>runtime</scope>
{code}
However, if you use any build automation, you have to be careful that changing 
the dependency in this way doesn't cause it to be omitted from a distribution 
assembly, like a .tar.gz or .zip file. This might happen if you use a 
maven-assembly-plugin assembly descriptor that bundles dependencies, but omits 
optional ones, for example. I'm not sure if the default behavior of 
[includeDependencies|https://maven.apache.org/plugins/maven-assembly-plugin/assembly.html]
 will include optional ones or not.
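
To illustrate the concern, a sketch of an assembly-descriptor fragment of the 
kind that could be affected (hypothetical, not ZooKeeper's actual descriptor):
{code:xml}
<!-- Hypothetical fragment of a maven-assembly-plugin descriptor. -->
<dependencySets>
  <dependencySet>
    <outputDirectory>lib</outputDirectory>
    <!-- Runtime scope normally still resolves the project's own optional
         dependencies (optional only stops propagation to downstream
         consumers), but this is worth verifying for your plugin version. -->
    <scope>runtime</scope>
  </dependencySet>
</dependencySets>
{code}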


was (Author: ctubbsii):
In case anybody wants to work on this, the right way to do this in Maven is to 
mark the dependency optional in addition to making it runtime:

{code:xml}
<optional>true</optional>
<scope>runtime</scope>
{code}

However, if you use any build automation, you have to be careful that changing 
the dependency in this way doesn't cause it to be omitted from a distribution 
assembly, like a .tar.gz or .zip file.

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as client side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread PJ Fanning (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831485#comment-17831485
 ] 

PJ Fanning commented on ZOOKEEPER-4820:
---

or `provided`

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as client side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread Christopher Tubbs (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831484#comment-17831484
 ] 

Christopher Tubbs commented on ZOOKEEPER-4820:
--

In case anybody wants to work on this, the right way to do this in Maven is to 
mark the dependency optional in addition to making it runtime:

{code:xml}
<optional>true</optional>
<scope>runtime</scope>
{code}

However, if you use any build automation, you have to be careful that changing 
the dependency in this way doesn't cause it to be omitted from a distribution 
assembly, like a .tar.gz or .zip file.

> zookeeper pom leaks logback dependency
> --
>
> Key: ZOOKEEPER-4820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
> Project: ZooKeeper
>  Issue Type: Task
>  Components: java client
>Reporter: PJ Fanning
>Priority: Major
>
> Since v3.8.0
> https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0
> It's fine that Zookeeper uses Logback on the server side - but users who want 
> to access Zookeeper using client side code also add this zookeeper jar to 
> their classpaths. When zookeeper is used as client side lib, it should 
> ideally not expose a logback dependency - just an slf4j-api jar dependency.
> Would it be possible to rework the zookeeper pom so that client side users 
> don't have to explicitly exclude logback jars? Many users will have their own 
> preferred logging framework.
> Is there another zookeeper client side jar that could be used instead of 
> zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4820) zookeeper pom leaks logback dependency

2024-03-27 Thread PJ Fanning (Jira)
PJ Fanning created ZOOKEEPER-4820:
-

 Summary: zookeeper pom leaks logback dependency
 Key: ZOOKEEPER-4820
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4820
 Project: ZooKeeper
  Issue Type: Task
  Components: java client
Reporter: PJ Fanning


Since v3.8.0

https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.8.0

It's fine that Zookeeper uses Logback on the server side - but users who want 
to access Zookeeper using client side code also add this zookeeper jar to their 
classpaths. When zookeeper is used as client side lib, it should ideally not 
expose a logback dependency - just an slf4j-api jar dependency.

Would it be possible to rework the zookeeper pom so that client side users 
don't have to explicitly exclude logback jars? Many users will have their own 
preferred logging framework.

Is there another zookeeper client side jar that could be used instead of 
zookeeper.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4819) Can't seek for writable tls server if connected to readonly server

2024-03-26 Thread Kezhu Wang (Jira)
Kezhu Wang created ZOOKEEPER-4819:
-

 Summary: Can't seek for writable tls server if connected to 
readonly server
 Key: ZOOKEEPER-4819
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4819
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.9.2, 3.8.4
Reporter: Kezhu Wang


{{[ClientCnxn::pingRwServer|https://github.com/apache/zookeeper/blob/d12aba599233b0fcba0b9b945ed3d2f45d4016f0/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1280]}}
 uses a raw socket to issue the "isro" 4lw command. This results in an 
unsuccessful handshake with a TLS server.
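
For illustration, a minimal sketch of such a probe (the IsroProbe class is 
hypothetical; the TLS branch assumes the JVM's default SSL context is already 
configured with a suitable truststore):
{code:java}
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLSocketFactory;

public class IsroProbe {
    // Sends the "isro" four-letter-word command and returns the reply,
    // normally "rw" or "ro". With tls=false this is roughly what a raw
    // socket probe does; against a TLS-only port the plain write fails
    // because no TLS handshake is performed, which is the bug above.
    static String isro(String host, int port, boolean tls) throws Exception {
        Socket socket = tls
                ? SSLSocketFactory.getDefault().createSocket() // default SSL context
                : new Socket();
        try (Socket s = socket) {
            s.connect(new InetSocketAddress(host, port), 3000);
            OutputStream out = s.getOutputStream();
            out.write("isro".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = s.getInputStream();
            byte[] buf = new byte[16];
            int n = in.read(buf);
            return n > 0 ? new String(buf, 0, n, StandardCharsets.US_ASCII) : "";
        }
    }
}
{code}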



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4805) Update cwiki page with latest changes

2024-03-26 Thread Szucs Villo (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szucs Villo resolved ZOOKEEPER-4805.

Resolution: Fixed

> Update cwiki page with latest changes
> -
>
> Key: ZOOKEEPER-4805
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4805
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Andor Molnar
>Assignee: Szucs Villo
>Priority: Major
>
> Update the following wiki page with latest changes and instructions how to 
> use the script:
> [https://cwiki.apache.org/confluence/display/ZOOKEEPER/Merging+Github+Pull+Requests]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4818) Export JVM heap metrics in ServerMetrics

2024-03-25 Thread Andrew Kyle Purtell (Jira)
Andrew Kyle Purtell created ZOOKEEPER-4818:
--

 Summary: Export JVM heap metrics in ServerMetrics
 Key: ZOOKEEPER-4818
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4818
 Project: ZooKeeper
  Issue Type: Improvement
  Components: metric system
Reporter: Andrew Kyle Purtell


A metric for JVM heap occupancy is not included in ServerMetrics.

According to [https://zookeeper.apache.org/doc/current/zookeeperMonitor.html], 
the recommended practice is to enable the PrometheusMetricsProvider, and the 
Prometheus base class on which that provider is built does export that 
information. The example provided there for alerting on heap utilization is:
{noformat}
  - alert: JvmMemoryFillingUp
expr: jvm_memory_bytes_used / jvm_memory_bytes_max{area="heap"} > 0.8
for: 5m
labels:
  severity: warning
annotations:
  summary: "JVM memory filling up (instance {{ $labels.instance }})"
  description: "JVM memory is filling up (> 80%)\n labels: {{ $labels }}  
value = {{ $value }}\n"
{noformat}
where {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} are provided by a 
Prometheus base class.

Where PrometheusMetricsProvider is the right choice, that's good enough; but 
where the ServerMetrics information is consumed in an alternate way, by 
4-letter-word scraping or by JMX, ServerMetrics should provide the same 
information. {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} (presuming 
heap) are reasonable names. An alternative could be to calculate the heap 
occupancy and provide that as a percentage, either an integer in the range 
0 - 100 or a floating point value in the range 0.0 - 1.0.
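
For reference, a rough sketch of how such values can be read from the JVM (the 
HeapGauges class is illustrative and not part of ZooKeeper):
{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapGauges {
    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long used = heap.getUsed(); // analogous to jvm_memory_bytes_used{area="heap"}
        long max = heap.getMax();   // analogous to jvm_memory_bytes_max; -1 if undefined
        // Occupancy as a fraction in 0.0 - 1.0; multiply by 100 for a percentage.
        double occupancy = max > 0 ? (double) used / max : Double.NaN;
        System.out.printf("heap used=%d max=%d occupancy=%.2f%n", used, max, occupancy);
    }
}
{code}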



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4818) Export JVM heap metrics in ServerMetrics

2024-03-25 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell updated ZOOKEEPER-4818:
---
Description: 
A metric for JVM heap occupancy is not included in ServerMetrics.

According to [https://zookeeper.apache.org/doc/current/zookeeperMonitor.html], 
the recommended practice is to enable the PrometheusMetricsProvider, and the 
Prometheus base class on which that provider is built does export that 
information. The example provided there for alerting on heap utilization is:
{noformat}
  - alert: JvmMemoryFillingUp
expr: jvm_memory_bytes_used / jvm_memory_bytes_max{area="heap"} > 0.8
for: 5m
labels:
  severity: warning
annotations:
  summary: "JVM memory filling up (instance {{ $labels.instance }})"
  description: "JVM memory is filling up (> 80%)\n labels: {{ $labels }}  
value = {{ $value }}\n"
{noformat}
where {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} are provided by a 
Prometheus base class.

Where PrometheusMetricsProvider is the right choice, that's good enough; but 
where the ServerMetrics information is consumed in an alternate way, by 
4-letter-word scraping or by JMX, ServerMetrics should provide the same 
information. {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} (presuming 
heap) are reasonable names. An alternative could be to calculate the heap 
occupancy and provide that as a percentage, either an integer in the range 
0 - 100 or a floating point value in the range 0.0 - 1.0.

There is some precedent for exporting JVM metrics in ServerMetrics from 
ZOOKEEPER-3845.

  was:
A metric for JVM heap occupancy is not included in ServerMetrics.

According to [https://zookeeper.apache.org/doc/current/zookeeperMonitor.html], 
the recommended practice is to enable the PrometheusMetricsProvider, and the 
Prometheus base class on which that provider is built does export that 
information. The example provided there for alerting on heap utilization is:
{noformat}
  - alert: JvmMemoryFillingUp
expr: jvm_memory_bytes_used / jvm_memory_bytes_max{area="heap"} > 0.8
for: 5m
labels:
  severity: warning
annotations:
  summary: "JVM memory filling up (instance {{ $labels.instance }})"
  description: "JVM memory is filling up (> 80%)\n labels: {{ $labels }}  
value = {{ $value }}\n"
{noformat}
where {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} are provided by a 
Prometheus base class.

Where PrometheusMetricsProvider is the right choice, that's good enough; but 
where the ServerMetrics information is consumed in an alternate way, by 
4-letter-word scraping or by JMX, ServerMetrics should provide the same 
information. {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} (presuming 
heap) are reasonable names. An alternative could be to calculate the heap 
occupancy and provide that as a percentage, either an integer in the range 
0 - 100 or a floating point value in the range 0.0 - 1.0.


> Export JVM heap metrics in ServerMetrics
> 
>
>     Key: ZOOKEEPER-4818
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4818
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: metric system
>Reporter: Andrew Kyle Purtell
>Priority: Minor
>
> A metric for JVM heap occupancy is not included in ServerMetrics.
> According to [https://zookeeper.apache.org/doc/current/zookeeperMonitor.html], 
> the recommended practice is to enable the PrometheusMetricsProvider, and the 
> Prometheus base class on which that provider is built does export that 
> information. The example provided there for alerting on heap utilization is:
> {noformat}
>   - alert: JvmMemoryFillingUp
> expr: jvm_memory_bytes_used / jvm_memory_bytes_max{area="heap"} > 0.8
> for: 5m
> labels:
>   severity: warning
> annotations:
>   summary: "JVM memory filling up (instance {{ $labels.instance }})"
>   description: "JVM memory is filling up (> 80%)\n labels: {{ $labels }}  
> value = {{ $value }}\n"
> {noformat}
> where {{jvm_memory_bytes_used}} and {{jvm_memory_bytes_max}} are provided by 
> a Prometheus base class.
> Where PrometheusMetricsProvider is the right choice that's good enough but 
> where the ServerMetrics information is consumed in an alternate way, by 
> 4-letter-word scraping, or by JMX, ServerMetri

[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node3-60.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log, node1-60.log, node2-25.log, node2-60.log, 
> node3-25.log, node3-60.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node2-25.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log, node1-60.log, node2-25.log, node2-60.log, 
> node3-25.log, node3-60.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node1-60.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log, node1-60.log, node2-25.log, node2-60.log, 
> node3-25.log, node3-60.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node3-25.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log, node1-60.log, node2-25.log, node2-60.log, 
> node3-25.log, node3-60.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node2-60.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log, node1-60.log, node2-25.log, node2-60.log, 
> node3-25.log, node3-60.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4817:

Attachment: node1-25.log

> CancelledKeyException does not work in some cases.
> --
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Major
> Attachments: node1-25.log
>
>
> If the client connection is disconnected from the ZooKeeper server, a 
> CancelledKeyException will arise.
> Here is a strange scenario.
> NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.
> If the delay lasts more than 30s, the CancelledKeyException will disappear.
> If the delay lasts for 25s, the CancelledKeyException will arise.
> When doIO encounters the slowdown caused by the fail-slow NIC, the 
> context is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4817) CancelledKeyException does not work in some cases.

2024-03-22 Thread gendong1 (Jira)
gendong1 created ZOOKEEPER-4817:
---

 Summary: CancelledKeyException does not work in some cases.
 Key: ZOOKEEPER-4817
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.10.0
Reporter: gendong1


If the client connection is disconnected from the ZooKeeper server, a 
CancelledKeyException will arise.

Here is a strange scenario.

NIOServerCnxn.doIO is blocked at line 333 by the fail-slow NIC.

If the delay lasts more than 30s, the CancelledKeyException will disappear.

If the delay lasts for 25s, the CancelledKeyException will arise.

When doIO encounters the slowdown caused by the fail-slow NIC, the context 
is the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ZOOKEEPER-4444) Follower doesn't get synchronized after process restart

2024-03-22 Thread Jimmy LIN (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829769#comment-17829769
 ] 

Jimmy LIN edited comment on ZOOKEEPER- at 3/22/24 7:36 AM:
---

This issue happened in my environment; the ZooKeeper follower's data was 
inconsistent with the leader's. A znode was deleted, but could still be read 
from one follower.

My zookeeper version is 3.7.1
 
 2023-09-22 06:14:59,309 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):Learner@737][]
 - Learner received NEWLEADER message
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):QuorumPeer@921][]
 - Peer state changed: following - synchronization
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):CommitProcessor@491][]
 - Configuring CommitProcessor with readBatchSize -1 commitBatchSize 1
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):CommitProcessor@452][]
 - Configuring CommitProcessor with 28 worker threads.
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):FollowerRequestProcessor@59][]
 - Initialized FollowerRequestProcessor with 
zookeeper.follower.skipLearnerRequestToNextProcessor as false
 2023-09-22 06:14:59,311 [myid:3] - WARN  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):MBeanRegistry@110][]
 - Failed to register MBean InMemoryDataTree
 2023-09-22 06:14:59,311 [myid:3] - WARN  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):LearnerZooKeeperServer@104][]
 - Failed to register with JMX
 
org.apache.ZooKeeperService:name0=ReplicatedServer_id3,name1=replica.3,name2=Follower,name3=InMemoryDataTree
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
    at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:106)
    at 
org.apache.zookeeper.server.quorum.LearnerZooKeeperServer.registerJMX(LearnerZooKeeperServer.java:102)
    at 
org.apache.zookeeper.server.ZooKeeperServer.startupWithServerState(ZooKeeperServer.java:715)
    at 
org.apache.zookeeper.server.ZooKeeperServer.startupWithoutServing(ZooKeeperServer.java:698)
    at 
org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:760)
    at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:108)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1522)
 2023-09-22 06:15:00,292 [myid:3] - WARN  
[NIOWorkerThread-18:NIOServerCnxn@380][] - Close of session 0x0
 ZooKeeperServer not running
    at 
org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:554)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:339)
    at 
org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
    at 
org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) // 30065370041
 2023-09-22 06:15:00,292 [myid:3] - WARN  [SyncThread:3:FileTxnLog@275][] - 
Current zxid 30065370041 is <= 30065371040 for 5
 2023-09-22 06:15:00,292 [myid:3] - WARN  
[NIOWorkerThread-10:NIOServerCnxn@380][] - Close of session 0x0
 ZooKeeperServer not running
    at 
org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:554)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:339)
    at 
org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
    at 
org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 2023-09-22 06:15:00,293 [myid:3] - WARN  [SyncThread:3:FileTxnLog@275][] - 
Current zxid 30065370042 is <= 30065371040 for 1
 2023-09-22 06:15:00,293 [myid:3] - WARN  [SyncThread:3:FileTxnL

[jira] [Commented] (ZOOKEEPER-4444) Follower doesn't get synchronized after process restart

2024-03-22 Thread LINFUSHOU (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829769#comment-17829769
 ] 

LINFUSHOU commented on ZOOKEEPER-:
--

This issue happened in my environment; the ZooKeeper follower's data was 
inconsistent with the leader's. A znode was deleted, but could still be read 
from one follower.
 
 2023-09-22 06:14:59,309 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):Learner@737][]
 - Learner received NEWLEADER message
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):QuorumPeer@921][]
 - Peer state changed: following - synchronization
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):CommitProcessor@491][]
 - Configuring CommitProcessor with readBatchSize -1 commitBatchSize 1
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):CommitProcessor@452][]
 - Configuring CommitProcessor with 28 worker threads.
 2023-09-22 06:14:59,310 [myid:3] - INFO  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):FollowerRequestProcessor@59][]
 - Initialized FollowerRequestProcessor with 
zookeeper.follower.skipLearnerRequestToNextProcessor as false
 2023-09-22 06:14:59,311 [myid:3] - WARN  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):MBeanRegistry@110][]
 - Failed to register MBean InMemoryDataTree
 2023-09-22 06:14:59,311 [myid:3] - WARN  
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:9639)(secure=disabled):LearnerZooKeeperServer@104][]
 - Failed to register with JMX
 
org.apache.ZooKeeperService:name0=ReplicatedServer_id3,name1=replica.3,name2=Follower,name3=InMemoryDataTree
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
    at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
    at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:106)
    at 
org.apache.zookeeper.server.quorum.LearnerZooKeeperServer.registerJMX(LearnerZooKeeperServer.java:102)
    at 
org.apache.zookeeper.server.ZooKeeperServer.startupWithServerState(ZooKeeperServer.java:715)
    at 
org.apache.zookeeper.server.ZooKeeperServer.startupWithoutServing(ZooKeeperServer.java:698)
    at 
org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:760)
    at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:108)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1522)
 2023-09-22 06:15:00,292 [myid:3] - WARN  
[NIOWorkerThread-18:NIOServerCnxn@380][] - Close of session 0x0
 ZooKeeperServer not running
    at 
org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:554)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:339)
    at 
org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
    at 
org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) // 30065370041
 2023-09-22 06:15:00,292 [myid:3] - WARN  [SyncThread:3:FileTxnLog@275][] - 
Current zxid 30065370041 is <= 30065371040 for 5
 2023-09-22 06:15:00,292 [myid:3] - WARN  
[NIOWorkerThread-10:NIOServerCnxn@380][] - Close of session 0x0
 ZooKeeperServer not running
    at 
org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:554)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:339)
    at 
org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
    at 
org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 2023-09-22 06:15:00,293 [myid:3] - WARN  [SyncThread:3:FileTxnLog@275][] - 
Current zxid 30065370042 is <= 30065371040 for 1
 2023-09-22 06:15:00,293 [myid:3] - WARN  [SyncThread:3:FileTxnLog@275][] - 
Current zxid 30065370043 is <= 30065371040 for 1
 2023-09-22 06:15:00,

[jira] [Comment Edited] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829395#comment-17829395
 ] 

Xin Chen edited comment on ZOOKEEPER-3975 at 3/21/24 6:56 AM:
--

I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2024-03-18 15:28:52,005 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
... 4 more
{code}
I tried to individually parse and read the transaction log log.300a8108e0 using 
{{zkTxnLogToolkit.sh -d ./log.300a8108e0}} and found that the error occurred 
while loading a specific transaction record in the middle of the log 
({*}approximately the 38448th record{*}). This record is not the last record in 
the transaction log, which {*}consists of over 90,000 records in total{*}. 
Since I couldn't parse it, I couldn't determine what caused this particular 
transaction record to become corrupted. I'm puzzled as to why the corruption 
occurred in the middle of the transaction log instead of at the end. Normally, 
if there were abnormal events such as a power outage or a full disk, the last 
record of the transaction log could be affected. So, I'm wondering what 
circumstances could lead to corruption in the middle of the transaction log, 
resulting in this error. The ZooKeeper process is unable to recover 
automatically and keeps restarting due to this IOException. Indeed, this is a 
very serious issue.

Who has experience with this situation?
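
For context, the "Unreasonable length" error comes from a record-length sanity 
check in jute. A simplified sketch of that guard (modeled on 
org.apache.jute.BinaryInputArchive; the exact limit calculation may differ by 
version):
{code:java}
import java.io.IOException;

public class JuteLengthCheck {
    // jute.maxbuffer defaults to 0xfffff (1048575) bytes unless overridden.
    static final int MAX_BUFFER = Integer.getInteger("jute.maxbuffer", 0xfffff);

    static void checkLength(int len) throws IOException {
        // A record whose length field reads as ~11.8 MB (11807835) trips this
        // guard, whether the field is corrupt or the record really was that big.
        if (len < 0 || len > MAX_BUFFER) {
            throw new IOException("Unreasonable length = " + len);
        }
    }
}
{code}
So a length in the middle of the log that fails this check usually points to a 
damaged length field rather than a genuinely oversized record.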


was (Author: JIRAUSER298666):
I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength

[jira] [Comment Edited] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829395#comment-17829395
 ] 

Xin Chen edited comment on ZOOKEEPER-3975 at 3/21/24 6:42 AM:
--

I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2024-03-18 15:28:52,005 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
... 4 more
{code}
I tried to individually parse and read the transaction log log.300a8108e0 using 
{{zkTxnLogToolkit.sh -d ./log.300a8108e0}} and found that the error occurred 
while loading a specific transaction record in the middle of the log 
({*}approximately the 38448th record{*}). This record is not the last record in 
the transaction log, which {*}consists of over 90,000 records in total{*}. 
Since I couldn't parse it, I couldn't determine what caused this particular 
transaction record to become corrupted. I'm puzzled as to why the corruption 
occurred in the middle of the transaction log instead of at the end. Normally, 
if there were abnormal events such as a power outage or a full disk, the last 
record of the transaction log could be affected. So, I'm wondering what 
circumstances could lead to corruption in the middle of the transaction log, 
resulting in this error. The ZooKeeper process is unable to recover 
automatically and keeps restarting due to this IOException. Indeed, this is a 
very serious issue.

How can I get help?


was (Author: JIRAUSER298666):
I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength

[jira] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ https://issues.apache.org/jira/browse/ZOOKEEPER-3975 ]


Xin Chen deleted comment on ZOOKEEPER-3975:
-

was (Author: JIRAUSER298666):
log

> Zookeeper crashes: Unable to load database on disk java.io.IOException: 
> Unreasonable length
> ---
>
> Key: ZOOKEEPER-3975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3975
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jute
>Affects Versions: 3.6.2
> Environment: Debian 10 x64
> openjdk version "11.0.8" 2020-07-14
> OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)
> OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Debian-1deb10u1, mixed mode, 
> sharing)
>Reporter: Diego Lucas Jiménez
>Priority: Critical
>
> After running for a while, the entire cluster (3 ZooKeepers) crashes suddenly, 
> all of them logging:
>  
> {code:java}
> 2020-10-16 10:37:00,459 [myid:2] - WARN [NIOWorkerThread-4:NIOServerCnxn@373] 
> - Close of session 0x0 java.io.IOException: ZooKeeperServer not running at 
> org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:544) 
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:332) at 
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
>  at 
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)
> 2020-10-16 10:37:00,475 [myid:2] - ERROR 
> [QuorumPeer[myid=2](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1139] - 
> Unable to load database on disk
> java.io.IOException: Unreasonable length = 5089607
> at 
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:166)
> at 
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:127)
> at 
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:159)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:768)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:352)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:258)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:303)
> at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:285)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1093)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:1249)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:868)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:941)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1428){code}
> Apparently the "corrupted" file appears on all the servers, so there is no 
> solution such as "removing version-2 on the faulty server and letting it 
> replicate from a healthy one" :(.
> The entire cluster goes down and we have downtime, every single day, since we 
> upgraded from 3.4.9.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829396#comment-17829396
 ] 

Xin Chen commented on ZOOKEEPER-3975:
-

log

> Zookeeper crashes: Unable to load database on disk java.io.IOException: 
> Unreasonable length
> ---
>
> Key: ZOOKEEPER-3975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3975
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jute
>Affects Versions: 3.6.2
> Environment: Debian 10 x64
> openjdk version "11.0.8" 2020-07-14
> OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)
> OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Debian-1deb10u1, mixed mode, 
> sharing)
>Reporter: Diego Lucas Jiménez
>Priority: Critical
>
> After running for a while, the entire cluster (3 ZooKeepers) crashes suddenly, 
> all of them logging:
>  
> {code:java}
> 2020-10-16 10:37:00,459 [myid:2] - WARN [NIOWorkerThread-4:NIOServerCnxn@373] 
> - Close of session 0x0 java.io.IOException: ZooKeeperServer not running at 
> org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:544) 
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:332) at 
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
>  at 
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)
> 2020-10-16 10:37:00,475 [myid:2] - ERROR 
> [QuorumPeer[myid=2](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1139] - 
> Unable to load database on disk
> java.io.IOException: Unreasonable length = 5089607
> at 
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:166)
> at 
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:127)
> at 
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:159)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:768)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:352)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:258)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:303)
> at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:285)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1093)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:1249)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:868)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:941)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1428){code}
> Apparently the "corrupted" file appears on all the servers, so there is no 
> solution such as "removing version-2 on the faulty server and letting it 
> replicate from a healthy one" :(.
> The entire cluster goes down and we have downtime, every single day, since we 
> upgraded from 3.4.9.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829395#comment-17829395
 ] 

Xin Chen edited comment on ZOOKEEPER-3975 at 3/21/24 6:40 AM:
--

I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2024-03-18 15:28:52,005 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
... 4 more
{code}


I tried to parse the transaction log log.300a8108e0 on its own using 
"zkTxnLogToolkit.sh -d ./log.300a8108e0" and found that the error occurred while 
loading a transaction record in the middle of the log (*approximately 
the 38448th record*). That record is not the last record in the transaction 
log, which *consists of over 90,000 records in total*. Since I couldn't parse 
the record, I couldn't determine what caused it to become corrupted. I'm 
puzzled as to why the corruption occurred in the middle of the transaction log 
instead of at the end: abnormal events such as a power outage or a full disk 
would normally only affect the last record of the log. So I'm wondering what 
circumstances could corrupt the middle of a transaction log and produce this 
error. The ZooKeeper process cannot recover automatically and keeps restarting 
due to this IOException, which makes this a very serious issue.

How can I get help? 
The attachments are the ZooKeeper logs. 
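To locate the failing record programmatically, here is a minimal sketch along 
the lines of what the toolkit does, assuming the {{FileTxnLog.FileTxnIterator}} 
API from the zookeeper-server module (verify the exact signatures against your 
ZooKeeper version):

{code:java}
import java.io.File;
import java.io.IOException;

import org.apache.zookeeper.server.persistence.FileTxnLog;

// Hedged sketch: walk a txn log the way zkTxnLogToolkit.sh -d does and
// report how many records deserialize before the IOException is thrown.
public class FindCorruptTxn {
    public static void main(String[] args) throws IOException {
        File logDir = new File(args[0]); // directory holding log.300a8108e0
        long count = 0;
        FileTxnLog.FileTxnIterator it =
                new FileTxnLog.FileTxnIterator(logDir, 0L); // loads the first record
        try {
            do {
                count++; // the record currently loaded deserialized cleanly
            } while (it.next()); // throws "Unreasonable length" on the corrupt record
            System.out.println("All " + count + " records deserialized cleanly");
        } catch (IOException e) {
            System.out.println("Failed while reading record #" + (count + 1) + ": " + e);
        } finally {
            it.close();
        }
    }
}
{code}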



was (Author: JIRAUSER298666):
I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835

[jira] [Commented] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-21 Thread Xin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829395#comment-17829395
 ] 

Xin Chen commented on ZOOKEEPER-3975:
-

I also encountered this issue in a three-node cluster. It occurred when 
ZooKeeper was unexpectedly offline and then came back online. zk0 and zk2 were 
able to come online successfully, with zk2 being the leader. However, zk1 
encountered this error while loading a particular transaction log 
log.300a8108e0:
{code:java}
2024-03-18 15:28:52,000 [myid:2] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2024-03-18 15:28:52,005 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: Unreasonable length = 11807835
at 
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at 
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at 
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:208)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:632)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:651)
... 4 more
{code}


I tried to parse the transaction log log.300a8108e0 on its own using 
"zkTxnLogToolkit.sh -d ./log.300a8108e0" and found that the error occurred while 
loading a transaction record in the middle of the log (*approximately 
the 38448th record*). That record is not the last record in the transaction 
log, which *consists of over 90,000 records in total*. Since I couldn't parse 
the record, I couldn't determine what caused it to become corrupted. I'm 
puzzled as to why the corruption occurred in the middle of the transaction log 
instead of at the end: abnormal events such as a power outage or a full disk 
would normally only affect the last record of the log. So I'm wondering what 
circumstances could corrupt the middle of a transaction log and produce this 
error. The ZooKeeper process cannot recover automatically and keeps restarting 
due to this IOException, which makes this a very serious issue.

How can I get help? 
The attachments are the ZooKeeper logs.  [^zookeeper.log.2] 


> Zookeeper crashes: Unable to load database on disk java.io.IOException: 
> Unreasonable length
> ---
>
> Key: ZOOKEEPER-3975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3975
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jute
>Affects Versions: 3.6.2
> Environment: Debian 10 x64
> openjdk version "11.0.8" 2020-07-14
> OpenJDK Runtime 

[jira] [Updated] (ZOOKEEPER-4752) Remove version files in zookeeper-server/src/main from .gitignore

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4752:

Fix Version/s: 3.8.5
   3.9.3

> Remove version files in zookeeper-server/src/main from .gitignore
> --
>
> Key: ZOOKEEPER-4752
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4752
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.8.2
>Reporter: Istvan Toth
>Assignee: Zili Chen
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.10.0, 3.8.5, 3.9.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Info.java and VersionInfoMain.java files are currently generated into the 
> target/generated-sources directory. 
> However, .gitignore still includes the following lines for the main src 
> directory.
> {noformat}
> zookeeper-server/src/main/java/org/apache/zookeeper/version/Info.java
> zookeeper-server/src/main/java/org/apache/zookeeper/version/VersionInfoMain.java
> {noformat}
> Let's remove them.
> I've just spent two hours debugging mysterious build failures, which were 
> caused by an old Info.java file in src that didn't show up in git status 
> because of those out-of-date .gitignore entries (see the tip sketched below).
>  
>  
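A hedged tip (standard git, not part of this patch): {{git check-ignore -v}} 
prints the exact .gitignore rule that hides a file, which exposes this kind of 
stale entry immediately:

{noformat}
$ git check-ignore -v zookeeper-server/src/main/java/org/apache/zookeeper/version/Info.java
.gitignore:NN:zookeeper-server/src/main/java/org/apache/zookeeper/version/Info.java	zookeeper-server/src/main/java/org/apache/zookeeper/version/Info.java
{noformat}

(The output format is source:line:pattern followed by the path; NN stands for 
whatever line number the stale entry sits on.)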



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4804) Use daemon threads for Netty client

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4804:

Fix Version/s: 3.9.3

> Use daemon threads for Netty client
> ---
>
> Key: ZOOKEEPER-4804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4804
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.3
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.10.0, 3.9.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When the Netty client is used, the Java process hangs on System.exit if there 
> is an open Zookeeper connection.
> This is caused by the non-daemon threads created by Netty.
> Exiting without closing the connection is not a good practice, but this hang 
> does not happen with the NIO client, and I think ZK should behave the same 
> regardless of the client implementation used.
> The Netty ThreadFactory implementation is configurable, so it shouldn't be too 
> hard to make sure that daemon threads are created (see the sketch below).
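A minimal sketch of that direction, assuming standard Netty API 
({{io.netty.util.concurrent.DefaultThreadFactory}} takes a pool name and a 
daemon flag); how this gets wired into ZooKeeper's client socket is what the 
issue tracks:

{code:java}
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.util.concurrent.DefaultThreadFactory;

// Hedged sketch: build the client event loop from daemon threads so an open
// connection no longer blocks System.exit. The second constructor argument
// of DefaultThreadFactory marks every created thread as a daemon.
public class DaemonNettyGroup {
    public static NioEventLoopGroup create() {
        return new NioEventLoopGroup(
                0, // 0 = Netty picks its default thread count
                new DefaultThreadFactory("zk-netty-client", true)); // daemon = true
    }
}
{code}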



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4813) Make zookeeper start successfully when the last log file is dirty during the restore process

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4813:

Fix Version/s: 3.9.3
   (was: 3.9.2)

> Make zookeeper start successfully when the last log file is dirty during the 
> restore process
> -
>
> Key: ZOOKEEPER-4813
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4813
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.9.1
>Reporter: Yan Zhao
>Assignee: Yan Zhao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.9.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ZooKeeper restarts, it restores the data from the last valid snapshot 
> file and replays the txn log to apply the remaining data.
> But if the last log file is empty for some reason, the restore fails and 
> ZooKeeper cannot restart.
> The logs are as follows:
> {noformat}
> 14:12:16.023 [main] INFO  org.apache.zookeeper.server.persistence.SnapStream 
> - Invalid snapshot snapshot.188700025d87. len = 761554294, byte = 45
> 14:12:16.024 [main] INFO  org.apache.zookeeper.server.persistence.FileSnap - 
> Reading snapshot /pulsar/data/zookeeper/version-2/snapshot.188700025a05
> 14:12:17.350 [main] INFO  org.apache.zookeeper.server.DataTree - The digest 
> in the snapshot has digest version of 2, with zxid as 0x188700025b07, and 
> digest value as 510776662607117
> 14:12:17.492 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeer - 
> Unable to load database on disk
> java.io.EOFException: null
>   at java.io.DataInputStream.readInt(DataInputStream.java:386) ~[?:?]
>   at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96) 
> ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:67)
>  ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:725)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:743)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:711)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:792)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:361)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:267)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:312)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:288) 
> ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1149)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1135) 
> ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91)
>  ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
> 14:12:17.502 [main] INFO  
> org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider - Shutdown 
> executor service with timeout 1000
> 14:12:17.508 [main] INFO  org.eclipse.jetty.server.AbstractConnector - 
> Stopped ServerConnector@2484f433{HTTP/1.1, (http/1.1)}{0.0.0.0:8000}
> 14:12:17.510 [main] INFO  org.eclipse.jetty.server.handler.ContextHandler - 
> Stopped o.e.j.s.ServletContextHandler@59a67c3a{/,null,STOPPED}
> 14:12:1

[jira] [Updated] (ZOOKEEPER-4807) Add sid for the leader goodbye log

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4807:

Fix Version/s: 3.9.3
   (was: 3.9.2)

> Add sid for the leader goodbye log
> ---
>
> Key: ZOOKEEPER-4807
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4807
> Project: ZooKeeper
>  Issue Type: Wish
>  Components: server
>Affects Versions: 3.9.1
>Reporter: Yan Zhao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.9.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a follower disconnects from the leader, the leader prints the remote 
> address.
> But if ZooKeeper runs alongside Istio, the remote address is not right.
> 2024-02-05T03:23:54,967+ [LearnerHandler-/127.0.0.6:56085] WARN  
> org.apache.zookeeper.server.quorum.LearnerHandler - *** GOODBYE 
> /127.0.0.6:56085 
> It would be better to print the sid in the goodbye log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4801) Add memory size limitation policy for ZkDataBase#committedLog

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4801:

Fix Version/s: 3.9.3
   (was: 3.9.2)

> Add memory size limitation policy for ZkDataBase#committedLog
> -
>
> Key: ZOOKEEPER-4801
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4801
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.9.1
>Reporter: Yan Zhao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.9.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ZKDatabase limits the in-memory committed log by entry count, which is not 
> precise: some request payloads may be huge and consume a lot of heap memory.
> Supporting a payload-size limit would be better; a sketch follows.
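A minimal sketch of such a size-based cap; the {{Proposal}} stand-in and the 
{{MAX_COMMITTED_LOG_BYTES}} knob are illustrative assumptions, not the shape 
of the actual patch:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Hedged sketch: evict the oldest committed proposals once their payload
// bytes exceed a cap, instead of relying only on a fixed entry count.
public class SizeBoundedCommittedLog {
    static final long MAX_COMMITTED_LOG_BYTES = 64L * 1024 * 1024; // assumed knob

    // Stand-in for Leader.Proposal: only the payload size matters here.
    static final class Proposal {
        final byte[] data;
        Proposal(byte[] data) { this.data = data; }
    }

    private final Deque<Proposal> committedLog = new ArrayDeque<>();
    private long committedLogBytes = 0;

    public synchronized void add(Proposal p) {
        committedLog.addLast(p);
        committedLogBytes += p.data.length;
        // Drop from the head until the payload total fits under the cap,
        // always keeping at least the newest proposal.
        while (committedLogBytes > MAX_COMMITTED_LOG_BYTES && committedLog.size() > 1) {
            committedLogBytes -= committedLog.removeFirst().data.length;
        }
    }
}
{code}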



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4785) Txn loss due to race condition in Learner.syncWithLeader() during DIFF sync

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar resolved ZOOKEEPER-4785.
-
Fix Version/s: 3.8.4
   Resolution: Fixed

> Txn loss due to race condition in Learner.syncWithLeader() during DIFF sync
> ---
>
> Key: ZOOKEEPER-4785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4785
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.8.0, 3.7.1, 3.8.1, 3.7.2, 3.8.2, 3.9.1
>Reporter: Li Wang
>Assignee: Li Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.9.2, 3.10, 3.8.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We had a txn loss incident in production recently. After investigation, we 
> found it was caused by a race condition in the Learner.syncWithLeader() 
> method: the follower writes the current epoch and sends the ACK_LD before 
> successfully persisting all the txns from the DIFF sync.
> {code:java}
> case Leader.NEWLEADER: 
> ...
> self.setCurrentEpoch(newEpoch);
> writeToTxnLog = true;
> //Anything after this needs to go to the transaction log, not applied 
> directly in memory
> isPreZAB1_0 = false;
> // ZOOKEEPER-3911: make sure sync the uncommitted logs before commit 
> them (ACK NEWLEADER).
> sock.setSoTimeout(self.tickTime * self.syncLimit);
> self.setSyncMode(QuorumPeer.SyncMode.NONE);
> zk.startupWithoutServing();
> if (zk instanceof FollowerZooKeeperServer) {
> FollowerZooKeeperServer fzk = (FollowerZooKeeperServer) zk;
> for (PacketInFlight p : packetsNotCommitted) {
>   fzk.logRequest(p.hdr, p.rec, p.digest);
> }
> packetsNotCommitted.clear();
> }
> writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), 
> true);
> break;
> }
> {code}
> In this method, when the follower receives the NEWLEADER message, the current 
> epoch is updated before the uncommitted txns are written to disk, and writing 
> txns is done asynchronously by the SyncThread. If the follower crashes after 
> setting the current epoch and sending ACK_LD but before all transactions are 
> successfully written to disk, txn loss can happen.
> This is because leader election compares the epoch first and then the 
> transaction id. When the follower becomes leader because it has the highest 
> epoch, it will ask the other followers to truncate txns even though they have 
> been written to disk, causing data loss (a sketch of the safe ordering 
> follows the scenario below).
> The scenario is as follows:
> 1. A leader election happened
> 2. A follower synced with the leader via DIFF, received committed proposals 
> from the leader and kept them in memory
> 3. The follower received the NEWLEADER message
> 4. The follower updated the epoch to newEpoch
> 5. The follower was bounced before writing all the uncommitted txns to disk
> 6. The leader shut down and a new election was triggered
> 7. The follower became the new leader because it had the largest currentEpoch
> 8. The new leader asked the other followers to truncate their committed txns 
> and the transactions were lost
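A minimal sketch of the safe ordering, reusing the names from the snippet 
above ({{fzk}}, {{packetsNotCommitted}}, {{self}}, {{writePacket}}); the 
explicit {{ZKDatabase.commit()}} flush point is an assumption about the 
direction of the fix, not the actual patch:

{code:java}
case Leader.NEWLEADER:
    // Persist the uncommitted DIFF txns *before* touching currentEpoch.
    if (zk instanceof FollowerZooKeeperServer) {
        FollowerZooKeeperServer fzk = (FollowerZooKeeperServer) zk;
        for (PacketInFlight p : packetsNotCommitted) {
            fzk.logRequest(p.hdr, p.rec, p.digest); // append to the txn log
        }
        // Hypothetical flush point: block until the log is durable on disk.
        fzk.getZKDatabase().commit();
        packetsNotCommitted.clear();
    }
    // Only now is it safe to advance the epoch that elections compare first
    // and to ACK the NEWLEADER packet.
    self.setCurrentEpoch(newEpoch);
    writeToTxnLog = true;
    writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true);
    break;
{code}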



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4758) Upgrade snappy-java to 1.1.10.4 to fix CVE-2023-43642

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar resolved ZOOKEEPER-4758.
-
Resolution: Fixed

> Upgrade snappy-java to 1.1.10.4 to fix CVE-2023-43642
> -
>
> Key: ZOOKEEPER-4758
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4758
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.3
>Reporter: Dhoka Pramod
>Priority: Major
> Fix For: 3.8.4
>
>
> The SnappyInputStream was found to be vulnerable to Denial of Service (DoS) 
> attacks when decompressing data with a too large chunk size. Due to missing 
> upper bound check on chunk length, an unrecoverable fatal error can occur. 
> All versions of snappy-java including the latest released version 1.1.10.3 
> are vulnerable to this issue. A fix has been introduced in commit `9f8c3cf74` 
> which will be included in the 1.1.10.4 release. Users are advised to upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4762) Update netty jars to 4.1.99+ to fix CVE-2023-4586

2024-03-20 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar resolved ZOOKEEPER-4762.
-
Resolution: Fixed

> Update netty jars to 4.1.99+ to fix CVE-2023-4586
> -
>
> Key: ZOOKEEPER-4762
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4762
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.3
>Reporter: Dhoka Pramod
>Priority: Critical
> Fix For: 3.8.4
>
>
> [https://nvd.nist.gov/vuln/detail/CVE-2023-4586]
> A vulnerability was found in the Hot Rod client. This security issue occurs 
> as the Hot Rod client does not enable hostname validation when using TLS, 
> possibly resulting in a man-in-the-middle (MITM) attack.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4804) Use daemon threads for Netty client

2024-03-20 Thread Istvan Toth (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829215#comment-17829215
 ] 

Istvan Toth commented on ZOOKEEPER-4804:


I think this is important enough to be backported to all active branches.
FYI [~andor]

> Use daemon threads for Netty client
> ---
>
> Key: ZOOKEEPER-4804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4804
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.3
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.10.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When the Netty client is used, the Java process hangs on System.exit if there 
> is an open Zookeeper connection.
> This is caused by the non-daemon threads created by Netty.
> Exiting without closing the connection is not a good practice, but this hang 
> does not happen with the NIO client, and I think ZK should behave the same 
> regardless of the client implementation used.
> The Netty ThreadFactory implementation is configurable, so it shouldn't be too 
> hard to make sure that daemon threads are created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4816) A follower cannot join the cluster for 20 seconds

2024-03-15 Thread gendong1 (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gendong1 updated ZOOKEEPER-4816:

Priority: Critical  (was: Major)

> A follower cannot join the cluster for 20 seconds
> ---
>
> Key: ZOOKEEPER-4816
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4816
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.10.0
>Reporter: gendong1
>Priority: Critical
> Attachments: node1.log, node2.log, node3.log
>
>
> We encountered a strange scenario. When we set up a three-node ZooKeeper 
> cluster, the third node got stuck serializing the snapshot to the local 
> disk. However, the leader election proceeded normally, and after the 
> election the third node was elected leader. The other two nodes failed to 
> connect to it, so the first and second nodes restarted the leader election, 
> and the second node was elected leader. At that point the third node still 
> acted as leader, so there were two leaders in the cluster. The first node 
> could not join the cluster for 20 seconds, and during this period clients 
> could not connect to any node of the cluster.
> Runtime logs are attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4799) Refactor ACL check in addWatch command

2024-03-14 Thread Andor Molnar (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827235#comment-17827235
 ] 

Andor Molnar commented on ZOOKEEPER-4799:
-

This is the fix for 
[CVE-2024-23944|https://www.cve.org/CVERecord?id=CVE-2024-23944]

> Refactor ACL check in addWatch command
> --
>
> Key: ZOOKEEPER-4799
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4799
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Damien Diederen
>Assignee: Damien Diederen
>Priority: Major
>  Labels: CVE-2024-23944, security
> Fix For: 3.8.4, 3.9.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4799) Refactor ACL check in addWatch command

2024-03-14 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4799:

Component/s: server

> Refactor ACL check in addWatch command
> --
>
> Key: ZOOKEEPER-4799
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4799
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Damien Diederen
>Assignee: Damien Diederen
>Priority: Major
> Fix For: 3.8.4, 3.9.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4799) Refactor ACL check in addWatch command

2024-03-14 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar updated ZOOKEEPER-4799:

Labels: CVE-2024-23944 security  (was: )

> Refactor ACL check in addWatch command
> --
>
> Key: ZOOKEEPER-4799
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4799
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Damien Diederen
>Assignee: Damien Diederen
>Priority: Major
>  Labels: CVE-2024-23944, security
> Fix For: 3.8.4, 3.9.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4799) Refactor ACL check in addWatch command

2024-03-14 Thread Andor Molnar (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar resolved ZOOKEEPER-4799.
-
Fix Version/s: (was: 3.7.3)
   Resolution: Fixed

> Refactor ACL check in addWatch command
> --
>
> Key: ZOOKEEPER-4799
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4799
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Damien Diederen
>Assignee: Damien Diederen
>Priority: Major
> Fix For: 3.8.4, 3.9.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4816) A follower cannot join the cluster for 20 seconds

2024-03-14 Thread gendong1 (Jira)
gendong1 created ZOOKEEPER-4816:
---

 Summary: A follower cannot join the cluster for 20 seconds
 Key: ZOOKEEPER-4816
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4816
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.10.0
Reporter: gendong1
 Attachments: node1.log, node2.log, node3.log

We encountered a strange scenario. When we set up a three-node ZooKeeper 
cluster, the third node got stuck serializing the snapshot to the local disk. 
However, the leader election proceeded normally, and after the election the 
third node was elected leader. The other two nodes failed to connect to it, so 
the first and second nodes restarted the leader election, and the second node 
was elected leader. At that point the third node still acted as leader, so 
there were two leaders in the cluster. The first node could not join the 
cluster for 20 seconds, and during this period clients could not connect to 
any node of the cluster.

Runtime logs are attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-3975) Zookeeper crashes: Unable to load database on disk java.io.IOException: Unreasonable length

2024-03-14 Thread luoxin (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827015#comment-17827015
 ] 

luoxin commented on ZOOKEEPER-3975:
---

Increasing _jute.maxbuffer_ may resolve it, for example:
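
A hedged example: {{jute.maxbuffer}} is the documented Java system property 
behind this length check, with a default of roughly 1 MB; 16 MB here is only 
an illustrative value sized above the failing 11807835-byte record, not a 
recommendation.

{noformat}
# In conf/java.env, which zkServer.sh sources before starting the server:
export SERVER_JVMFLAGS="-Djute.maxbuffer=16777216"
{noformat}

Note that the property has to be raised on every server (and on clients that 
read the affected znodes), since each peer enforces the limit when 
deserializing.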

> Zookeeper crashes: Unable to load database on disk java.io.IOException: 
> Unreasonable length
> ---
>
> Key: ZOOKEEPER-3975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3975
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jute
>Affects Versions: 3.6.2
> Environment: Debian 10 x64
> openjdk version "11.0.8" 2020-07-14
> OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)
> OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Debian-1deb10u1, mixed mode, 
> sharing)
>Reporter: Diego Lucas Jiménez
>Priority: Critical
>
> After running for a while, the entire cluster (3 ZooKeeper servers) crashes 
> suddenly, with all of them logging:
>  
> {code:java}
> 2020-10-16 10:37:00,459 [myid:2] - WARN [NIOWorkerThread-4:NIOServerCnxn@373] 
> - Close of session 0x0 java.io.IOException: ZooKeeperServer not running at 
> org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:544) 
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:332) at 
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
>  at 
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)
> 2020-10-16 10:37:00,475 [myid:2] - ERROR 
> [QuorumPeer[myid=2](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1139] - 
> Unable to load database on disk
> java.io.IOException: Unreasonable length = 5089607
> at 
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:166)
> at 
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:127)
> at 
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:159)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:768)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:352)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:258)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:303)
> at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:285)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1093)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:1249)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:868)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:941)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1428){code}
> Apparently the "corrupted" file appears on all the servers, so there is no 
> fix such as "removing version-2 on the faulty server and letting it 
> replicate from a healthy one" :(.
> The entire cluster goes down, and we have had downtime every single day 
> since we upgraded from 3.4.9. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ZOOKEEPER-4776) CVE-2023-36478 | org.eclipse.jetty_jetty-io

2024-03-12 Thread Trayan Simeonov (Jira)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1782#comment-1782
 ] 

Trayan Simeonov commented on ZOOKEEPER-4776:


Is this going to be prioritized any time soon? It is an open vulnerability 
which has now been open for 4 months.

It would be nice to have an update on this one. Thanks! 

> CVE-2023-36478 | org.eclipse.jetty_jetty-io
> ---
>
> Key: ZOOKEEPER-4776
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4776
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.9.1
>Reporter: Aayush Suri
>Priority: Major
>
> {*}Vulnerability summary{*}: Eclipse Jetty provides a web server and servlet 
> container. In versions 11.0.0 through 11.0.15, 10.0.0 through 10.0.15, and 
> 9.0.0 through 9.4.52, an integer overflow in `MetaDataBuilder.checkSize` 
> allows for HTTP/2 HPACK header values to exceed their size limit. 
> `MetaDataBuilder.java` determines if a header name or value exceeds the size 
> limit, and throws an exception if the limit is exceeded. However, when length 
> is very large and huffman is true, the multiplication by 4 in line 295 will 
> overflow, and length will become negative. `(_size+length)` will now be 
> negative, and the check on line 296 will not be triggered. Furthermore, 
> `MetaDataBuilder.checkSize` allows for user-entered HPACK header value sizes 
> to be negative, potentially leading to a very large buffer allocation later 
> on when the user-entered size is multiplied by 2. This means that if a user 
> provides a negative length value (or, more precisely, a length value which, 
> when multiplied by the 4/3 fudge factor, is negative), and this length value 
> is a very large positive number when multiplied by 2, then the user can cause 
> a very large buffer to be allocated on the server. Users of HTTP/2 can be 
> impacted by a remote denial of service attack. The issue has been fixed in 
> versions 11.0.16, 10.0.16, and 9.4.53. There are no known workarounds.
> Looking for a version that fixes this vulnerability. 
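A self-contained illustration of the arithmetic described above (hedged: this 
mimics the overflow pattern only, it is not Jetty's actual {{MetaDataBuilder}} 
code):

{code:java}
public class HpackOverflowDemo {
    public static void main(String[] args) {
        int length = 0x30000000;      // large attacker-supplied HPACK length
        int estimate = length * 4;    // 3221225472 overflows a 32-bit int
        System.out.println(estimate); // -1073741824: negative, so a
                                      // (_size + length > limit) check never fires
    }
}
{code}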



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ZOOKEEPER-4815) custom the data format of /zookeeper/config

2024-03-12 Thread yangoofy (Jira)
yangoofy created ZOOKEEPER-4815:
---

 Summary: custom the data format of /zookeeper/config
 Key: ZOOKEEPER-4815
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4815
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: yangoofy


When using QuorumMaj, I would like support for custom /zookeeper/config node 
data formats, such as:
server.x=xx.xx.xx.xx:2888:3888:observer;0.0.0.0:2181;Group1
server.y=xx.xx.xx.xx:2888:3888:observer;0.0.0.0:2181;Group2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4804) Use daemon threads for Netty client

2024-03-08 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen updated ZOOKEEPER-4804:
-
Fix Version/s: 3.10.0

> Use daemon threads for Netty client
> ---
>
> Key: ZOOKEEPER-4804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4804
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.8.3
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.10.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When the Netty client is used, the Java process hangs on System.exit if there 
> is an open Zookeeper connection.
> This is caused by the non-daemon threads created by Netty.
> Exiting without closing the connection is not a good practice, but this hang 
> does not happen with the NIO client, and I think ZK should behave the same 
> regardless of the client implementation used.
> The Netty ThreadFactory implementation is configurable, so it shouldn't be too 
> hard to make sure that daemon threads are created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (ZOOKEEPER-4276) Serving only with secureClientPort fails

2024-03-08 Thread Zili Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zili Chen resolved ZOOKEEPER-4276.
--
Fix Version/s: 3.10.0
 Assignee: Abhilash Kishore
   Resolution: Fixed

master via bc1fc6d36435d3fad7b31642b647dc7680d7866e.

> Serving only with secureClientPort fails
> 
>
> Key: ZOOKEEPER-4276
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4276
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.7.0, 3.5.8, 3.6.2, 3.8.0
>Reporter: Kei Kori
>Assignee: Abhilash Kishore
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.10.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> clientPort in zoo.cfg is forcefully derived from the client address by 
> QuorumPeerConfig#setupClientPort even when secureClientPort is set and 
> matches the client address' port.
> Because of this behavior, a rolling update that replaces clientPort with 
> secureClientPort on the same port number, following [Upgrading existing 
> non-TLS cluster with no 
> downtime|https://zookeeper.apache.org/doc/r3.7.0/zookeeperAdmin.html#Upgrading+existing+nonTLS+cluster],
>  conflicts and produces the errors below.
> {code}
> 2021-03-29 23:21:58,638 - INFO  [main:NettyServerCnxnFactory@590] - binding 
> to port /0.0.0.0:2281
> 2021-03-29 23:21:58,748 - INFO  [main:NettyServerCnxnFactory@595] - bound to 
> port 2281
> 2021-03-29 23:21:58,749 - INFO  [main:NettyServerCnxnFactory@590] - binding 
> to port 0.0.0.0/0.0.0.0:2281
> 2021-03-29 23:21:58,753 - ERROR [main:QuorumPeerMain@101] - Unexpected 
> exception, exiting abnormally
> java.net.BindException: Address already in use
> {code}
> QuorumPeerConfig#setupClientPort should derive the port only when both 
> clientPort and secureClientPort are empty, and allow serving the ZooKeeper 
> server only on the secure client port (an example configuration follows).
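A hedged example of the TLS-only zoo.cfg this change unblocks (key names 
follow the ZooKeeper administrator's guide; paths and passwords are 
placeholders):

{code}
# No plaintext clientPort line at all:
secureClientPort=2281
serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
ssl.keyStore.location=/path/to/keystore.jks
ssl.keyStore.password=changeit
ssl.trustStore.location=/path/to/truststore.jks
ssl.trustStore.password=changeit
{code}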



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ZOOKEEPER-4814) Protocol desynchronization after Connect for (some) old clients

2024-03-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4814:
--
Labels: pull-request-available  (was: )

> Protocol desynchronization after Connect for (some) old clients
> ---
>
> Key: ZOOKEEPER-4814
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4814
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.9.0
>Reporter: Damien Diederen
>Assignee: Damien Diederen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Some old clients experience a protocol desynchronization after receiving a 
> {{ConnectResponse}} from the server.
> This started happening with ZOOKEEPER-4492, "Merge readOnly field into 
> ConnectRequest and Response," which writes overlong responses to clients 
> which do not know about the {{readOnly}} flag.
> (One example of such a client is ZooKeeper's own C client library prior to 
> version 3.5!)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

