---> "Can it happen that we end up with 2 leaders or 0 leader for some period of
time (for example, during network delays/partitions)?"
look at the code:
https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java#L340
Github user anmolnar commented on the issue:
https://github.com/apache/zookeeper/pull/703
Committed to master branch. Thanks @lvfangmin !
Please create another pull request for branch-3.5.
---
Github user asfgit closed the pull request at:
https://github.com/apache/zookeeper/pull/703
---
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711956#comment-16711956 ]
Norbert Kalmar commented on ZOOKEEPER-3207:
---
Thanks [~lvfangmin] for the catch!
I'll create
Github user anmolnar commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239615050
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/FileChangeWatcher.java
---
@@ -0,0 +1,253 @@
+/**
+ * Licensed to the
Github user anmolnar commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239614732
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/X509Util.java ---
@@ -446,4 +458,119 @@ private void
> it seems like the
> inconsistency may be caused by the partition of the Zookeeper cluster
> itself
Yes - there are many ways in which you can end up with 2 leaders. However, if
properly tuned and configured, it will be for a few seconds at most. During a
GC pause no work is being done anyway.
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/2144/
###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 43.48 KB...]
[junit] Running
> Old service leader will detect network partition max 15 seconds after it
> happened.
If the old service leader is in a very long GC it will not detect the
partition. In the face of VM pauses, etc. it's not possible to avoid 2 leaders
for a short period of time.
-JZ
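To put numbers on that reply, here is a back-of-the-envelope sketch in plain Java. The 15-second session timeout and the GC-pause length are assumptions taken from this thread for illustration, not ZooKeeper constants:

```java
public class LeaderOverlap {

    /**
     * Worst-case interval (ms) during which an old and a new leader coexist.
     * The ensemble can elect a replacement once the old leader's session
     * expires (at t = sessionTimeoutMs after its last heartbeat); a leader
     * frozen in a GC pause only discovers this when the pause ends
     * (at t = gcPauseMs).
     */
    public static long worstCaseOverlapMs(long sessionTimeoutMs, long gcPauseMs) {
        return Math.max(0, gcPauseMs - sessionTimeoutMs);
    }

    public static void main(String[] args) {
        // 15 s session timeout, 40 s GC pause -> 25 s with two "leaders".
        System.out.println(worstCaseOverlapMs(15_000, 40_000));
        // A pause shorter than the session timeout never expires the session.
        System.out.println(worstCaseOverlapMs(15_000, 5_000));
    }
}
```

This is why "properly tuned" matters: the overlap window shrinks toward zero as pause lengths stay below the session timeout, but it can never be proven to be zero.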
Github user ivmaykov commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239593442
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/FileChangeWatcher.java
---
@@ -0,0 +1,253 @@
+/**
+ * Licensed to the
Github user ivmaykov commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239593417
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/X509Util.java ---
@@ -446,4 +458,119 @@ private void
Github user ivmaykov commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239593458
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/X509Util.java ---
@@ -446,4 +458,119 @@ private void
Github user ivmaykov commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239593424
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/X509Util.java ---
@@ -446,4 +458,119 @@ private void
Hello,
Ensuring reliability requires using consensus directly in your service, or
changing the service to use a distributed log/journal (e.g. BookKeeper).
However, the following idea is simple and in many situations good enough.
If you configure the session timeout to 15 seconds, then the ZooKeeper client will
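That idea can be sketched as a lease the leader grants itself (plain Java; the class and method names are hypothetical, not a ZooKeeper or Curator API): the leader only acts while less than one session timeout has passed since the server last confirmed its heartbeat, so a disconnected leader stops acting roughly when a replacement could first be elected. Clock skew and long pauses are exactly why this is only "good enough" and not a proof:

```java
public class LeaderLease {

    private final long sessionTimeoutMs;
    private long lastAckMs;   // last time the ZK server confirmed our heartbeat

    public LeaderLease(long sessionTimeoutMs, long nowMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.lastAckMs = nowMs;
    }

    /** Call this from the connection event thread on every confirmed heartbeat. */
    public synchronized void onHeartbeatAck(long nowMs) {
        lastAckMs = nowMs;
    }

    /** Safe to act as leader only while the session cannot have expired yet. */
    public synchronized boolean mayAct(long nowMs) {
        return nowMs - lastAckMs < sessionTimeoutMs;
    }
}
```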
Github user lvfangmin commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/628#discussion_r239578960
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/ObserverMaster.java
---
@@ -0,0 +1,514 @@
+/**
+ * Licensed to the
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712162#comment-16712162 ]
Michael Han commented on ZOOKEEPER-3188:
Appreciate detailed reply, agree on replies on 1 and
See https://builds.apache.org/job/ZooKeeper_branch34_jdk8/1618/
###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 42.78 KB...]
[junit] Running
Github user ivmaykov commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/680#discussion_r239641391
--- Diff:
zookeeper-server/src/main/java/org/apache/zookeeper/common/X509Util.java ---
@@ -446,4 +458,119 @@ private void
Github user ivmaykov commented on the issue:
https://github.com/apache/zookeeper/pull/680
@anmolnar removed finalizer, use explicit close()
---
It is not possible to achieve the level of consistency you're after in an
eventually consistent system such as ZooKeeper. There will always be an edge
case where two ZooKeeper clients will believe they are leaders (though for a
short period of time). In terms of how it affects Apache Curator,
Tweaking the timeout is tempting, as your solution might work most of the time
yet fail in certain cases (which others have pointed out). If the goal is
absolute correctness then we should avoid timeouts, which do not guarantee
correctness as they only make the problem harder to manifest. Fencing is the
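Fencing can be sketched as follows (plain Java; `FencedStore` and the token source are hypothetical names): each newly elected leader is handed a strictly increasing token (for instance, derived from its election znode), and the protected resource rejects any request carrying an older token. A stale leader waking from a pause is then shut out even while it still believes it leads:

```java
/** A protected resource that rejects writes carrying a stale fencing token. */
public class FencedStore {

    private long highestToken = -1;   // newest token ever seen
    private String value;

    /** Accept the write only if the token is at least as new as any seen. */
    public synchronized boolean write(long fencingToken, String newValue) {
        if (fencingToken < highestToken) {
            return false;             // stale leader: fenced off
        }
        highestToken = fencingToken;
        value = newValue;
        return true;
    }

    public synchronized String read() {
        return value;
    }
}
```

Note the correctness check moves from the leader (which may be paused or partitioned) to the resource itself, which is what makes this robust where timeouts alone are not.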
We are planning to run ZooKeeper nodes embedded with the client nodes,
i.e., each client also runs a ZK node. So a network partition will
disconnect a ZK node and not only the client.
My concern is about the following statement from the ZK documentation:
"Timeliness: The clients view of the system
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711371#comment-16711371 ]
Hudson commented on ZOOKEEPER-1818:
---
FAILURE: Integrated in Jenkins build
See https://builds.apache.org/job/ZooKeeper-trunk/296/
###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 177.58 KB...]
[junit] Running
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711396#comment-16711396 ]
Hudson commented on ZOOKEEPER-1818:
---
FAILURE: Integrated in Jenkins build ZooKeeper-trunk #296 (See
GitHub user stanlyDoge opened a pull request:
https://github.com/apache/zookeeper/pull/732
typo
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/stanlyDoge/zookeeper patch-1
Alternatively you can review and apply these changes
ZK is able to guarantee that there is only one leader for the purposes of
updating ZK data. That is because all commits have to originate with the
current quorum leader and then be acknowledged by a quorum of the current
cluster. If the leader can't get enough acks, then it has de facto lost
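The majority rule behind that guarantee is easy to state (plain Java, purely illustrative): a commit needs acks from a strict majority of the ensemble, and two disjoint strict majorities cannot exist in one ensemble, so two leaders can never both commit:

```java
public class Quorum {

    /** True if the given number of acks is a strict majority of the ensemble. */
    public static boolean isQuorum(int acks, int ensembleSize) {
        return acks > ensembleSize / 2;
    }

    /** Two disjoint strict majorities would need more voters than exist. */
    public static boolean disjointMajoritiesPossible(int ensembleSize) {
        int majority = ensembleSize / 2 + 1;
        return 2 * majority <= ensembleSize;   // false for every positive size
    }
}
```

This is why the single-leader guarantee holds for ZK's own data, while clients outside the ensemble (as discussed above) can still briefly disagree about who leads.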
GitHub user TisonKun opened a pull request:
https://github.com/apache/zookeeper/pull/733
hotfix: Fix typo in zookeeperInternals.md
I think this quick fix is far from needing a JIRA. If needed, I will create
one.
You can merge this pull request into a Git repository by running:
Github user lvfangmin commented on the issue:
https://github.com/apache/zookeeper/pull/703
Thanks @anmolnar, I'll send out the PR for 3.5.
---
Github user lvfangmin commented on the issue:
https://github.com/apache/zookeeper/pull/628
Thanks @enixon for doing the update and rebase, went through this again,
looks legit to me. I also compared with internal version and made sure this has
included all the improvements and bug
Thanks Jordan,
Yes, I will try Curator.
Also, beyond the problem described in the Tech Note, it seems like the
inconsistency may be caused by the partition of the Zookeeper cluster
itself. E.g., if a "leader" client is connected to the partitioned ZK node,
it may be notified not at the same time