[ https://issues.apache.org/jira/browse/ZOOKEEPER-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110780#comment-17110780 ]
Keli Wang edited comment on ZOOKEEPER-3829 at 5/19/20, 2:53 AM:
----------------------------------------------------------------
{code:java}
if (curQV.getVersion() == 0 && curQV.getVersion() == lastSeenQV.getVersion()) {
    // This was added in ZOOKEEPER-1783. The initial config has version 0 (not explicitly
    // specified by the user; the lack of version in a config file is interpreted as version=0).
    // As soon as a config is established we would like to increase its version so that it
    // takes precedence over other initial configs that were not established (such as a config
    // of a server trying to join the ensemble, which may be a partial view of the system, not the full config).
    // We chose to set the new version to the one of the NEWLEADER message. However, before we can do that
    // there must be agreement on the new version, so we can only change the version when sending/receiving UPTODATE,
    // not when sending/receiving NEWLEADER. In other words, we can't change curQV here since it's the committed quorum verifier,
    // and there's still no agreement on the new version that we'd like to use. Instead, we use
    // lastSeenQuorumVerifier which is being sent with NEWLEADER message
    // so it's a good way to let followers know about the new version. (The original reason for sending
    // lastSeenQuorumVerifier with NEWLEADER is so that the leader completes any potentially uncommitted reconfigs
    // that it finds before starting to propose operations. Here we're reusing the same code path for
    // reaching consensus on the new version number.)
    // It is important that this is done before the leader executes waitForEpochAck,
    // so before LearnerHandlers return from their waitForEpochAck
    // hence before they construct the NEWLEADER message containing
    // the last-seen-quorumverifier of the leader, which we change below
    try {
        QuorumVerifier newQV = self.configFromString(curQV.toString());
        newQV.setVersion(zk.getZxid());
        self.setLastSeenQuorumVerifier(newQV, true);
    } catch (Exception e) {
        throw new IOException(e);
    }
}
{code}
[~symat] In the code above, could the leader always overwrite
lastSeenQuorumVerifier with its latest quorumVerifier when dynamic reconfig is
disabled? If lastSeenQuorumVerifier is the same as quorumVerifier, then
allowedToCommit should always be true.
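
To make the question concrete, here is a minimal toy model of the idea. `ToyVerifier` and `ToyLeader` are hypothetical stand-ins, not ZooKeeper classes, and the member lists are made-up illustrative values. The rule that commits are allowed exactly when the last-seen and committed configs agree is an assumption based on the question above, not a copy of the actual `Leader` logic.

```java
class ReconfigDisabledSketch {

    /** Hypothetical stand-in for a quorum verifier: only members and version matter here. */
    static final class ToyVerifier {
        final String members;
        final long version;

        ToyVerifier(String members, long version) {
            this.members = members;
            this.version = version;
        }

        boolean sameAs(ToyVerifier other) {
            return version == other.version && members.equals(other.members);
        }
    }

    /** Hypothetical leader that tracks a committed config and a last-seen config. */
    static final class ToyLeader {
        private final ToyVerifier committed;
        private ToyVerifier lastSeen;
        private final boolean reconfigEnabled;

        ToyLeader(ToyVerifier committed, ToyVerifier lastSeen, boolean reconfigEnabled) {
            this.committed = committed;
            this.lastSeen = lastSeen;
            this.reconfigEnabled = reconfigEnabled;
        }

        /** Proposed behaviour: with dynamic reconfig disabled, discard any differing last-seen config. */
        void syncLastSeenIfReconfigDisabled() {
            if (!reconfigEnabled) {
                lastSeen = committed;
            }
        }

        /** Assumed rule: commits are allowed when both views of the config agree. */
        boolean allowedToCommit() {
            return lastSeen.sameAs(committed);
        }
    }

    public static void main(String[] args) {
        ToyVerifier committed = new ToyVerifier("A,B,C,D", 0L); // illustrative values only
        ToyVerifier stale = new ToyVerifier("A,B,C", 0L);       // illustrative values only
        ToyLeader leader = new ToyLeader(committed, stale, false);

        System.out.println("allowedToCommit before sync: " + leader.allowedToCommit());
        leader.syncLastSeenIfReconfigDisabled();
        System.out.println("allowedToCommit after sync: " + leader.allowedToCommit());
    }
}
```

In this model the leader cannot commit while the two configs differ and can commit once the last-seen config has been overwritten. Whether the real leader can safely do the same is exactly the open question.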
> Zookeeper refuses request after node expansion
> ----------------------------------------------
>
> Key: ZOOKEEPER-3829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3829
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.6
> Reporter: benwang li
> Priority: Major
> Attachments: d.log, screenshot-1.png
>
>
> It's easy to reproduce this bug.
> {code:java}
>
> Step 1. Deploy 3 nodes A, B, C with configuration A,B,C.
> Step 2. Deploy node `D` with configuration `A,B,C,D`; the cluster state is OK
> now.
> Step 3. Restart nodes A, B, C with configuration A,B,C,D. The leader will then
> be D and the cluster hangs: it still accepts the `mntr` command, but other
> commands such as `ls /` are blocked.
> Step 4. Restart node D; the cluster state returns to normal.
>
> {code}
>
> We have looked into the code of version 3.5.6, and we think the issue may be
> in `workerPool`.
> When `CommitProcessor` shuts down, it also shuts down `workerPool`, but the
> `workerPool` reference still exists. The pool will never work again, yet the
> cluster still thinks it is OK.
>
> I think the bug may still exist in the master branch.
> We have tested a fix on our machines that resets `workerPool` to null. If
> that's OK, please assign this issue to me, and I'll create a PR.
>
>
>
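
The `workerPool` behaviour described in the quoted report can be sketched with a plain `java.util.concurrent` executor. This is only an illustration under assumptions: `ToyProcessor` is a hypothetical class, not ZooKeeper's `CommitProcessor`, and it assumes the pool is created lazily only when the reference is null, as the proposed fix implies.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

class WorkerPoolSketch {

    /** Hypothetical processor that owns a lazily created worker pool. */
    static final class ToyProcessor {
        private ExecutorService workerPool;

        /** Creates the pool only if no reference exists yet (assumed behaviour). */
        synchronized void start() {
            if (workerPool == null) {
                workerPool = Executors.newFixedThreadPool(2);
            }
        }

        /** Returns true if the task was accepted, false if there is no usable pool. */
        synchronized boolean submit(Runnable task) {
            if (workerPool == null) {
                return false;
            }
            try {
                workerPool.execute(task);
                return true;
            } catch (RejectedExecutionException e) {
                // A pool that has been shut down rejects new tasks.
                return false;
            }
        }

        /** Shuts the pool down; optionally drops the reference so start() can rebuild it. */
        synchronized void shutdown(boolean resetReference) {
            if (workerPool != null) {
                workerPool.shutdown();
                if (resetReference) {
                    workerPool = null;
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyProcessor stale = new ToyProcessor();
        stale.start();
        stale.shutdown(false); // reference kept, pool is dead
        stale.start();         // no-op because the reference is non-null
        System.out.println("stale pool accepts work: " + stale.submit(() -> { }));

        ToyProcessor reset = new ToyProcessor();
        reset.start();
        reset.shutdown(true);  // reference cleared
        reset.start();         // a fresh pool is created
        System.out.println("reset pool accepts work: " + reset.submit(() -> { }));
        reset.shutdown(true);
    }
}
```

In this sketch the restarted processor whose reference was kept rejects every task, while the one whose reference was reset to null gets a fresh pool on restart.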