[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737449#comment-17737449
 ] 

Paolo Patierno commented on ZOOKEEPER-4708:
-------------------------------------------

I went with git bisect between the release-3.6.3 tag (good) and the 
release-3.6.4 tag (bad). It ended up highlighting the following commit as the 
reason 3.6.4 is not working:
{code:java}
357e88c1438780e28d36bf54784937e18547e136 is the first bad commit
commit 357e88c1438780e28d36bf54784937e18547e136
Author: Enrico Olivelli <eolive...@apache.org>
Date:   Tue Jan 25 12:48:34 2022 +0000

    ZOOKEEPER-3988: org.apache.zookeeper.server.NettyServerCnxn.receiveMessage 
throws NullPointerException
    
    Modifications:
    - prevent the NPE, the code that throws NPE is only to record some metrics 
for non TLS requests
    
    Related to:
    - apache/pulsar#11070
    - https://github.com/pravega/zookeeper-operator/issues/393
    
    Author: Enrico Olivelli <eolive...@apache.org>
    
    Reviewers: Nicolò Boschi <boschi1...@gmail.com>, Andor Molnar 
<an...@apache.org>, Mate Szalay-Beko <sy...@apache.org>
    
    Closes #1798 from eolivelli/fix/ZOOKEEPER-3988-npe
    
    (cherry picked from commit 957f8fc0afbeca638f13f6fb739e49a921da2b9d)
    Signed-off-by: Mate Szalay-Beko <sy...@apache.org> 
.../zookeeper/server/NettyServerCnxnFactory.java   | 18 ++++++++++-----
 .../zookeeper/server/NettyServerCnxnTest.java      | 26 +++++++++++++++-------
 .../apache/zookeeper/server/TxnLogCountTest.java   |  2 +-
 3 files changed, 31 insertions(+), 15 deletions(-) {code}
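For anyone who wants to repeat the hunt, the bisect session sketched below is roughly what was run (my reconstruction, not a transcript; the good/bad tags are the ones named above, and the reproducer step is whatever deployment exhibits the stuck quorum):

```shell
# Hedged sketch of the git bisect run described above, executed inside a
# clone of the Apache ZooKeeper repository.
git clone https://github.com/apache/zookeeper.git
cd zookeeper

git bisect start
git bisect bad release-3.6.4    # known-broken release
git bisect good release-3.6.3   # last known-good release

# At each step git checks out a midpoint commit: build it, run the
# reproducer (e.g. the Strimzi minikube deployment), then report:
#   git bisect good   # quorum forms
#   git bisect bad    # quorum stuck
# git bisect finishes by printing "<sha> is the first bad commit",
# here 357e88c1438780e28d36bf54784937e18547e136.
git bisect reset
```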
Taking a look at the NettyServerCnxnFactory class, the fix just adds a check 
around zkServer to avoid raising an NPE when zkServer.serverStats() is called 
while zkServer is null. There is nothing wrong with the check itself, but 
before the fix the NPE forced a container restart, after which the ZooKeeper 
nodes were able to form the quorum. Avoiding the NPE seems to leave ZooKeeper 
in a situation where it is not able to recover and form the quorum; it's stuck.
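The shape of the change (heavily simplified; the class and names below are my illustration, not the actual ZooKeeper code) is just a guard like this:

```java
// Minimal sketch of the ZOOKEEPER-3988 fix pattern, NOT the real
// NettyServerCnxnFactory code: guard the metrics call so that a
// not-yet-started server does not cause an NPE.
public class ServerStatsGuardSketch {

    // stands in for the ZooKeeperServer field that can legitimately be
    // null while the server is still starting up
    static Object zkServer = null;

    static void recordNonTlsMetrics() {
        // the fix: check for null before touching zkServer.serverStats()
        if (zkServer != null) {
            // zkServer.serverStats()... (metrics recording in the real code)
        }
        // before the fix this path dereferenced zkServer unconditionally
        // and threw NullPointerException, which crashed (and restarted)
        // the container
    }

    public static void main(String[] args) {
        recordNonTlsMetrics(); // no NPE even though zkServer is null
        System.out.println("no NPE with zkServer == null");
    }
}
```

Which is exactly why the behavior change is a side effect: the crash-and-restart loop is gone, but so is the accidental recovery it provided.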

At this point my question is: is it normal for zkServer to be null, or is this 
exposing a more subtle bug? The NPE wasn't happening with 3.6.3.
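For context on the <unresolved> address in the issue title: java.net.InetSocketAddress never considers a resolved and an unresolved address equal, even for the same hostname and port, which is consistent with the address mismatch seen in Leader#getDesignatedLeader. A small standalone demo ("localhost" is just a hostname that resolves everywhere):

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {
    public static void main(String[] args) {
        // same hostname and port, created two different ways
        InetSocketAddress resolved = new InetSocketAddress("localhost", 2888);
        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("localhost", 2888);

        System.out.println(resolved.isUnresolved());   // false
        System.out.println(unresolved.isUnresolved()); // true

        // per InetSocketAddress.equals(), a resolved address is never equal
        // to an unresolved one, so any address comparison between the two
        // fails even though host and port match
        System.out.println(resolved.equals(unresolved)); // false
    }
}
```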

 

> ZooKeeper 3.6.4 quorum failing due to <unresolved> address
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-4708
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4708
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.6.4, 3.8.1
>            Reporter: Paolo Patierno
>            Priority: Major
>
> We work on the Strimzi project, which is about deploying an Apache Kafka 
> cluster on Kubernetes together with a ZooKeeper ensemble.
> Until ZooKeeper version 3.6.3 (brought by Kafka 3.4.0), there were no issues 
> when running on minikube for development purposes.
> With ZooKeeper version 3.6.4 (brought by Kafka 3.4.1), we started to have 
> issues during quorum formation and leader election.
> The first issue was that ZooKeeper pods were not able to bind the quorum port 
> 3888 to the Cluster IP: during DNS resolution they got the loopback address 
> instead.
> Below is a typical log at ZooKeeper startup where you can see the binding 
> at 127.0.0.1:3888 instead of something like 172.17.0.4:3888 (i.e. a valid, 
> non-loopback IP address).
>  
> {code:java}
> INFO 3 is accepting connections now, my election bind port: 
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888 
> (org.apache.zookeeper.server.quorum.QuorumCnxManager) 
> [ListenerHandler-my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888] {code}
> This specific issue had two solutions: setting quorumListenOnAllIPs=true in 
> the ZooKeeper configuration, or binding to the 0.0.0.0 address.
>  
> Anyway, it is not clear why this wasn't needed until 3.6.3 but is needed to 
> get 3.6.4 working. What has changed from this perspective?
> That said, while binding to 0.0.0.0 seems to work fine, using 
> quorumListenOnAllIPs=true doesn't.
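> For reference, a zoo.cfg sketch of the two workarounds (my illustration, not 
> part of the original deployment; only one of the two is needed):
> {code:java}
> # workaround 1: listen on all local IPs for the quorum/election ports
> quorumListenOnAllIPs=true
> 
> # workaround 2: bind this server's own entry to 0.0.0.0 instead of its
> # hostname (shown here for server 3, i.e. my-cluster-zookeeper-2)
> server.3=0.0.0.0:2888:3888 {code}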
> Assuming a ZooKeeper ensemble with 3 nodes, in the log of the current 
> ZooKeeper leader (ID=3) we see the following.
> (Lines starting with ** are additional logs we added to 
> {{org.apache.zookeeper.server.quorum.Leader#getDesignatedLeader}} in order to 
> get more information.)
> {code:java}
> 2023-06-19 12:32:51,990 INFO Have quorum of supporters, sids: [[1, 3],[1, 
> 3]]; starting up and setting last processed zxid: 0x100000000 
> (org.apache.zookeeper.server.quorum.Leader) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO ** 
> newQVAcksetPair.getQuorumVerifier().getVotingMembers().get(self.getId()).addr 
> = 
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888 
> (org.apache.zookeeper.server.quorum.Leader) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO ** self.getQuorumAddress() = 
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/<unresolved>:2888
>  (org.apache.zookeeper.server.quorum.Leader) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 INFO ** qs.addr 
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888,
>  qs.electionAddr 
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:3888,
>  qs.clientAddr/127.0.0.1:12181 
> (org.apache.zookeeper.server.quorum.QuorumPeer) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 DEBUG zookeeper 
> (org.apache.zookeeper.common.PathTrie) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,993 WARN Restarting Leader Election 
> (org.apache.zookeeper.server.quorum.QuorumPeer) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)] {code}
> So the leader is ZooKeeper with ID=3 and it was ACKed by the ZooKeeper node 
> ID=1.
> As you can see, we are in the {{Leader#startZkServer}} method and, because 
> reconfiguration is enabled, the designatedLeader is processed. The problem 
> is that {{Leader#getDesignatedLeader}} is not returning “self” as leader 
> but another node (ID=1), because of a difference in the quorum address.
> From the above log, it’s not an actual difference in terms of addresses: 
> {{self.getQuorumAddress()}} returns an <unresolved> address (even though it’s 
> still the same hostname, referring to the ZooKeeper-2 instance). This 
> difference causes allowedToCommit=false; ZooKeeper-2 is still reported as 
> leader but is not able to commit, so it rejects any requests and the 
> ZooKeeper ensemble gets stuck.
> {code:java}
> 2023-06-19 12:32:51,996 WARN Suggested leader: 1 
> (org.apache.zookeeper.server.quorum.QuorumPeer) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,996 WARN This leader is not the designated leader, it 
> will be initialized with allowedToCommit = false 
> (org.apache.zookeeper.server.quorum.Leader) 
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)] {code}
> The overall issue could be related to DNS problems, with DNS records not yet 
> registered during pod initialization (where ZooKeeper is running on 
> Kubernetes). But we don’t understand why it’s not able to recover somehow.
> What we can't find a reason for is why ZooKeeper 3.6.3 didn't need any 
> binding-specific configuration and worked just fine, while the new 3.6.4 
> needs it. Beyond that, why do we see the issue described above even after the 
> binding works fine?
> We also tried the latest ZooKeeper 3.8.1 and faced the same issue.
> It's worth mentioning that the issue wasn't reproducible in environments like 
> AKS or OpenShift, but if it's somehow related to slow DNS, it could happen 
> anywhere (not only on minikube as in our use case).
> FYI, here is the discussion we had in the upstream Strimzi community: 
> [https://github.com/strimzi/strimzi-kafka-operator/issues/8675]
> There is also a thread on the ZooKeeper users mailing list: 
> https://lists.apache.org/thread/vsff39wpsrxb431hdw9db4np12vytdpm



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
