[ https://issues.apache.org/jira/browse/ZOOKEEPER-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748086#comment-17748086 ]
Luke Chen edited comment on ZOOKEEPER-4708 at 7/27/23 12:04 PM:
----------------------------------------------------------------
PR: [https://github.com/apache/zookeeper/pull/2040] raised to fix part of this issue: force re-resolving the hostname into an IP address when binding (i.e. ZOOKEEPER-4728)

was (Author: showuon):
PR: [https://github.com/apache/zookeeper/pull/2040] raised to fix part of this issue: force re-resolving the hostname into an IP address when binding

> ZooKeeper 3.6.4 quorum failing due to <unresolved> address
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-4708
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4708
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.6.4, 3.8.1
>            Reporter: Paolo Patierno
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We work on the Strimzi project, which deploys an Apache Kafka cluster on Kubernetes together with a ZooKeeper ensemble.
> Up to ZooKeeper 3.6.3 (brought in by Kafka 3.4.0), there were no issues when running on minikube for development purposes.
> With ZooKeeper 3.6.4 (brought in by Kafka 3.4.1), we started to have issues during quorum formation and leader election.
> The first issue was that ZooKeeper pods were not able to bind the quorum port 3888 to the cluster IP: during DNS resolution they got the loopback address instead.
> Below is a log from ZooKeeper startup where you can see the binding at 127.0.0.1:3888 instead of something like 172.17.0.4:3888 (i.e. a valid, non-loopback IP address).
>
> {code:java}
> INFO 3 is accepting connections now, my election bind port: my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager) [ListenerHandler-my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888]
> {code}
> This specific issue had two workarounds: setting quorumListenOnAllIPs=true in the ZooKeeper configuration, or binding to the 0.0.0.0 address.
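> For reference, the first workaround corresponds to a configuration along these lines (a minimal zoo.cfg sketch; the server entries below are placeholders, not our actual ones):
> {code}
> # Listen for quorum/leader-election connections on all local interfaces,
> # instead of only on the address this server's own hostname resolves to.
> quorumListenOnAllIPs=true
>
> # Placeholder ensemble definition (server.N=host:quorumPort:electionPort);
> # the server whose myid is N binds the 2888/3888 ports of its own entry.
> server.1=zk-0.zk-headless.default.svc:2888:3888
> server.2=zk-1.zk-headless.default.svc:2888:3888
> server.3=zk-2.zk-headless.default.svc:2888:3888
> {code}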
> In any case, it is not clear why this wasn't needed up to 3.6.3 but is needed to get 3.6.4 working. What changed from this perspective?
> That said, while binding to 0.0.0.0 seems to work fine, using quorumListenOnAllIPs=true doesn't.
> Assuming a ZooKeeper ensemble with 3 nodes, the log of the current ZooKeeper leader (ID=3) shows the following.
> (Lines starting with ** are additional logs we added to {{org.apache.zookeeper.server.quorum.Leader#getDesignatedLeader}} in order to get more information.)
> {code:java}
> 2023-06-19 12:32:51,990 INFO Have quorum of supporters, sids: [[1, 3],[1, 3]]; starting up and setting last processed zxid: 0x100000000 (org.apache.zookeeper.server.quorum.Leader) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO ** newQVAcksetPair.getQuorumVerifier().getVotingMembers().get(self.getId()).addr = my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888 (org.apache.zookeeper.server.quorum.Leader) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO ** self.getQuorumAddress() = my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/<unresolved>:2888 (org.apache.zookeeper.server.quorum.Leader) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 INFO ** qs.addr my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888, qs.electionAddr my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:3888, qs.clientAddr/127.0.0.1:12181 (org.apache.zookeeper.server.quorum.QuorumPeer) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 DEBUG zookeeper (org.apache.zookeeper.common.PathTrie) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,993 WARN Restarting Leader Election (org.apache.zookeeper.server.quorum.QuorumPeer) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> {code}
> So the leader is the ZooKeeper node with ID=3, and it was ACKed by the ZooKeeper node with ID=1.
> As you can see, we are in the {{Leader#startZkServer}} method and, because reconfiguration is enabled, the designatedLeader is computed. The problem is that {{Leader#getDesignatedLeader}} is not returning "self" as leader but another node (ID=1), because of a difference in the quorum address.
> From the above log it is not an actual difference in terms of addresses: {{self.getQuorumAddress()}} is returning an <unresolved> address (even if it's still the same hostname, the one of the ZooKeeper-2 instance). This difference causes allowedToCommit=false; ZooKeeper-2 is still reported as leader, but it is not able to commit, so it rejects any requests and the ZooKeeper ensemble gets stuck.
> {code:java}
> 2023-06-19 12:32:51,996 WARN Suggested leader: 1 (org.apache.zookeeper.server.quorum.QuorumPeer) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,996 WARN This leader is not the designated leader, it will be initialized with allowedToCommit = false (org.apache.zookeeper.server.quorum.Leader) [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> {code}
> The overall issue could be related to DNS problems, with DNS records not yet registered during pod initialization (where ZooKeeper is running on Kubernetes). But we don't understand why it is not able to recover somehow.
> What we also don't understand is why ZooKeeper 3.6.3 didn't need any binding-specific configuration and worked just fine, while the new 3.6.4 needs it. Beyond that, why do we see the issue described above even after the binding works fine?
> We also tried the latest ZooKeeper 3.8.1 and faced the same issue.
> It's worth mentioning that the issue wasn't reproducible in environments like AKS or OpenShift, but if it is somehow related to slow DNS, it could happen anywhere (not only on minikube as in our use case).
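> To illustrate the address mismatch (a minimal standalone sketch, not actual ZooKeeper code): a resolved and an unresolved {{InetSocketAddress}} never compare equal, even for the same hostname and port, which is consistent with the <unresolved> quorum address seen above:
> {code:java}
> import java.net.InetSocketAddress;
>
> public class UnresolvedAddressDemo {
>     public static void main(String[] args) {
>         // "localhost" is used only so the sketch runs anywhere; in our case the
>         // hostname is the pod's headless-service name, e.g.
>         // my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc.
>         String host = "localhost";
>
>         // Resolved at construction time (performs a DNS lookup).
>         InetSocketAddress resolved = new InetSocketAddress(host, 2888);
>
>         // Explicitly unresolved: no DNS lookup is attempted.
>         InetSocketAddress unresolved = InetSocketAddress.createUnresolved(host, 2888);
>
>         // Prints e.g. "localhost/127.0.0.1:2888 vs localhost/<unresolved>:2888"
>         // (the exact toString format depends on the JDK version).
>         System.out.println(resolved + " vs " + unresolved);
>
>         // Prints "false": equals() compares the resolved InetAddress when one side
>         // has it, so a resolved and an unresolved instance are never equal, even
>         // though hostname and port are identical.
>         System.out.println("equals: " + resolved.equals(unresolved));
>     }
> }
> {code}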
> FYI, this is the discussion we had in the upstream Strimzi community: [https://github.com/strimzi/strimzi-kafka-operator/issues/8675]
> There is also a thread on the ZooKeeper users mailing list: https://lists.apache.org/thread/vsff39wpsrxb431hdw9db4np12vytdpm

--
This message was sent by Atlassian Jira
(v8.20.10#820010)