I ran into what looks like a deadlock in blockUntilConnected and wanted to give a high-level description in case someone can help me debug it. I can try to put together a reproducible example, but for reasons that will become apparent, that's not straightforward.
I am using Curator within a custom Kafka Connect source connector. As a result, I have one process per node on 11 nodes, with up to 12 tasks (threads) per node, each with its own Curator client. Every node also runs ZooKeeper, so I initialize the Curator clients by pointing them at localhost:2181.
On 9 nodes everything works perfectly, but on the other 2 all tasks seem to hang in blockUntilConnected (specifically here: https://github.com/apache/curator/blob/ae309a29643afc6df511d1d9a162526ce474598b/curator-framework/src/main/java/org/apache/curator/framework/state/ConnectionStateManager.java#L224). I found this by noticing there was no activity in my Kafka Connect logs and then grabbing a stack trace with jstack on the offending nodes. I also wrote a small test program that only initializes a client and calls blockUntilConnected (nothing else), ran it at the same time, and it also hangs there forever.
Meanwhile, I can use zookeeper-shell against localhost just fine, and if I initialize a Curator client pointing to one of the other nodes (not localhost), it connects fine. Could this be a deadlock caused by initializing Curator clients concurrently across multiple threads?
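A check that might help narrow this down, independent of Curator: since only the localhost connection string hangs, it may be worth confirming that every address "localhost" resolves to on the bad nodes actually accepts TCP connections on port 2181. The sketch below uses only the JDK. The class name LocalhostProbe and the 2-second timeout are hypothetical choices for illustration, and the idea that name resolution differs between nodes is an assumption, not something I have confirmed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic helper; not part of Curator or ZooKeeper. */
public class LocalhostProbe {

    /** Returns true if a TCP connection to address:port opens within timeoutMs. */
    public static boolean canConnect(InetAddress address, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(address, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, or timed out: all reported as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults mirror the connection string from the post (localhost:2181).
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;

        // A host name can resolve to several addresses (for example 127.0.0.1 and ::1),
        // so probe each one separately instead of only the first.
        for (InetAddress address : InetAddress.getAllByName(host)) {
            boolean ok = canConnect(address, port, 2000);
            System.out.println(address.getHostAddress() + ":" + port + " -> "
                    + (ok ? "reachable" : "NOT reachable"));
        }
    }
}
```

Running this on a good node and on a bad node and comparing the output would show whether the two differ at the TCP level. If every address is reachable on the bad nodes as well, the problem is more likely inside the client than in name resolution or the listener.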