symat commented on a change in pull request #1228: ZOOKEEPER-3698: fixing
NoRouteToHostException when starting large cluster locally
URL: https://github.com/apache/zookeeper/pull/1228#discussion_r368626386
##########
File path: zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md
##########
@@ -1542,6 +1555,22 @@ the variable does.
ZAB protocol and the Fast Leader Election protocol. Default
value is **false**.
+* *multiAddress.reachabilityCheckEnabled* :
+ (Java system property: **zookeeper.multiAddress.reachabilityCheckEnabled**)
+ **New in 3.6.0:**
+ Since ZooKeeper 3.6.0 you can also [specify multiple addresses](#id_multi_address)
+ for each ZooKeeper server instance (this can increase availability when multiple physical
+ network interfaces can be used parallel in the cluster). ZooKeeper will perform ICMP ECHO requests
+ or try to establish a TCP connection on port 7 (Echo) of the destination host in order to find
+ the reachable addresses. This happens only if you provide multiple addresses in the configuration.
+ The reachable check can fail if you hit some ICMP rate-limitation, (e.g. on MacOS) when you try to
+ start a large (e.g. 11+) ensemble members cluster on a single machine for testing.
+
+ Default value is **true**. By setting this parameter to 'false' you can disable the reachability checks.
+ Please note, disabling the reachability check will cause the cluster not to be able to reconfigure
+ itself properly during network problems, so the disabling is advised only during testing.
Review comment:
After thinking about this a bit more:
Another improvement could be to do something like what the Learner does right
now (if I remember correctly, it basically starts connecting to all known
Quorum ports in parallel, then keeps the connection that is established
first). However, this might be trickier in the case of the Leader Election
protocol...
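To make the "connect in parallel, keep the first one" idea concrete, here is a
rough sketch. It is not the Learner's actual code; the class and method names
(`FirstConnectedSocket`, `connectToAny`) are made up for illustration, and it
assumes plain blocking sockets and a short-lived thread pool.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of ZooKeeper.
public class FirstConnectedSocket {

    /** Tries all addresses in parallel and returns the first socket that connects. */
    public static Socket connectToAny(List<InetSocketAddress> addresses, int timeoutMs)
            throws IOException, InterruptedException {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        // Create the sockets up front so the losing attempts can be closed later.
        List<Socket> sockets = new ArrayList<>();
        List<Callable<Socket>> attempts = new ArrayList<>();
        for (InetSocketAddress address : addresses) {
            Socket socket = new Socket();
            sockets.add(socket);
            attempts.add(() -> {
                socket.connect(address, timeoutMs);
                return socket;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(addresses.size());
        Socket winner = null;
        try {
            // invokeAny returns the result of one task that completed successfully.
            winner = pool.invokeAny(attempts);
            return winner;
        } catch (ExecutionException e) {
            throw new IOException("no address accepted a connection", e);
        } finally {
            pool.shutdownNow();
            // Close every socket except the winner; this also aborts pending connects.
            for (Socket socket : sockets) {
                if (socket != winner) {
                    try {
                        socket.close();
                    } catch (IOException ignored) {
                        // nothing useful to do for a losing attempt
                    }
                }
            }
        }
    }
}
```

Creating the sockets before submitting the tasks is deliberate: it lets the
caller close the losing attempts, so no extra connection stays open if two
addresses answer at nearly the same time.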
Another way would be to simply try to establish a connection to the election
addresses one by one, moving on to the next one if the call fails. It would be
slower, but we wouldn't rely on `InetAddress.isReachable()`.
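As a rough sketch of the one-by-one approach (the names `SequentialConnect` and
`connectToFirstAvailable` are hypothetical and this is not code from the PR),
something like the following could work, assuming a per-address connect
timeout:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper.
public class SequentialConnect {

    /** Tries the addresses in order and returns the first socket that connects. */
    public static Socket connectToFirstAvailable(List<InetSocketAddress> addresses, int timeoutMs)
            throws IOException {
        IOException lastFailure = null;
        for (InetSocketAddress address : addresses) {
            Socket socket = new Socket();
            try {
                socket.connect(address, timeoutMs);
                return socket;
            } catch (IOException e) {
                // Remember the failure, release the socket and try the next address.
                lastFailure = e;
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do here
                }
            }
        }
        throw new IOException("could not connect to any of " + addresses, lastFailure);
    }
}
```

In the worst case this waits `timeoutMs` for every unreachable address before
reaching a working one, which is the slowness mentioned above.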
However, in both cases it can be tricky to detect when the current election
address becomes unavailable. This is another edge case where we use
`InetAddress.isReachable()` (this is why we call
`SendWorker.asyncValidateIfSocketIsStillReachable()`).
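For reference, an asynchronous reachability probe built on
`InetAddress.isReachable()` could look roughly like the sketch below. This is
only an illustration of the general pattern, not what
`SendWorker.asyncValidateIfSocketIsStillReachable()` actually does; the class
name `AsyncReachability` and the method `checkAsync` are made up.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of ZooKeeper.
public class AsyncReachability {

    /** Runs the reachability probe on a background thread; I/O errors count as unreachable. */
    public static CompletableFuture<Boolean> checkAsync(InetAddress address, int timeoutMs) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return address.isReachable(timeoutMs);
            } catch (IOException e) {
                return false;
            }
        });
    }
}
```

A caller could then react without blocking the sending thread, e.g.
`checkAsync(address, 500).thenAccept(ok -> { if (!ok) { /* drop the connection and try another address */ } })`.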