[
https://issues.apache.org/jira/browse/ZOOKEEPER-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mate Szalay-Beko updated ZOOKEEPER-3698:
----------------------------------------
Description:
While testing the RC for 3.6.0, we found that a ZooKeeper cluster with a large
number of ensemble members (e.g. 23) cannot start properly. We see a lot of
warnings in the log:
{code:java}
2020-01-15 20:02:13,431 [myid:13] - WARN
[ListenerHandler-phunt-MBP13.local/192.168.1.91:4193:QuorumCnxManager@691]
- None of the addresses (/192.168.1.91:4190) are reachable for sid 10
java.net.NoRouteToHostException: No valid address among [/192.168.1.91:4190]
{code}
and also:
{code:java}
2020-01-17 11:02:26,177 [myid:4] - WARN
[Thread-2531:QuorumCnxManager$SendWorker@1269] - destination address /127.0.0.1
not reachable anymore, shutting down the SendWorker for sid 6
{code}
The exceptions happen when the new MultiAddress feature tries to filter the
unreachable hosts out of the address list. This involves calling the
InetAddress.isReachable method with a default timeout of 500ms, which goes down
to a native call in Java and basically tries to ping the host (an ICMP echo
request). Naturally, localhost should always be reachable. For some reason,
this call fails (it times out or is simply refused) on Mac if we have many
ensemble members. I tested with 9 members and the cluster started properly.
With 11, 13 and 15 members it took more and more time to get the cluster to
start, and the "NoRouteToHostException" started to appear in the logs. After
around 1 minute the 15-member cluster started, but this is obviously not
acceptable. (I also tried with JDK 11 and found the same behaviour.)
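To make the reachability check described above concrete, here is a minimal sketch that probes the loopback address repeatedly with the same 500ms timeout. The class and method names (ReachabilityProbe, countFailures) and the attempt count are made up for illustration; this is not the ZooKeeper code, and a single process issuing a few dozen probes may well not reproduce the failure.

```java
import java.io.IOException;
import java.net.InetAddress;

// Hypothetical demo class; not part of ZooKeeper.
class ReachabilityProbe {
    // 500 ms mirrors the default timeout mentioned in the description.
    static final int TIMEOUT_MS = 500;

    // Probes the address `attempts` times in a row and counts the failures.
    // A probe fails if isReachable returns false or throws an IOException.
    static int countFailures(InetAddress address, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                if (!address.isReachable(TIMEOUT_MS)) {
                    failures++;
                }
            } catch (IOException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        int attempts = 50; // arbitrary value chosen for the demo
        int failures = countFailures(loopback, attempts);
        System.out.println(failures + " of " + attempts + " probes failed");
    }
}
```

Running several copies of this at the same time on one machine is probably closer to what a large local ensemble does during startup.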
On Linux, I haven't been able to reproduce the problem. I tried with 5, 9, 15
and 23 ensemble members and the quorum always seems to start properly in a few
seconds. (I used OpenJDK 1.8.232 on Ubuntu 18.04.)
*Update*:
On Mac, the ICMP rate limit is set to 250 by default. You can turn this
off using the following command: sudo sysctl -w net.inet.icmp.icmplim=0
(see [https://krypted.com/mac-os-x/disable-icmp-rate-limiting-os-x/])
Running the above command before starting the 23-member cluster locally
seems to solve the problem for me. (Can someone verify?) The question is
whether this workaround is enough or not.
As far as I can tell, the current code will generate {{2*A*(M-1)}} ICMP calls
in each ZooKeeper server during startup, where {{'M'}} is the number of ensemble
members and {{'A'}} is the number of election addresses provided for each
member. This is not that high if each ZooKeeper server is started on a
different machine, but if we start a lot of ZooKeeper servers on a single
machine, then it can quickly go beyond the predefined limit of 250 on Mac.
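As a rough worked example of the formula above, the sketch below evaluates {{2*A*(M-1)}} for a few ensemble sizes. The class and method names are hypothetical, and the single-host total (M servers each making that many calls) is my own assumption about how the per-server figure adds up when everything runs on one machine; the calls are also spread over the startup period, so this is only an order-of-magnitude comparison with the limit.

```java
// Hypothetical helper; not part of ZooKeeper.
class IcmpCallEstimate {
    // ICMP calls made by one server during startup: 2 * A * (M - 1).
    static int perServer(int electionAddresses, int members) {
        return 2 * electionAddresses * (members - 1);
    }

    // Assumption: with all M servers on one host, that host handles
    // M times the per-server number of calls.
    static int singleHostTotal(int electionAddresses, int members) {
        return members * perServer(electionAddresses, members);
    }

    public static void main(String[] args) {
        int a = 1; // one election address per member
        for (int m : new int[] {9, 15, 23}) {
            System.out.printf("M=%d: %d per server, %d on a single host%n",
                    m, perServer(a, m), singleHostTotal(a, m));
        }
    }
}
```

With A = 1 and M = 23 this gives 44 calls per server, and 1012 calls in total if all 23 servers share one host, which is well above the limit of 250 quoted above.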
> NoRouteToHostException when starting large ZooKeeper cluster on localhost
> -------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3698
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3698
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Mate Szalay-Beko
> Assignee: Mate Szalay-Beko
> Priority: Major
> Fix For: 3.6.0
--
This message was sent by Atlassian Jira
(v8.3.4#803005)