[
https://issues.apache.org/jira/browse/FLINK-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095457#comment-17095457
]
Chesnay Schepler edited comment on FLINK-17443 at 4/29/20, 1:36 PM:
--------------------------------------------------------------------
I'm not actively working on it; I can assign it to you. You basically have to
do the thing you proposed for Flink, just against the
[flink-shaded|https://github.com/apache/flink-shaded/] repo.
I would then close this ticket as a duplicate.
> Flink's ZK in HA mode setup is unable to start up if any of the zk hosts are
> unreachable
> ----------------------------------------------------------------------------------------
>
> Key: FLINK-17443
> URL: https://issues.apache.org/jira/browse/FLINK-17443
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Reporter: Piyush Narang
> Priority: Major
> Labels: pull-request-available
>
> We occasionally hit an issue where our Flink cluster will not start up if any
> of the ZooKeeper hosts passed in the "high-availability.zookeeper.quorum"
> config setting are unreachable. This seems to stem from us using an older
> ZooKeeper dependency version (3.4.10). The problem has been fixed in the
> 3.4.x branch as part of https://issues.apache.org/jira/browse/ZOOKEEPER-1576
> ([https://github.com/apache/zookeeper/commit/be1409cc9a14ac2e28693e0e02a0ba6d9713565e]).
> A sample of the error we see is shown below.
>
> {code:java}
> java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known
> java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known
>     at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>     at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>     at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>     at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
>     at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>     at org.apache.flink.shaded.curator.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
>     at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:150)
>     at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
>     at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
>     at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.reset(ConnectionState.java:262)
>     at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.start(ConnectionState.java:109)
>     at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:191)
>     at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:259)
>     at org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:131)
>     at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:123)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:292)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:257)
> {code}
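The trace shows the failure coming from an InetAddress.getAllByName call inside the StaticHostProvider constructor, so a single unresolvable quorum host aborts client startup. As a rough illustration of the tolerant behaviour the description expects, here is a minimal sketch that skips hosts that fail to resolve and only gives up when none resolve. It is not the actual ZOOKEEPER-1576 patch: the class TolerantHostResolver and the method resolveAvailable are hypothetical names made up for this example, and "zk-missing.invalid" is a placeholder host assumed not to resolve.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of ZooKeeper, Curator or Flink.
final class TolerantHostResolver {

    // Resolves every host it can and skips the ones that fail DNS lookup.
    // Throws only when no host in the list could be resolved.
    static List<InetAddress> resolveAvailable(List<String> hosts) {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            try {
                resolved.addAll(Arrays.asList(InetAddress.getAllByName(host)));
            } catch (UnknownHostException e) {
                // An eager implementation would let this exception propagate
                // and abort startup; here the host is logged and skipped.
                System.err.println("Skipping unresolvable host: " + host);
            }
        }
        if (resolved.isEmpty()) {
            throw new IllegalArgumentException("No resolvable hosts in " + hosts);
        }
        return resolved;
    }

    public static void main(String[] args) {
        // "127.0.0.1" is an IP literal and needs no DNS lookup;
        // "zk-missing.invalid" is a placeholder assumed to be unresolvable.
        List<String> quorum = Arrays.asList("127.0.0.1", "zk-missing.invalid");
        System.out.println(resolveAvailable(quorum));
    }
}
```

With this approach the client can still connect through the remaining quorum members. The trade-off is that a misspelled host name is silently dropped unless the skipped hosts are logged.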