[
https://issues.apache.org/jira/browse/FLINK-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425664#comment-15425664
]
ASF GitHub Bot commented on FLINK-4418:
---------------------------------------
GitHub user rehevkor5 opened a pull request:
https://github.com/apache/flink/pull/2383
[FLINK-4418] [client] Improve resilience when InetAddress.getLocalHost() throws UnknownHostException
- If InetAddress.getLocalHost() throws UnknownHostException when
attempting to connect with LOCAL_HOST strategy, the code will move on
to try the other strategies instead of immediately failing.
- Also made minor code style improvements for trying the different
strategies.
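
The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual Flink code: the method names `findAddressUsingStrategy` and `findConnectingAddress` are taken from the issue, but their signatures here are simplified assumptions, and the `AddressStrategy` interface and `FallbackAddressSketch` class are hypothetical stand-ins introduced only for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

public class FallbackAddressSketch {

    /** Hypothetical stand-in for one address detection strategy (not a Flink type). */
    public interface AddressStrategy {
        InetAddress tryFind() throws UnknownHostException;
    }

    /**
     * Runs a single strategy. An unresolvable hostname is reported as
     * "no address found" (null) instead of propagating the exception.
     */
    public static InetAddress findAddressUsingStrategy(AddressStrategy strategy) {
        try {
            return strategy.tryFind();
        } catch (UnknownHostException e) {
            // Swallow the failure so the caller can try the next strategy.
            return null;
        }
    }

    /** Tries the strategies in order and returns the first non-null result, or null. */
    public static InetAddress findConnectingAddress(List<AddressStrategy> strategies) {
        for (AddressStrategy strategy : strategies) {
            InetAddress address = findAddressUsingStrategy(strategy);
            if (address != null) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulates InetAddress.getLocalHost() failing on an unresolvable hostname.
        AddressStrategy localHost = () -> {
            throw new UnknownHostException("ip-10-2-64-47: unknown error");
        };
        // Hypothetical fallback that yields a literal address without a DNS lookup.
        AddressStrategy fallback =
                () -> InetAddress.getByAddress(new byte[] {10, 2, 64, 47});

        InetAddress result = findConnectingAddress(Arrays.asList(localHost, fallback));
        System.out.println(result == null ? "no address found" : result.getHostAddress());
    }
}
```

Returning null for a failed strategy, rather than rethrowing, is what lets the loop continue to the remaining strategies. In real code the swallowed exception would likely be worth logging.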
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rehevkor5/flink FLINK-4418
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2383.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2383
----
commit fa0f239e1e8fac721b805263ccfaa0e0cfaa6138
Author: Shannon Carey
Date: 2016-08-18T00:35:49Z
[FLINK-4418] [client] Improve resilience when InetAddress.getLocalHost()
throws UnknownHostException
- If InetAddress.getLocalHost() throws UnknownHostException when
attempting to connect with LOCAL_HOST strategy, the code will move on
to try the other strategies instead of immediately failing.
- Also made minor code style improvements for trying the different
strategies.
----
> ClusterClient/ConnectionUtils#findConnectingAddress fails immediately if
> InetAddress.getLocalHost throws exception
> ------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-4418
> URL: https://issues.apache.org/jira/browse/FLINK-4418
> Project: Flink
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.1.0
> Reporter: Shannon Carey
>
> When attempting to connect to a cluster with a ClusterClient, if the
> machine's hostname cannot be resolved to an IP address, an exception is
> thrown and the connection attempt fails.
> This happens if, for example, the hostname is not mapped to a local IP
> address in /etc/hosts.
> The exception is below. I suggest that findAddressUsingStrategy() should
> catch java.net.UnknownHostException thrown by InetAddress.getLocalHost() and
> return null, allowing alternative strategies to be attempted by
> findConnectingAddress(). I will open a PR to this effect. Ideally this could
> be included in both 1.2 and 1.1.2.
> In the stack trace below, "ip-10-2-64-47" is the internal host name of an AWS
> EC2 instance.
> {code}
> 21:11:35 org.apache.flink.client.program.ProgramInvocationException: Failed to retrieve the JobManager gateway.
> 21:11:35 at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:430)
> 21:11:35 at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:90)
> 21:11:35 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389)
> 21:11:35 at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
> 21:11:35 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:334)
> 21:11:35 at com.expedia.www.flink.job.scheduler.FlinkJobSubmitter.get(FlinkJobSubmitter.java:81)
> 21:11:35 at com.expedia.www.flink.job.scheduler.streaming.StreamingJobManager.run(StreamingJobManager.java:105)
> 21:11:35 at com.expedia.www.flink.job.scheduler.JobScheduler.runStreamingApp(JobScheduler.java:69)
> 21:11:35 at com.expedia.www.flink.job.scheduler.JobScheduler.main(JobScheduler.java:34)
> 21:11:35 Caused by: java.lang.RuntimeException: Failed to resolve JobManager address at /10.2.89.80:43126
> 21:11:35 at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:189)
> 21:11:35 at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:649)
> 21:11:35 at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:428)
> 21:11:35 ... 8 more
> 21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: ip-10-2-64-47: unknown error
> 21:11:35 at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
> 21:11:35 at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:232)
> 21:11:35 at org.apache.flink.runtime.net.ConnectionUtils.findConnectingAddress(ConnectionUtils.java:123)
> 21:11:35 at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:187)
> 21:11:35 ... 10 more
> 21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: unknown error
> 21:11:35 at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
> 21:11:35 at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
> 21:11:35 at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
> 21:11:35 at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
> 21:11:35 ... 13 more
> {code}