[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-10-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652352#comment-16652352
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

zentol closed pull request #5449: [FLINK-8623] 
ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuri…
URL: https://github.com/apache/flink/pull/5449
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java b/flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java
index d7c4baae4d1..5ec3f443527 100644
--- a/flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java
+++ b/flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java
@@ -18,6 +18,7 @@
 
 package org.apache.flink.runtime.net;
 
+import org.junit.Assume;
 import org.junit.Test;
 import org.junit.runner.RunWith;
 import org.mockito.Mockito;
@@ -26,11 +27,13 @@
 import org.powermock.modules.junit4.PowerMockRunner;
 
 import java.io.IOException;
-import java.net.Inet4Address;
 import java.net.InetAddress;
 import java.net.InetSocketAddress;
 import java.net.ServerSocket;
+import java.net.Socket;
+import java.net.SocketAddress;
 import java.net.UnknownHostException;
+import java.net.Inet4Address;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertNotNull;
@@ -53,13 +56,28 @@ public void testReturnLocalHostAddressUsingHeuristics() throws Exception {
    final long start = System.nanoTime();
    InetAddress add = ConnectionUtils.findConnectingAddress(unreachable, 2000, 400);
 
+   // we should have found a heuristic address
+   assertNotNull(add);
+
+   // make sure that the InetAddress.getLocalHost is not a loopback address
+   Assume.assumeFalse(InetAddress.getLocalHost().isLoopbackAddress());
+
+   // make sure the connection address is not a loopback address
+   Assume.assumeFalse(add.isLoopbackAddress());
+
+   Socket socket = new Socket();
+
+   SocketAddress socketAddress = new InetSocketAddress(add, 0);
+
+   socket.bind(socketAddress);
+
+   // check whether can bind to this address
+   assertTrue(socket.isBound());
+
    // check that it did not take forever (max 30 seconds)
    // this check can unfortunately not be too tight, or it will be flaky on some CI infrastructure
    assertTrue(System.nanoTime() - start < 30_000_000_000L);
 
-   // we should have found a heuristic address
-   assertNotNull(add);
-
    // make sure that we returned the InetAddress.getLocalHost as a heuristic
    assertEquals(InetAddress.getLocalHost(), add);
    }


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-10-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652351#comment-16652351
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

zentol commented on issue #5449: [FLINK-8623] 
ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuri…
URL: https://github.com/apache/flink/pull/5449#issuecomment-430376913
 
 
   Subsumed in FLINK-4052.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-29 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419078#comment-16419078
 ] 

Till Rohrmann commented on FLINK-8623:
--

Unblocking 1.5.0 from this issue since it seems to have existed for longer.

> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.5.0, 1.4.3
>
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932





[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409363#comment-16409363
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
I copied the Travis log here, and I think this is what we want. Line 63 
verifies that the address is not a loopback address. 

```
// make sure that the InetAddress.getLocalHost is not a loopback address
Assume.assumeFalse(InetAddress.getLocalHost().isLoopbackAddress());
```

```
Tests in error: 
  ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:63 » AssumptionViolated

Tests run: 2553, Failures: 0, Errors: 1, Skipped: 8

17:38:48.959 [INFO] 
17:38:48.959 [INFO] Reactor Summary:
17:38:48.959 [INFO] 
17:38:48.959 [INFO] flink-core . SUCCESS [ 55.118 s]
17:38:48.959 [INFO] flink-java . SUCCESS [ 26.140 s]
17:38:48.959 [INFO] flink-runtime .. FAILURE [11:01 min]
17:38:48.959 [INFO] flink-optimizer  SKIPPED
17:38:48.959 [INFO] flink-clients .. SKIPPED
17:38:48.959 [INFO] flink-streaming-java ... SKIPPED
17:38:48.962 [INFO] flink-scala  SKIPPED
17:38:48.962 [INFO] flink-test-utils ... SKIPPED
```




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408069#comment-16408069
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Hi @tillrohrmann, I pushed a new version of the code based on your suggestions. 
Could you please take a look when you are available? Thanks.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406344#comment-16406344
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Thanks, @tillrohrmann. I agree with you. This results in a smaller 
refactoring that does not affect the system itself. I will try it soon.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406320#comment-16406320
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/5449
  
@zhangminglei, I think Stephan's idea was to not change any parts of 
`ConnectionUtils`. Instead we should harden 
`ConnectionUtilsTest#testReturnLocalHostAddressUsingHeuristics` by adding an 
`Assume.assumeFalse(InetAddress.getLocalHost().isLoopbackAddress())` and checking 
that this is also false for the returned address. Moreover, we could check 
whether we can actually bind to this address.
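
The hardening described above could look roughly like the following dependency-free sketch. It is only an illustration under assumptions: the class `BindCheckSketch` and its methods `preconditionsHold` and `canBind` are hypothetical names, not part of Flink, and in the actual JUnit test the precondition would be expressed with `Assume.assumeFalse(...)` so that the test is skipped rather than failed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Flink: mirrors the checks proposed for the test.
final class BindCheckSketch {

    // True if neither address is a loopback address. In a JUnit test this
    // condition would be passed to Assume.assumeFalse(...) to skip the test.
    static boolean preconditionsHold(InetAddress localHost, InetAddress found) {
        return !localHost.isLoopbackAddress() && !found.isLoopbackAddress();
    }

    // True if a socket can be bound to the address on an ephemeral port (port 0).
    static boolean canBind(InetAddress address) {
        try (Socket socket = new Socket()) {
            socket.bind(new InetSocketAddress(address, 0));
            return socket.isBound();
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        InetAddress localHost = InetAddress.getLocalHost();
        // Stand-in for the address returned by ConnectionUtils.findConnectingAddress(...).
        InetAddress found = localHost;
        if (!preconditionsHold(localHost, found)) {
            System.out.println("skipped: an address is a loopback address");
            return;
        }
        System.out.println("can bind: " + canBind(found));
    }
}
```

Binding to port 0 lets the operating system pick a free port, so the check does not depend on a particular port being available.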




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396383#comment-16396383
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Hi @NicoK, I think ```InetAddress.getAllByName("localhost")``` won't work, 
since we still pass a specific host name to it, and it will return a 
loopback address.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395208#comment-16395208
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Thanks @StephanEwen and @NicoK for the responses, which helped me understand 
this issue in more detail.

Regarding the first point, returning something we can bind to: yes, as long 
as it is not the loopback address, I think that is okay. What do you think 
about adding a check like the one below? For now I have not added a retry 
limit; this is just an example of how to work around the problem, and it 
might cause an endless loop.

```
// our attempts timed out. use the heuristic fallback
LOG.warn("Could not connect to {}. Selecting a local address using heuristics.", targetAddress);
InetAddress heuristic = findAddressUsingStrategy(AddressDetectionState.HEURISTIC, targetAddress, true);
if (heuristic != null) {
    while (heuristic.toString().contains("127.0.0.1") || heuristic.toString().contains("127.0.1.1")) {
        heuristic = findAddressUsingStrategy(AddressDetectionState.HEURISTIC, targetAddress, true);
    }
    return heuristic;
}
else {
    LOG.warn("Could not find any IPv4 address that is not loopback or link-local. Using localhost address.");
    InetAddress address = InetAddress.getLocalHost();
    while (address.toString().contains("127.0.0.1") || address.toString().contains("127.0.1.1")) {
        address = InetAddress.getLocalHost();
    }
    return address;
}
```

@StephanEwen Regarding the second point, I think we can use the following 
code to find something to bind to for the test. @NicoK, could you also take a 
look at these? Thanks.

```
Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
while (nifs.hasMoreElements()) {
    NetworkInterface nif = nifs.nextElement();
    Enumeration<InetAddress> addresses = nif.getInetAddresses();
    while (addresses.hasMoreElements()) {
        InetAddress addr = addresses.nextElement();
        // skip the two loopback addresses commonly found in /etc/hosts
        if (!(addr.getHostAddress().contains("127.0.0.1") || addr.getHostAddress().contains("127.0.1.1"))) {
            return addr; // this is the address we bind to
        }
    }
}
```
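
A possible variant of the enumeration above is sketched below. Instead of comparing against two literal strings, it uses `InetAddress#isLoopbackAddress()`, which covers the whole `127.0.0.0/8` range as well as the IPv6 loopback address. The class name `NonLoopbackLookupSketch` and the method `firstNonLoopbackAddress` are hypothetical and only serve this illustration; the sketch makes no claim about which interface would be the right one to bind to.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical helper, not part of Flink.
final class NonLoopbackLookupSketch {

    // Returns the first address on any interface that is not a loopback
    // address, or null if there is none or the interfaces cannot be listed.
    static InetAddress firstNonLoopbackAddress() {
        try {
            Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
            if (nifs == null) {
                return null;
            }
            for (NetworkInterface nif : Collections.list(nifs)) {
                for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                    if (!addr.isLoopbackAddress()) {
                        return addr;
                    }
                }
            }
        } catch (SocketException e) {
            // the interfaces could not be enumerated; fall through to null
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("first non-loopback address: " + firstNonLoopbackAddress());
    }
}
```

Returning `null` instead of looping keeps the lookup bounded, which avoids the endless-loop risk mentioned earlier in this comment.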




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395097#comment-16395097
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/5449
  
We cannot fully guard against incorrect network configurations.

Many systems (even HDFS, AFAIK) simply use `InetAddress.getLocalHost()` 
and rely on correct network configs. Our current strategy already tries to be 
smarter by trying to find a connecting address, and only falling back to 
`getLocalHost()`.

My feeling is that the strategy is fine (I have not seen connectivity 
issues in a while) and that we should focus on stabilizing the tests.
  - We could reduce the heuristic check to "returns something we can bind 
to"
  - We could add an assumption to the test that `getLocalHost()` is not a 
loopback address and is something we can bind to, and then run the tests.

What do you think?




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395066#comment-16395066
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/5449
  
Thanks @StephanEwen, indeed - I also did not grasp the full intent of the 
change.

Looking at the code again, basically 
`ConnectionUtils#findConnectingAddress()` tries to connect with a set of 
strategies for a given time and, once that time has passed, falls back to the 
heuristic. We could argue that if we do not find an interface to connect to the 
target address in a given time, there is not much we can do about it other than 
(a) failing now, (b) retrying forever, or (c) trying some heuristic that may work 
or fail later. The latter is implemented and seems sensible - we also print 
warnings to the log in that case.

For the unit test, however, I think the actual check should be reduced to 
"does return something and does not block", since according to the code in 
`ConnectionUtils` we wouldn't verify the heuristic was successful anyway:
```
// our attempts timed out. use the heuristic fallback
LOG.warn("Could not connect to {}. Selecting a local address using heuristics.", targetAddress);
InetAddress heuristic = findAddressUsingStrategy(AddressDetectionState.HEURISTIC, targetAddress, true);
if (heuristic != null) {
    return heuristic;
}
else {
    LOG.warn("Could not find any IPv4 address that is not loopback or link-local. Using localhost address.");
    return InetAddress.getLocalHost();
}
```
We could maybe cycle through `InetAddress.getAllByName("localhost")` and 
verify the returned value is in there - unless I read this wrong - but I don't 
know if that check is of any actual value, to be honest.
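
The membership check mentioned above might be sketched as follows. The class `LocalhostMembershipSketch` and the method `resolvesTo` are hypothetical names chosen for this illustration; whether the check adds value for the test is left open, as noted.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical helper, not part of Flink.
final class LocalhostMembershipSketch {

    // True if the candidate equals one of the addresses the resolver returns
    // for the host name; false if the name cannot be resolved at all.
    static boolean resolvesTo(String hostName, InetAddress candidate) {
        try {
            return Arrays.asList(InetAddress.getAllByName(hostName)).contains(candidate);
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the address returned by the heuristic.
        InetAddress candidate = InetAddress.getLocalHost();
        System.out.println("among localhost addresses: " + resolvesTo("localhost", candidate));
    }
}
```

`InetAddress#equals` compares only the IP address, not the host name, so the check is independent of how the candidate was looked up.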

If we wanted to test for more, the only outside sign of the heuristic 
strategy being used is to look at the logs.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395008#comment-16395008
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
By the way, although I have a Linux computer, I cannot access the Internet 
from it. :) I just used ```ifconfig``` to check that.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394999#comment-16394999
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
I will verify whether a retry can work around this issue. I am not sure about it.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394996#comment-16394996
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Thank you for your response, @StephanEwen. You are right: we would not be 
able to connect to JM / RM / ZK if we used ```getByName("localhost")```.

I will give an example based on what I know about this issue.

We still can't rely on ```getLocalHost()``` too much, because once the 
mapping from host name to IP in ```/etc/hosts``` changes, it causes problems: 
we then get a different value (whatever the mapping has been set to).

Below is this file on my machine. As long as I do not set the mapping 
manually, I always get the correct value 
```ricezhang-pjhzf.vclound.com/10.199.203.242```, which works for my network.

```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

```

Once I modify the file as shown below, though, I always get an 
**incorrect** value, because ```192.168.1.1``` is a fake address, and this 
causes problems.

```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.1 ricezhang-pjhzf.vclound.com

```

As for this **unstable** test, **I don't know whether something dynamically 
modifies that file on Travis**, or something like that. Even if not, we 
CANNOT rely on this method to get an IP that can be connected to. Once 
```Zookeeper``` records an incorrect IP address, this causes errors: we 
cannot get a correct IP to connect to what we want.

I agree with the retry or better-test approach (I do not know which at this 
moment). I will study this issue in depth over the next few days.





[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392849#comment-16392849
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/5449
  
This breaks the expected behavior, unfortunately.

The logic is as follows:
  - try to connect to JM / RM / ZK and use an interface that you can 
connect from, if possible
  - if that does not work, fall back to your default interface 
(`getLocalHost()`) - the heuristic

If we change step 2 to use `getByName("localhost")` it gives us the 
loopback interface which no one external can connect to. That would basically 
mean that TM is isolated for any data communication.

The change to `ConnectionUtils` cannot be made like this. Instead, I would 
suggest to look at why the instability happens and fix the test instead (retry 
or a better test approach).
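
The difference between the two lookups can be made visible with a small sketch like the one below. It assumes the usual setup in which `localhost` resolves to a loopback address; what `getLocalHost()` returns depends on the host name configuration of the machine, so no output is claimed for it. The class `LoopbackFallbackSketch` and the method `classify` are hypothetical names for this illustration.

```java
import java.net.InetAddress;

// Hypothetical helper, not part of Flink.
final class LoopbackFallbackSketch {

    // Coarse classification of an address as seen from other machines.
    static String classify(InetAddress address) {
        if (address.isAnyLocalAddress()) {
            return "wildcard";
        }
        if (address.isLoopbackAddress()) {
            return "loopback";   // reachable only from the same machine
        }
        if (address.isLinkLocalAddress()) {
            return "link-local"; // reachable only on the local link
        }
        return "other";          // possibly reachable from other machines
    }

    public static void main(String[] args) throws Exception {
        System.out.println("getByName(\"localhost\"): " + classify(InetAddress.getByName("localhost")));
        System.out.println("getLocalHost(): " + classify(InetAddress.getLocalHost()));
    }
}
```

If the second line also reports `loopback` on a given machine, the host name maps to a loopback address there, which is the situation the test assumption discussed in this thread is meant to detect.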




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388909#comment-16388909
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Thanks, @NicoK. I pushed this patch here at least twice, and Travis CI 
verified it without this error each time.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384550#comment-16384550
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/5449
  
Sorry, @NicoK, I tried the ```NULL``` case. It seems it won't work.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384451#comment-16384451
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5449#discussion_r172004051
  
--- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java ---
@@ -61,7 +61,7 @@ public void testReturnLocalHostAddressUsingHeuristics() throws Exception {
    assertNotNull(add);
 
    // make sure that we returned the InetAddress.getLocalHost as a heuristic
-   assertEquals(InetAddress.getLocalHost(), add);
+   assertEquals(InetAddress.getByName("localhost"), add);
--- End diff --

Or we can still use ```InetAddress.getByName("localhost")```. Both 
```null``` and ```InetAddress.getByName("localhost")``` are okay.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384450#comment-16384450
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/5449#discussion_r172004016
  
--- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java ---
@@ -61,7 +61,7 @@ public void testReturnLocalHostAddressUsingHeuristics() throws Exception {
    assertNotNull(add);
 
    // make sure that we returned the InetAddress.getLocalHost as a heuristic
-   assertEquals(InetAddress.getLocalHost(), add);
+   assertEquals(InetAddress.getByName("localhost"), add);
--- End diff --

Thanks for the review, @NicoK. Yes, I think using ```null``` instead of 
```InetAddress.getLocalHost()``` is the better way, since it prevents getting 
a different result in the future when there are multiple adapters and the 
cache expires. I will push a quick fix for that.




[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383644#comment-16383644
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/5449#discussion_r171858504
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/net/ConnectionUtilsTest.java
 ---
@@ -61,7 +61,7 @@ public void testReturnLocalHostAddressUsingHeuristics() throws Exception {
             assertNotNull(add);
 
             // make sure that we returned the InetAddress.getLocalHost as a heuristic
-            assertEquals(InetAddress.getLocalHost(), add);
+            assertEquals(InetAddress.getByName("localhost"), add);
--- End diff --

How about also changing the blocker above to accept connections on any 
interface just to rule accidental successful resolve attempts out, too, i.e. 
using `ServerSocket blocker = new ServerSocket(0, 1, null)`?
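
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class and method names (`WildcardBlockerSketch`, `openBlocker`, `isWildcardBound`) are made up for this example. It only shows what the three-argument `ServerSocket` constructor does when the bind address is `null`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper class, named for this example only.
final class WildcardBlockerSketch {

    /** Opens a listening socket on a free port, bound to the wildcard address. */
    static ServerSocket openBlocker() {
        try {
            // port 0      -> the OS picks a free ephemeral port
            // backlog 1   -> request a pending-connection queue of one (a hint to the OS)
            // bindAddr null -> accept connections on any local interface
            return new ServerSocket(0, 1, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the socket is bound to the wildcard ("any") address. */
    static boolean isWildcardBound(ServerSocket socket) {
        return socket.getInetAddress().isAnyLocalAddress();
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket blocker = openBlocker()) {
            System.out.println("bound to " + blocker.getLocalSocketAddress()
                    + ", wildcard = " + isWildcardBound(blocker));
        }
    }
}
```

Binding to the wildcard address means the listener is reachable through every local interface, which is the property the review comment asks for.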


> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.5.0, 1.4.3
>
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932





[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-02-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359307#comment-16359307
 ] 

ASF GitHub Bot commented on FLINK-8623:
---

GitHub user zhangminglei opened a pull request:

https://github.com/apache/flink/pull/5449

[FLINK-8623] ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuri…

## What is the purpose of the change
Attempt to fix the unstable 
ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics test on Travis.

## Brief change log
Obtain the localhost ```InetAddress``` via 
```InetAddress.getByName("localhost")``` instead of 
```InetAddress.getLocalHost()```, because the latter returns the actual IP of 
one of your network adapters.
In general, ```InetAddress.getLocalHost()``` should be avoided.

## Verifying this change
This change is already covered by an existing test: 
```testReturnLocalHostAddressUsingHeuristics``` in 
```ConnectionUtilsTest.java```.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangminglei/flink flink-8623

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5449.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5449


commit 67e70d558c7d2f47ed3f7a407ad250518a2f683c
Author: zhangminglei 
Date:   2018-02-10T07:16:09Z

[FLINK-8623] ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics 
unstable on Travis




> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.5.0
>
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932





[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-02-09 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359303#comment-16359303
 ] 

mingleizhang commented on FLINK-8623:
-

{{getLocalHost()}} returns the actual IP of one of our network adapters, so 
{{InetAddress.getByName("localhost")}} should give the IP address that we 
expect. I will open a PR and watch what happens with the new code.

> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.5.0
>
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932





[jira] [Commented] (FLINK-8623) ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on Travis

2018-02-09 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359289#comment-16359289
 ] 

mingleizhang commented on FLINK-8623:
-

Hi [~till.rohrmann], the expected address is 127.0.1.1, but the actual value is 
127.0.0.1. Both are loopback addresses, so I think this issue is caused by CI 
servers that have different /etc/hosts files. What do you think? Thanks.
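
A small sketch of why an equality assertion fails here even though both values are loopback addresses. The class and method names (`LoopbackMismatchSketch`, `parse`) are made up for this example; the two IP literals are the ones reported in this comment. Which of them a given machine maps its host name to is a matter of local configuration and is assumed, not checked, by the code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class, named for this example only.
final class LoopbackMismatchSketch {

    /** Parses an IP literal; no name lookup is performed for literals. */
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress expected = parse("127.0.1.1");
        InetAddress actual = parse("127.0.0.1");

        // Any IPv4 address in 127.0.0.0/8 counts as loopback ...
        System.out.println("expected is loopback: " + expected.isLoopbackAddress());
        System.out.println("actual is loopback:   " + actual.isLoopbackAddress());
        // ... but equals() compares the concrete address, so the two differ.
        System.out.println("equal: " + expected.equals(actual));
    }
}
```

So a test that compares the two values with `assertEquals` is sensitive to which loopback address the host name resolves to, while a check on `isLoopbackAddress()` would not be.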

> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics unstable on 
> Travis
> 
>
> Key: FLINK-8623
> URL: https://issues.apache.org/jira/browse/FLINK-8623
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.5.0
>
>
> {{ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics}} fails on 
> Travis: https://travis-ci.org/apache/flink/jobs/33932


