Dan Smith created GEODE-4666:
--------------------------------

             Summary: CI failures in geode examples - Network is unreachable; port (10334) is not available on localhost
                 Key: GEODE-4666
                 URL: https://issues.apache.org/jira/browse/GEODE-4666
             Project: Geode
          Issue Type: Bug
          Components: gfsh
            Reporter: Dan Smith

The geode-examples jobs are sometimes failing with port conflicts. Below is an example:

https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/TestExamples/builds/24

{noformat}
:serialization:start
1. Executing - start locator --name=locator --bind-address=127.0.0.1

....The Locator process terminated unexpectedly with exit status 1. Please refer to the log file in /tmp/build/ea3e9ea4/geode-examples/serialization/locator for full details.

Feb 12, 2018 11:25:19 PM org.apache.geode.distributed.LocatorLauncher failOnStart
INFO: locator is exiting due to an exception
java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
	at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
	at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
	at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
	at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)

Exception in thread "main" java.lang.RuntimeException: An IO error occurred while starting a Locator in /tmp/build/ea3e9ea4/geode-examples/serialization/locator on localhost[10334]: Network is unreachable; port (10334) is not available on localhost.
	at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:655)
	at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
	at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
Caused by: java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
	at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
	at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
	... 2 more
:serialization:start FAILED
{noformat}

I did some digging, and I think the cause is that the gfsh shutdown command from a previous test has not finished shutting down the locator.

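To illustrate the suspected failure mode, here is a minimal sketch in plain Java of what happens when a new process tries to bind a port that a previous process still holds. This is not Geode's actual check in AbstractLauncher.assertPortAvailable; the class name PortStillHeld and the helper isPortAvailable are hypothetical, and an OS-assigned port stands in for 10334.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortStillHeld {

    /** Returns true if a server socket can currently be bound to the port on the given address. */
    static boolean isPortAvailable(InetAddress address, int port) {
        try (ServerSocket probe = new ServerSocket(port, 50, address)) {
            return true;
        } catch (IOException e) {
            // Typically a java.net.BindException while another socket still listens on the port.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        int port;
        // Port 0 asks the OS for any free port; this socket plays the role of
        // the locator from the previous test that has not finished shutting down.
        try (ServerSocket previousLocator = new ServerSocket(0, 50, loopback)) {
            port = previousLocator.getLocalPort();
            System.out.println("while held:    available=" + isPortAvailable(loopback, port));
        }
        // The previous holder has now closed its socket.
        System.out.println("after release: available=" + isPortAvailable(loopback, port));
    }
}
```

If the previous locator is still listening when the next test starts, the second bind fails in the same way, which would match the intermittent nature of the CI failures.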
Looking at the code, there is logic that is intended to shut down the members and wait for a certain amount of time. But that logic is flawed, because it executes a function without waiting for the results.

{code}
Callable<String> shutdownNodes = () -> {
  try {
    Execution execution = FunctionService.onMembers(includeMembers);
    //****** HERE, execute submits the function asynchronously
    execution.execute(shutDownFunction);
  } catch (FunctionException functionEx) {
    // Expected Exception as the function is shutting down the target members and the result
    // collector will get member departed exception
  }
  return "SUCCESS";
};

Future<String> result = exec.submit(shutdownNodes);
result.get(timeout, TimeUnit.MILLISECONDS);
{code}
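The effect of not waiting can be reproduced without Geode. The sketch below uses only java.util.concurrent: executeAsync is a hypothetical stand-in for an execute call that submits work and returns a handle immediately, and SHUTDOWN_MILLIS is an assumed shutdown duration. It shows that result.get(...) returns before the "member" has stopped unless the callable itself blocks on the handle. Whether blocking on the returned result collector is the right fix in Geode is not something this sketch establishes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class FireAndForgetShutdown {

    // Assumed time a member needs to shut down (hypothetical value).
    static final long SHUTDOWN_MILLIS = 500;

    /** Stand-in for an asynchronous execute(): starts the work and returns a handle at once. */
    static CompletableFuture<Void> executeAsync(ExecutorService members, AtomicBoolean stopped) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(SHUTDOWN_MILLIS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            stopped.set(true);
        }, members);
    }

    /** Runs the shutdown callable and reports whether the member had stopped when get() returned. */
    static boolean stoppedWhenGetReturns(boolean waitForResult) throws Exception {
        ExecutorService members = Executors.newSingleThreadExecutor();
        ExecutorService exec = Executors.newSingleThreadExecutor();
        AtomicBoolean stopped = new AtomicBoolean(false);
        try {
            Callable<String> shutdownNodes = () -> {
                CompletableFuture<Void> handle = executeAsync(members, stopped);
                if (waitForResult) {
                    handle.join(); // block until the asynchronous work has finished
                }
                return "SUCCESS";
            };
            Future<String> result = exec.submit(shutdownNodes);
            // The timeout only bounds the callable, not the work it started.
            result.get(5, TimeUnit.SECONDS);
            return stopped.get();
        } finally {
            exec.shutdownNow();
            members.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without waiting: stopped=" + stoppedWhenGetReturns(false));
        System.out.println("with waiting:    stopped=" + stoppedWhenGetReturns(true));
    }
}
```

Without waiting, the callable returns "SUCCESS" almost immediately, so the timeout on result.get never covers the actual shutdown. That would leave the next test free to start a locator while the old one still holds port 10334.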