[ 
https://issues.apache.org/jira/browse/GEODE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodge updated GEODE-4666:
---------------------------------
    Description: 
The geode-examples jobs sometimes fail with port conflicts. Below is an
example:

[https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/TestExamples/builds/24]
{noformat}
:serialization:start
1. Executing - start locator --name=locator --bind-address=127.0.0.1
....The Locator process terminated unexpectedly with exit status 1. Please refer to the log file in /tmp/build/ea3e9ea4/geode-examples/serialization/locator for full details.
Feb 12, 2018 11:25:19 PM org.apache.geode.distributed.LocatorLauncher failOnStart
INFO: locator is exiting due to an exception
java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
    at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
    at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
Exception in thread "main" java.lang.RuntimeException: An IO error occurred while starting a Locator in /tmp/build/ea3e9ea4/geode-examples/serialization/locator on localhost[10334]: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:655)
    at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
    at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
Caused by: java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
    ... 2 more
:serialization:start FAILED
{noformat}
I did some digging, and I think the cause is that the gfsh shutdown command
from a previous test has not finished shutting down the locator when the next
test starts, so port 10334 is still in use.

Looking at the code, there is logic that is intended to shut down the members
and wait up to a timeout. But the logic is flawed: the task only executes the
function and never waits for its results, so it returns "SUCCESS" as soon as
the function has been submitted.
{code:java}
      Callable<String> shutdownNodes = () -> {
        try {
          Execution execution = FunctionService.onMembers(includeMembers);
          
           //****** HERE, execute submits the function asynchronously
           execution.execute(shutDownFunction);
        } catch (FunctionException functionEx) {
          // Expected Exception as the function is shutting down the target members
          // and the result collector will get member departed exception
        }
        return "SUCCESS";
      };
      Future<String> result = exec.submit(shutdownNodes);
      result.get(timeout, TimeUnit.MILLISECONDS);
{code}
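To illustrate why the timeout on result.get() does not help here, below is a minimal sketch that uses only JDK classes, not the Geode API. The class ShutdownWaitSketch and the methods asyncShutdown, fireAndForget, and submitAndWait are hypothetical names made up for this illustration; asyncShutdown merely simulates a member that takes some time to stop. The first variant mirrors the pattern above, and the second blocks on the asynchronous handle before returning. In Geode itself the analogous step would presumably be to wait on the ResultCollector returned by execute, but that is an assumption about the eventual fix, not something this issue confirms.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownWaitSketch {

  // Hypothetical stand-in for an asynchronous member shutdown: the call
  // returns at once, and the flag flips only after delayMillis.
  public static CompletableFuture<Void> asyncShutdown(AtomicBoolean stopped, long delayMillis) {
    return CompletableFuture.runAsync(() -> {
      try {
        Thread.sleep(delayMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      stopped.set(true);
    });
  }

  // Same shape as the code in this issue: the task submits the shutdown and
  // returns "SUCCESS" immediately, so the outer timeout guards nothing.
  public static boolean fireAndForget(ExecutorService exec, long delayMillis, long timeoutMillis)
      throws Exception {
    AtomicBoolean stopped = new AtomicBoolean(false);
    Future<String> result = exec.submit(() -> {
      asyncShutdown(stopped, delayMillis); // returned handle is discarded
      return "SUCCESS";
    });
    result.get(timeoutMillis, TimeUnit.MILLISECONDS);
    return stopped.get(); // reports whether the member had stopped by now
  }

  // Variant that blocks on the asynchronous handle inside the task, so the
  // outer timeout covers the whole shutdown.
  public static boolean submitAndWait(ExecutorService exec, long delayMillis, long timeoutMillis)
      throws Exception {
    AtomicBoolean stopped = new AtomicBoolean(false);
    Future<String> result = exec.submit(() -> {
      asyncShutdown(stopped, delayMillis).get(); // wait for completion
      return "SUCCESS";
    });
    result.get(timeoutMillis, TimeUnit.MILLISECONDS);
    return stopped.get();
  }

  public static void main(String[] args) throws Exception {
    ExecutorService exec = Executors.newSingleThreadExecutor();
    try {
      System.out.println("fire-and-forget, stopped on return: " + fireAndForget(exec, 300, 5000));
      System.out.println("submit-and-wait, stopped on return: " + submitAndWait(exec, 300, 5000));
    } finally {
      exec.shutdown();
    }
  }
}
```

With the first variant the caller gets control back while the simulated member is still running, which is the window in which a following "start locator" could find port 10334 still bound.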

  was:
The geode-examples jobs are sometimes failing with port conflicts. Below is an 
example

https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/TestExamples/builds/24
{noformat}
:serialization:start
1. Executing - start locator --name=locator --bind-address=127.0.0.1
....The Locator process terminated unexpectedly with exit status 1. Please refer to the log file in /tmp/build/ea3e9ea4/geode-examples/serialization/locator for full details.
Feb 12, 2018 11:25:19 PM org.apache.geode.distributed.LocatorLauncher failOnStart
INFO: locator is exiting due to an exception
java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
    at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
    at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
Exception in thread "main" java.lang.RuntimeException: An IO error occurred while starting a Locator in /tmp/build/ea3e9ea4/geode-examples/serialization/locator on localhost[10334]: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:655)
    at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
    at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
Caused by: java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
    at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
    at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
    ... 2 more
:serialization:start FAILED
{noformat}

I did some digging, and I think the cause is that the gfsh shutdown command 
from a previous test has not finished shutting down the locator.

Looking at the code, it looks like there is some code that is intended to 
shutdown and wait for a certain amount of time. But the logic is flawed, 
because it is executing a function and not waiting for te results.

{code}
      Callable<String> shutdownNodes = () -> {
        try {
          Execution execution = FunctionService.onMembers(includeMembers);
          
           //****** HERE, execute submits the function asynchronously
           execution.execute(shutDownFunction);
        } catch (FunctionException functionEx) {
          // Expected Exception as the function is shutting down the target members
          // and the result collector will get member departed exception
        }
        return "SUCCESS";
      };
      Future<String> result = exec.submit(shutdownNodes);
      result.get(timeout, TimeUnit.MILLISECONDS);
{code}


> CI failures in geode examples - Network is unreachable; port (10334) is not 
> available on localhost
> --------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-4666
>                 URL: https://issues.apache.org/jira/browse/GEODE-4666
>             Project: Geode
>          Issue Type: Bug
>          Components: gfsh
>            Reporter: Dan Smith
>            Priority: Major
>
> The geode-examples jobs sometimes fail with port conflicts. Below is an
> example:
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/TestExamples/builds/24]
> {noformat}
> :serialization:start
> 1. Executing - start locator --name=locator --bind-address=127.0.0.1
> ....The Locator process terminated unexpectedly with exit status 1. Please refer to the log file in /tmp/build/ea3e9ea4/geode-examples/serialization/locator for full details.
> Feb 12, 2018 11:25:19 PM org.apache.geode.distributed.LocatorLauncher failOnStart
> INFO: locator is exiting due to an exception
> java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
>     at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
>     at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
>     at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
>     at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
> Exception in thread "main" java.lang.RuntimeException: An IO error occurred while starting a Locator in /tmp/build/ea3e9ea4/geode-examples/serialization/locator on localhost[10334]: Network is unreachable; port (10334) is not available on localhost.
>     at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:655)
>     at org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:549)
>     at org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:192)
> Caused by: java.net.BindException: Network is unreachable; port (10334) is not available on localhost.
>     at org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:131)
>     at org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:635)
>     ... 2 more
> :serialization:start FAILED
> {noformat}
> I did some digging, and I think the cause is that the gfsh shutdown command
> from a previous test has not finished shutting down the locator when the next
> test starts, so port 10334 is still in use.
> Looking at the code, there is logic that is intended to shut down the members
> and wait up to a timeout. But the logic is flawed: the task only executes the
> function and never waits for its results, so it returns "SUCCESS" as soon as
> the function has been submitted.
> {code:java}
>       Callable<String> shutdownNodes = () -> {
>         try {
>           Execution execution = FunctionService.onMembers(includeMembers);
>           
>            //****** HERE, execute submits the function asynchronously
>            execution.execute(shutDownFunction);
>         } catch (FunctionException functionEx) {
>           // Expected Exception as the function is shutting down the target members
>           // and the result collector will get member departed exception
>         }
>         return "SUCCESS";
>       };
>       Future<String> result = exec.submit(shutdownNodes);
>       result.get(timeout, TimeUnit.MILLISECONDS);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
