[ 
https://issues.apache.org/jira/browse/GEODE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421069#comment-17421069
 ] 

Dale Emery commented on GEODE-9644:
-----------------------------------

It looks as if the test starts the locator on an ephemeral port, then expects 
it to reconnect on that port after being force disconnected.

It is not safe to expect an ephemeral port to remain available after 
disconnecting from it: once the port is released, the operating system is 
free to assign it to another process.

To fix this, change the test to use {{AvailablePortHelper}} to assign a port, 
then use the assigned port to start the locator.
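
A minimal sketch of that idea, using only {{java.net}} rather than Geode's own 
test helpers. The class name, method names, and port range below are 
hypothetical and chosen for illustration only; the range is simply assumed to 
sit below the ephemeral range the operating system assigns from. The test picks 
a port up front, then binds to that same explicit port on the first start and 
again on the restart, instead of asking for port 0 and hoping the number stays 
free.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class FixedPortSketch {
  // Assumed range for illustration; not taken from the Geode source.
  static final int RANGE_START = 20001;
  static final int RANGE_END = 29999;

  // Hypothetical stand-in for a port helper: return the first port in the
  // range that can currently be bound, releasing it again before returning.
  static int pickPort() throws IOException {
    for (int port = RANGE_START; port <= RANGE_END; port++) {
      try (ServerSocket probe = new ServerSocket(port)) {
        return port;
      } catch (IOException inUse) {
        // Port is taken (or could not be bound); try the next one.
      }
    }
    throw new IOException("no free port in " + RANGE_START + "-" + RANGE_END);
  }

  // Bind a listening socket to an explicit, previously chosen port.
  static ServerSocket bind(int port) throws IOException {
    ServerSocket socket = new ServerSocket();
    socket.setReuseAddress(true); // set before bind, as the API requires
    socket.bind(new InetSocketAddress(port));
    return socket;
  }

  public static void main(String[] args) throws IOException {
    int port = pickPort();
    // First start on the assigned port.
    try (ServerSocket first = bind(port)) {
      System.out.println("started on port " + first.getLocalPort());
    }
    // Simulated restart: bind to the same explicit port again.
    try (ServerSocket second = bind(port)) {
      System.out.println("restarted on port " + second.getLocalPort());
    }
  }
}
```

This narrows the window for a collision but does not remove it: another 
process can still take the port between the probe and the bind, or between the 
disconnect and the reconnect. The point is only that the port number no longer 
comes from the pool the operating system hands out for port 0.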

> ClusterConfigLocatorRestartDUnitTest > serverRestartsAfterLocatorReconnects 
> FAILED
> ----------------------------------------------------------------------------------
>
>                 Key: GEODE-9644
>                 URL: https://issues.apache.org/jira/browse/GEODE-9644
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>            Reporter: Nabarun Nag
>            Priority: Major
>              Labels: needsTriage
>
> There is a sequence of cascading exceptions in the VMs that needs a more 
> detailed investigation. A possible culprit is the "address already in use" 
> bind exception:
> [*http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0513/test-results/distributedTest/1632566050/*]
>  
> {noformat}
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest
>  > serverRestartsAfterLocatorReconnects FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest$$Lambda$163/1450009752.run
>  in VM 0 running on Host 
> heavy-lifter-2c03c48d-8a0c-58ae-bad6-31f64bb5400a.c.apachegeode-ci.internal 
> with 5 VMs
>         at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
>         at 
> org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:94)
>         at 
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.waitForLocatorToReconnect(ClusterConfigLocatorRestartDUnitTest.java:225)
>         at 
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects(ClusterConfigLocatorRestartDUnitTest.java:90)
>         Caused by:
>         org.awaitility.core.ConditionTimeoutException: Condition with 
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest
>  was not fulfilled within 5 minutes.
>             at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
>             at 
> org.awaitility.core.CallableCondition.await(CallableCondition.java:78)
>             at 
> org.awaitility.core.CallableCondition.await(CallableCondition.java:26)
>             at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
>             at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:908)
>             at 
> org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.lambda$waitForLocatorToReconnect$f182e747$2(ClusterConfigLocatorRestartDUnitTest.java:226)
> 8300 tests completed, 1 failed, 413 skipped{noformat}
> VM2 mentions that cluster membership has failed.
> {noformat}
> [vm2] [info 2021/09/25 09:36:44.487 UTC server-2 <DisconnectThread> 
> tid=0x153] cluster membership failed due to 
> [vm2] 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  for testing
> [vm2]         at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:1787)
> [vm2]         at 
> org.apache.geode.distributed.internal.membership.api.MembershipManagerHelper.crashDistributedSystem(MembershipManagerHelper.java:139)
> [vm2]         at 
> org.apache.geode.test.junit.rules.MemberStarterRule.forceDisconnectMember(MemberStarterRule.java:568)
> [vm2]         at 
> org.apache.geode.test.dunit.rules.MemberVM.lambda$forceDisconnect$bb17a952$1(MemberVM.java:90)
> [vm2]         at 
> org.apache.geode.test.dunit.internal.IdentifiableRunnable.run(IdentifiableRunnable.java:41)
> [vm2]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [vm2]         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [vm2]         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [vm2]         at java.lang.reflect.Method.invoke(Method.java:498)
> [vm2]         at 
> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
> [vm2]         at 
> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
> [vm2]         at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source)
> [vm2]         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [vm2]         at java.lang.reflect.Method.invoke(Method.java:498)
> [vm2]         at 
> sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
> [vm2]         at sun.rmi.transport.Transport$1.run(Transport.java:200)
> [vm2]         at sun.rmi.transport.Transport$1.run(Transport.java:197)
> [vm2]         at java.security.AccessController.doPrivileged(Native Method)
> [vm2]         at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> [vm2]         at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
> [vm2]         at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
> [vm2]         at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
> [vm2]         at java.security.AccessController.doPrivileged(Native Method)
> [vm2]         at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
> [vm2]         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [vm2]         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [vm2]         at java.lang.Thread.run(Thread.java:748){noformat}
>  
> VM0 hits address already in use. 
> {noformat}
> [vm0] [info 2021/09/25 09:36:53.762 UTC locator-0 <Location services restart 
> thread> tid=0x60] attempt to restart location services terminated
> [vm0] java.net.BindException: Failed to create server socket on 
> 0.0.0.0/0.0.0.0[43091]
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
> [vm0]         at 
> org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:54)
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.initializeServerSocket(TcpServer.java:196)
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.startServerThread(TcpServer.java:183)
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.restarting(TcpServer.java:161)
> [vm0]         at 
> org.apache.geode.distributed.internal.membership.gms.locator.MembershipLocatorImpl.restarting(MembershipLocatorImpl.java:144)
> [vm0]         at 
> org.apache.geode.distributed.internal.InternalLocator.restartWithoutSystem(InternalLocator.java:1200)
> [vm0]         at 
> org.apache.geode.distributed.internal.InternalLocator.attemptReconnect(InternalLocator.java:1144)
> [vm0]         at 
> org.apache.geode.distributed.internal.InternalLocator.lambda$launchRestartThread$4(InternalLocator.java:1077)
> [vm0]         at java.lang.Thread.run(Thread.java:748)
> [vm0] Caused by: java.net.BindException: Address already in use (Bind failed)
> [vm0]         at java.net.PlainSocketImpl.socketBind(Native Method)
> [vm0]         at 
> java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> [vm0]         at java.net.ServerSocket.bind(ServerSocket.java:390)
> [vm0]         at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:72)
> [vm0]         ... 10 more{noformat}



