Jens Deppe created GEODE-5080:
---------------------------------

    Summary: CI Failure: ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects
    Key: GEODE-5080
    URL: https://issues.apache.org/jira/browse/GEODE-5080
    Project: Geode
    Issue Type: Bug
    Components: gfsh, management
    Reporter: Jens Deppe

This test intermittently fails with the following:

{noformat}
org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest > serverRestartsAfterLocatorReconnects FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.rules.ClusterStartupRule$$Lambda$41/761947362.call in VM 3 running on Host b669312074c0 with 5 VMs
        at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:371)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:203)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:196)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:182)
        at org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects(ClusterConfigLocatorRestartDUnitTest.java:65)

        Caused by:
        org.apache.geode.GemFireConfigException: Unable to join the distributed system. Operation either timed out, was stopped or Locator does not exist.
{noformat}

The detailed test failure shows the following cause:

{noformat}
Caused by: org.apache.geode.GemFireConfigException: Unable to join the distributed system. Operation either timed out, was stopped or Locator does not exist.
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:661)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:747)
    at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:191)
    at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:106)
    at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1027)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1061)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:554)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:763)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:355)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:343)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:335)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:211)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:219)
    at org.apache.geode.test.junit.rules.ServerStarterRule.startServer(ServerStarterRule.java:172)
    at org.apache.geode.test.junit.rules.ServerStarterRule.before(ServerStarterRule.java:78)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule.lambda$startServerVM$a2926408$1(ClusterStartupRule.java:212)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at hydra.MethExecutor.executeObject(MethExecutor.java:244)
    at org.apache.geode.test.dunit.standalone.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:70)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:361)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    ... 3 more
{noformat}

The problem is that after the locator is 'crashed', the test enters a loop that waits for the ClusterConfigurationService to restart. However, this check sometimes runs too soon after the crash, while the cluster configuration (CC) service still appears to be available, so the wait can finish before the service has actually gone down and come back.
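
One possible way to close this race, sketched under assumptions rather than taken from the Geode code base: wait in two phases, first until the service is observed as unavailable, and only then until it is available again. The names RestartWaiter, awaitCondition and awaitRestart are hypothetical, and the isRunning supplier is a stand-in for whatever check the test actually uses to see whether the ClusterConfigurationService is up.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Geode: waits for a service to go down and come back.
final class RestartWaiter {

  private RestartWaiter() {}

  /** Polls until the condition holds or the timeout elapses. Returns true if it held. */
  static boolean awaitCondition(BooleanSupplier condition, Duration timeout,
      Duration pollInterval) throws InterruptedException {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(pollInterval.toMillis());
    }
  }

  /**
   * Phase 1: wait until the service is seen as NOT running, so a stale
   * "still available" reading right after the crash is not mistaken for a restart.
   * Phase 2: wait until the service is seen as running again.
   * Each phase gets its own timeout budget.
   */
  static boolean awaitRestart(BooleanSupplier isRunning, Duration timeout,
      Duration pollInterval) throws InterruptedException {
    boolean sawDown = awaitCondition(() -> !isRunning.getAsBoolean(), timeout, pollInterval);
    if (!sawDown) {
      return false; // never observed the service going down
    }
    return awaitCondition(isRunning, timeout, pollInterval);
  }
}
```

A caveat on this design: if the service can restart faster than the poll interval, phase 1 may never observe it as down and the helper will report failure. In that case, comparing some restart marker (for example a start timestamp, if one is available) before and after the crash would be a more robust check.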