Jens Deppe created GEODE-5080:
---------------------------------

             Summary: CI Failure: ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects
                 Key: GEODE-5080
                 URL: https://issues.apache.org/jira/browse/GEODE-5080
             Project: Geode
          Issue Type: Bug
          Components: gfsh, management
            Reporter: Jens Deppe


This test intermittently fails with the following:
{noformat}
org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest > serverRestartsAfterLocatorReconnects FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.rules.ClusterStartupRule$$Lambda$41/761947362.call in VM 3 running on Host b669312074c0 with 5 VMs
        at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:371)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:203)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:196)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:182)
        at org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects(ClusterConfigLocatorRestartDUnitTest.java:65)

        Caused by:
        org.apache.geode.GemFireConfigException: Unable to join the distributed system.  Operation either timed out, was stopped or Locator does not exist.
{noformat}

The detailed test failure shows the following cause:
{noformat}
Caused by: org.apache.geode.GemFireConfigException: Unable to join the distributed system.  Operation either timed out, was stopped or Locator does not exist.
        at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:661)
        at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:747)
        at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:191)
        at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:106)
        at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1027)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1061)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:554)
        at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:763)
        at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:355)
        at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:343)
        at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:335)
        at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:211)
        at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:219)
        at org.apache.geode.test.junit.rules.ServerStarterRule.startServer(ServerStarterRule.java:172)
        at org.apache.geode.test.junit.rules.ServerStarterRule.before(ServerStarterRule.java:78)
        at org.apache.geode.test.dunit.rules.ClusterStartupRule.lambda$startServerVM$a2926408$1(ClusterStartupRule.java:212)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at hydra.MethExecutor.executeObject(MethExecutor.java:244)
        at org.apache.geode.test.dunit.standalone.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:70)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:361)
        at sun.rmi.transport.Transport$1.run(Transport.java:200)
        at sun.rmi.transport.Transport$1.run(Transport.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
        ... 3 more
{noformat}

The problem is that after the locator is 'crashed', the test enters a loop to
wait for the ClusterConfigurationService to restart. However, sometimes this
check happens too quickly after the crash, while the cluster configuration (CC)
service still appears to be available, so the loop can exit before the service
has actually restarted.
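
One way to avoid this kind of race is to wait in two phases: first until the
service is observed to be down, and only then until it is observed to be up
again. The following is a minimal sketch of that idea, not the fix applied to
the test. The class RestartWait, its methods, and the isRunning supplier are
hypothetical names introduced for illustration and are not part of Geode's
API.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, for illustration only; not Geode code.
final class RestartWait {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition was observed to hold.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    // Phase 1: wait until the service is seen as NOT running, so a stale
    // "still available" reading right after the crash is not mistaken for
    // a completed restart.
    // Phase 2: wait until the service is seen as running again.
    // Each phase gets its own timeout budget.
    static boolean waitForRestart(BooleanSupplier isRunning, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        boolean wentDown = waitUntil(() -> !isRunning.getAsBoolean(), timeoutMillis, pollMillis);
        return wentDown && waitUntil(isRunning, timeoutMillis, pollMillis);
    }
}
```

In a test, isRunning would wrap whatever check the existing loop performs. One
caveat of this sketch: if the service goes down and comes back between two
polls, phase 1 never sees the outage and the call times out, so a short poll
interval (or a check based on a restart marker rather than availability) may
be preferable.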



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
