Bruce Schuchardt created GEODE-5393:
---------------------------------------

             Summary: StateFlushOperation hangs waiting for non-existent operation to complete
                 Key: GEODE-5393
                 URL: https://issues.apache.org/jira/browse/GEODE-5393
             Project: Geode
          Issue Type: Bug
            Reporter: Bruce Schuchardt


We had a state-flush operation hang with no threads performing flushable 
messaging.  That indicates that there is a book-keeping error in invoking 
startOperation/endOperation.  It looks like it's due to an exception being 
thrown during distribution:

{noformat}
Exception occurred while processing DistributedPutAllOperation(EntryEventImpl[op=PUTALL_CREATE;region=/replicate_8;key=null;oldValue=null;newValue=null;callbackArg=null;originRemote=true;originMember=10.32.108.122(accessorgemfire1_rs-StorageBTTest30102851a1i3xlarge-hydra-client-5_15068:15068)<v189>:1033;id=EventID[id=92 bytes;threadID=97;sequenceID=180]])
org.apache.geode.cache.persistence.PersistentReplicatesOfflineException
        at org.apache.geode.internal.cache.DistributedPutAllOperation.initMessage(DistributedPutAllOperation.java:923)
        at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:506)
        at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:264)
        at org.apache.geode.internal.cache.DistributedRegion.postPutAllSend(DistributedRegion.java:3214)
        at org.apache.geode.internal.cache.LocalRegionDataView.postPutAll(LocalRegionDataView.java:326)
        at org.apache.geode.internal.cache.LocalRegion.basicPutAll(LocalRegion.java:9745)
        at org.apache.geode.internal.cache.DistributedRegion.basicPutAll(DistributedRegion.java:3240)
        at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9493)
        at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9505)
        at diskRecovery.StartupShutdownTest.HydraTask_doContinuousUpdates(StartupShutdownTest.java:482)
        at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at hydra.MethExecutor.execute(MethExecutor.java:181)
        at hydra.MethExecutor.execute(MethExecutor.java:149)
        at hydra.TestTask.execute(TestTask.java:192)
        at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
{noformat}

As a result, endOperation is not invoked with the correct view version.  
Error handling was moved from DistributedCacheOperation to other classes, but 
it is incorrectly implemented: the other classes do not know the view version 
and so end up invoking endOperation with an invalid version number.
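
To illustrate the book-keeping involved, here is a minimal sketch of the pairing that has to hold. It is not Geode code: ViewVersionTracker, advanceView, hasOperationsBefore and distribute are hypothetical names invented for this example, and only startOperation/endOperation echo the names used above. The assumption is that the view version is captured once when the operation starts and that the same value is handed back when it ends, even if distribution throws.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the in-flight operation book-keeping; not Geode's implementation.
public class ViewVersionTracker {
    // Number of operations still in flight, keyed by the view version they started under.
    private final Map<Long, Integer> inFlight = new HashMap<>();
    private long currentVersion = 1;

    // Record a new operation and return the view version it was started under.
    public synchronized long startOperation() {
        inFlight.merge(currentVersion, 1, Integer::sum);
        return currentVersion;
    }

    // Release an operation. The version must be the one returned by startOperation.
    public synchronized void endOperation(long version) {
        Integer count = inFlight.get(version);
        if (count == null) {
            throw new IllegalStateException("no operation in flight for view version " + version);
        }
        if (count == 1) {
            inFlight.remove(version);
        } else {
            inFlight.put(version, count - 1);
        }
    }

    // Simulate a membership view change.
    public synchronized void advanceView() {
        currentVersion++;
    }

    // What a state flush would wait on: operations started before the given view version.
    public synchronized boolean hasOperationsBefore(long version) {
        for (long v : inFlight.keySet()) {
            if (v < version) {
                return true;
            }
        }
        return false;
    }

    // Capture the version once and release it in finally, so an exception thrown
    // during distribution cannot leave a stale in-flight entry behind.
    public static void distribute(ViewVersionTracker tracker, Runnable send) {
        long version = tracker.startOperation();
        try {
            send.run();
        } finally {
            tracker.endOperation(version);
        }
    }
}
```

In this sketch, an error handler that does not have the captured version would either leave the count for the original version in place or fail to find an entry to release. Either way, a flush waiting on hasOperationsBefore would keep seeing an operation that no thread is performing.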



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
