Bruce Schuchardt created GEODE-5393:
---------------------------------------
Summary: StateFlushOperation hangs waiting for nonexistent operation to complete
Key: GEODE-5393
URL: https://issues.apache.org/jira/browse/GEODE-5393
Project: Geode
Issue Type: Bug
Reporter: Bruce Schuchardt
We had a state-flush operation hang with no threads performing flushable
messaging. That indicates that there is a book-keeping error in invoking
startOperation/endOperation. It looks like it's due to an exception being
thrown during distribution:
{noformat}
Exception occurred while processing DistributedPutAllOperation(EntryEventImpl[op=PUTALL_CREATE;region=/replicate_8;key=null;oldValue=null;newValue=null;callbackArg=null;originRemote=true;originMember=10.32.108.122(accessorgemfire1_rs-StorageBTTest30102851a1i3xlarge-hydra-client-5_15068:15068)<v189>:1033;id=EventID[id=92 bytes;threadID=97;sequenceID=180]])
org.apache.geode.cache.persistence.PersistentReplicatesOfflineException
at org.apache.geode.internal.cache.DistributedPutAllOperation.initMessage(DistributedPutAllOperation.java:923)
at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:506)
at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:264)
at org.apache.geode.internal.cache.DistributedRegion.postPutAllSend(DistributedRegion.java:3214)
at org.apache.geode.internal.cache.LocalRegionDataView.postPutAll(LocalRegionDataView.java:326)
at org.apache.geode.internal.cache.LocalRegion.basicPutAll(LocalRegion.java:9745)
at org.apache.geode.internal.cache.DistributedRegion.basicPutAll(DistributedRegion.java:3240)
at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9493)
at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9505)
at diskRecovery.StartupShutdownTest.HydraTask_doContinuousUpdates(StartupShutdownTest.java:482)
at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hydra.MethExecutor.execute(MethExecutor.java:181)
at hydra.MethExecutor.execute(MethExecutor.java:149)
at hydra.TestTask.execute(TestTask.java:192)
at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)
{noformat}
As a result, endOperation is not invoked with the correct view version.
Error handling was moved from DistributedCacheOperation to other classes, but
it is incorrectly implemented: those classes do not know the view version
and so end up invoking endOperation with an invalid version number.
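
To illustrate the kind of book-keeping involved, here is a minimal sketch. It is not Geode's code: the class FlushBookkeepingSketch, its per-version counter map, the installView and flushWouldComplete helpers, and the use of -1 as an "invalid" version are all assumptions made for the example. It shows why the version returned by startOperation has to be the one passed to endOperation, and how a try/finally in the code that started the operation keeps the two paired when distribution throws.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-view-version operation tracking; not Geode's API. */
public class FlushBookkeepingSketch {
  // In-flight operation count per membership view version (assumed structure).
  private final Map<Long, Integer> inFlight = new HashMap<>();
  private long currentViewVersion = 1;

  /** Counts an operation and returns the view version it was counted under. */
  public synchronized long startOperation() {
    inFlight.merge(currentViewVersion, 1, Integer::sum);
    return currentViewVersion;
  }

  /** Decrements the count for the given version; an unknown version changes nothing. */
  public synchronized void endOperation(long viewVersion) {
    Integer count = inFlight.get(viewVersion);
    if (count == null) {
      return; // wrong version: the real entry is never decremented
    }
    if (count == 1) {
      inFlight.remove(viewVersion);
    } else {
      inFlight.put(viewVersion, count - 1);
    }
  }

  /** Simulates a membership change (hypothetical helper). */
  public synchronized void installView(long newVersion) {
    currentViewVersion = newVersion;
  }

  /** A flush for a version can finish only when no operation is counted under it. */
  public synchronized boolean flushWouldComplete(long viewVersion) {
    return !inFlight.containsKey(viewVersion);
  }

  /** Keeps start/end paired with the same version, even if the send throws. */
  public void distribute(Runnable send) {
    long version = startOperation();
    try {
      send.run();
    } finally {
      endOperation(version);
    }
  }
}
```

In this sketch, a caller that ends an operation with a version it did not get from startOperation leaves the counter nonzero, so a flush waiting on that version would never see it drain. That matches the symptom described above, though the actual fix in Geode may look different.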