Hi Jason,

We have not modified the function argument list recently. Moreover, this does not happen consistently across our deployments of the product, and even in this cluster the function sometimes succeeds. I think the important point is that the exception itself is an IOException, which suggests that the data being deserialized is corrupted. The reason could be a network issue or some other infrastructure problem. Even so, the question remains why the exception is not passed back to the caller.
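
To illustrate why I suspect corruption: the innermost cause in the trace quoted below is java.io.StreamCorruptedException: invalid type code: B1, raised while plain Java serialization was reading a TreeSet. The sketch below is not Geode code and not our function arguments; the class and method names are made up for illustration. It only shows that overwriting a single byte of an otherwise valid Java-serialized stream yields the same kind of error.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical demo class; nothing here comes from Geode or from our product.
public class CorruptedStreamDemo {

    // Serialize a value with plain Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialize and return "ok", or the failure message if the stream is bad.
    static String tryDeserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return "ok";
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        byte[] good = serialize(new TreeSet<String>(Arrays.asList("a", "b", "c")));

        // The stream starts with a 4-byte header (magic + version); index 4 is
        // the type code of the first object. Overwrite it with 0xB1.
        byte[] bad = good.clone();
        bad[4] = (byte) 0xB1;

        System.out.println(tryDeserialize(good)); // ok
        System.out.println(tryDeserialize(bad));  // invalid type code: B1
    }
}
```

This does not tell us where the bytes were damaged in our case (network, buffers, or something else); it only shows that one bad byte is enough to produce this message.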

Thanks,
Vahram.

From: Jason Huynh [mailto:[email protected]]
Sent: Thursday, January 11, 2018 3:19 AM
To: [email protected]
Subject: Re: Function Executor thread stacked

Hi Vahram,

It would be interesting to know what object is not serializing/deserializing correctly. Is there any chance you are passing in function arguments that have had modifications that would impact serialization that the class files on the server do not know about?

-Jason

On Wed, Jan 10, 2018 at 5:02 AM Vahram Aharonyan <[email protected]> wrote:

Hi All,

We are experiencing an issue where the thread that performs the onRegion call and expects a result in response is stuck forever in the TIMED_WAITING state, with the trace below:

"ComputedAndSystemMetricsRetriever" Id=490 in TIMED_WAITING on lock=java.util.concurrent.CountDownLatch$Sync@5630fcc2
Total blocked: 33 Total waited: 261425
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)
    org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:716)
    org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:793)
    org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:769)
    org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:856)
    org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:438)
    org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:91)
    platform.gemfire.GemfireFunctionExecutor.onRegion(GemfireFunctionExecutor.java:494)

In the logs of that member we see the following:

[warning 2017/12/20 10:49:14.570 UTC 29acc6f1-5384-489d-b2bd-5187b898e482 <ComputedAndSystemMetricsRetriever> tid=0x1ea] 60 seconds have elapsed while waiting for replies: <PRFunctionStreamingResultCollector 100547 waiting for 1 replies from [gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002]> on gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 whose current membership list is: [[gbv00458(8d2960b9-a6be-4519-9547-311e2717231e:15532)<ec><v5>:10002, gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002, gbv00460(21fd5612-5fe2-451d-aa9d-b8542fa43fa7:20144)<ec><v9>:10002, gbv00459(3a14f29a-8bdb-46d5-bb67-0f79cb5c7faa:17197)<ec><v7>:10002, gbv00454(18618:locator)<ec><v1>:20002, gbv00454(64aed382-0882-44f5-b71f-08a429af46dd:18983)<ec><v8>:10002, gbv00453(13656:locator)<ec><v0>:20002, gbv00453(881591a8-ae04-4af1-866a-5074c2ffb133:14490)<ec><v2>:10002, gbv00456(63cebdf8-dd1e-414e-af5f-f8c4ebecf726:18001)<ec><v6>:10002, gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002]]

Around that time, the following exception occurs on the node where this call lands:

[severe 2017/12/20 10:48:14.728 UTC abb6648c-39d6-4c4c-9c6d-ab8589e034a5 <P2P message reader for gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 shared unordered uid=8 port=41631> tid=0x44] IOException deserializing message
java.io.IOException: failure during message deserialization
    at org.apache.geode.internal.tcp.MsgDestreamer.getMessage(MsgDestreamer.java:190)
    at org.apache.geode.internal.tcp.Connection.runOioReader(Connection.java:2218)
    at org.apache.geode.internal.tcp.Connection.run(Connection.java:1728)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage .
    at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)
    at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:979)
    at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2720)
    at org.apache.geode.internal.tcp.MsgDestreamer$DestreamerThread.run(MsgDestreamer.java:261)
Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.execute.FunctionRemoteContext .
    at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2521)
    at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2958)
    at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)
    at org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.fromData(PartitionedRegionFunctionStreamingMessage.java:180)
    at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2477)
    ... 3 more
Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.execute.FunctionRemoteContext .
    at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)
    at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2507)
    ... 7 more
Caused by: java.io.StreamCorruptedException: invalid type code: B1
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1563)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2567)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
    at java.util.TreeMap.buildFromSorted(TreeMap.java:2508)
    at java.util.TreeMap.readTreeSet(TreeMap.java:2460)
    at java.util.TreeSet.readObject(TreeSet.java:533)
    at sun.reflect.GeneratedMethodAccessor743.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at java.util.ArrayList.readObject(ArrayList.java:791)
    at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at java.util.ArrayList.readObject(ArrayList.java:791)
    at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1933)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2992)
    at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)
    at org.apache.geode.internal.cache.execute.FunctionRemoteContext.fromData(FunctionRemoteContext.java:73)
    at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2479)
    ... 8 more

So could it be that these exceptions are not being sent back to the caller node, causing the caller thread to wait for a reply forever?

Thanks,
Vahram.
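
To make the question concrete, here is a toy model in plain Java. It is not Geode code, and every name in it is hypothetical. It assumes exactly what I am asking about: that the receiving side logs the deserialization failure and sends nothing back. Under that assumption the caller's latch is never counted down, which would match a thread parked in CountDownLatch.await as in the first trace. Whether Geode really behaves this way is the open question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Toy model only: a "receiver" thread that replies by counting down a latch,
// unless it fails while reading the request, in which case it replies nothing.
public class MissingReplyDemo {

    static Thread startReceiver(boolean failWhileReading, CountDownLatch reply) {
        Thread receiver = new Thread(() -> {
            if (failWhileReading) {
                // Assumption under test: the failure is only logged locally.
                System.err.println("receiver: failure during message deserialization, no reply sent");
                return;
            }
            reply.countDown(); // normal path: the caller is released
        }, "toy-receiver");
        receiver.start();
        return receiver;
    }

    // Caller side: wait a bounded time for the reply; true if it arrived.
    static boolean callAndWait(boolean failWhileReading, long timeoutMillis) {
        CountDownLatch reply = new CountDownLatch(1);
        Thread receiver = startReceiver(failWhileReading, reply);
        try {
            boolean replied = reply.await(timeoutMillis, TimeUnit.MILLISECONDS);
            receiver.join();
            return replied;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy receiver replied: " + callAndWait(false, 2000));
        System.out.println("failing receiver replied: " + callAndWait(true, 200));
    }
}
```

In this model the second call only returns because the wait is bounded; a caller that re-enters the timed wait in a loop would stay in TIMED_WAITING indefinitely.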
