Hi Dan/Mike,

This issue was hit on a setup running geode-1.1.0. Given your input about 
FunctionRemoteContext.fromData, it seems that the args object is what causes 
the issue. To pass an argument to a function call, we generally create an 
object that holds an Object[] args array specific to the function being 
executed, along with other fields that are generic to all the function 
executions we make.

While going over the code I was not able to find a race condition in creating 
the argument object for this call. It is instantiated in a standalone thread 
and then passed directly to the Executor service.
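
One way to rule out concurrent modification of the argument, as Dan suggested, might be to hand the function a private deep copy taken on the submitting thread. The following is only a minimal sketch: ArgSnapshot and deepCopy are hypothetical names, not part of Geode or of our code, and it assumes the whole argument graph is java.io.Serializable (the trace shows plain Java serialization of an ArrayList and a TreeSet).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical helper (not a Geode class): snapshots function arguments before submission. */
public final class ArgSnapshot {

    private ArgSnapshot() {}

    /**
     * Deep-copies a Serializable object graph using plain Java serialization.
     * The copy shares no mutable state with the original, so later changes to
     * the original cannot alter what is written when the copy is sent.
     */
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative data only: a TreeSet nested in an ArrayList, as in the trace.
        TreeSet<String> metrics = new TreeSet<>(Arrays.asList("cpu", "heap"));
        ArrayList<Object> functionArgs = new ArrayList<>();
        functionArgs.add(metrics);

        ArrayList<Object> snapshot = deepCopy(functionArgs);
        metrics.add("gc"); // mutate the original after the snapshot was taken

        System.out.println(functionArgs); // prints [[cpu, gc, heap]]
        System.out.println(snapshot);     // prints [[cpu, heap]]
    }
}
```

If another thread does modify the graph while the copy is being taken, the failure would most likely show up locally on the submitting thread instead of as a corrupted stream on the remote member, which at least makes the race visible. The extra serialization pass has a cost, so this is probably better suited to diagnosis than to a permanent fix.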

Thanks,
Vahram.

From: Michael Stolz [mailto:[email protected]]
Sent: Friday, January 12, 2018 12:41 AM
To: [email protected]
Subject: Re: Function Executor thread stacked

Could this be the thing about passing 1 argument to a function you receive just 
what was passed...passing more than one argument you get an array containing 
the things that you passed?

--
Mike Stolz
Principal Engineer, GemFire Product Lead
Mobile: +1-631-835-4771
Download the new GemFire book here: 
<https://content.pivotal.io/ebooks/scaling-data-services-with-pivotal-gemfire>

On Thu, Jan 11, 2018 at 1:47 PM, Dan Smith <[email protected]> wrote:
I've seen something like this happen before when there is code that is 
concurrently modifying data that is being serialized. What version of geode are 
you using? The line number in FunctionRemoteContext.fromData should tell us 
which of your objects is failing to be deserialized. For example if you are 
using 1.3 it is the object you are passing as the argument to the function.
I would look closely at your code and make sure nothing could be concurrently 
modifying your function argument, or anything that it refers to, while it 
is being serialized.
-Dan

On Thu, Jan 11, 2018 at 12:21 AM, Vahram Aharonyan <[email protected]> wrote:
Hi Jason,

Basically, we have not modified the function arguments list recently. 
Moreover, this does not happen consistently across all our deployments of the 
product, and even in this cluster the function sometimes succeeds.
I think the important point here is that the exception itself is an 
IOException: it seems that the data being deserialized is corrupted. The 
reason for this could be a network issue or some other infrastructure problem. 
Even so, the question remains why the exception is not being passed back to 
the caller.

Thanks,
Vahram.

From: Jason Huynh [mailto:[email protected]]
Sent: Thursday, January 11, 2018 3:19 AM
To: [email protected]
Subject: Re: Function Executor thread stacked

Hi Vahram,

It would be interesting to know what object is not serializing/deserializing 
correctly.  Is there any chance you are passing in function arguments that have 
had modifications that would impact serialization that the class files on the 
server do not know about?

-Jason

On Wed, Jan 10, 2018 at 5:02 AM Vahram Aharonyan <[email protected]> wrote:
Hi All,

We are experiencing an issue where the thread that performs the onRegion call 
and expects a result in response gets stuck forever in the TIMED_WAITING 
state, with the trace below:

"ComputedAndSystemMetricsRetriever" Id=490 in TIMED_WAITING on 
lock=java.util.concurrent.CountDownLatch$Sync@5630fcc2
Total blocked: 33   Total waited: 261425
  sun.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
  java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
  org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)
  org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:716)
  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:793)
  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:769)
  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:856)
  org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:438)
  org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:91)
  platform.gemfire.GemfireFunctionExecutor.onRegion(GemfireFunctionExecutor.java:494)

In the logs of that member we see the following:

[warning 2017/12/20 10:49:14.570 UTC 29acc6f1-5384-489d-b2bd-5187b898e482 
<ComputedAndSystemMetricsRetriever> tid=0x1ea] 60 seconds have elapsed while 
waiting for replies: <PRFunctionStreamingResultCollector 100547 waiting for 1 
replies from 
[gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002]> on 
gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 whose 
current membership list is: 
[[gbv00458(8d2960b9-a6be-4519-9547-311e2717231e:15532)<ec><v5>:10002, 
gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002, 
gbv00460(21fd5612-5fe2-451d-aa9d-b8542fa43fa7:20144)<ec><v9>:10002, 
gbv00459(3a14f29a-8bdb-46d5-bb67-0f79cb5c7faa:17197)<ec><v7>:10002, 
gbv00454(18618:locator)<ec><v1>:20002, 
gbv00454(64aed382-0882-44f5-b71f-08a429af46dd:18983)<ec><v8>:10002, 
gbv00453(13656:locator)<ec><v0>:20002, 
gbv00453(881591a8-ae04-4af1-866a-5074c2ffb133:14490)<ec><v2>:10002, 
gbv00456(63cebdf8-dd1e-414e-af5f-f8c4ebecf726:18001)<ec><v6>:10002, 
gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002]]

Around that time, on the nodes where this call lands, these exceptions occur:

[severe 2017/12/20 10:48:14.728 UTC abb6648c-39d6-4c4c-9c6d-ab8589e034a5 <P2P 
message reader for 
gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 shared 
unordered uid=8 port=41631> tid=0x44] IOException deserializing message
java.io.IOException: failure during message deserialization
        at org.apache.geode.internal.tcp.MsgDestreamer.getMessage(MsgDestreamer.java:190)
        at org.apache.geode.internal.tcp.Connection.runOioReader(Connection.java:2218)
        at org.apache.geode.internal.tcp.Connection.run(Connection.java:1728)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.SerializationException: Could not create an instance of  org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage .
        at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)
        at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:979)
        at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2720)
        at org.apache.geode.internal.tcp.MsgDestreamer$DestreamerThread.run(MsgDestreamer.java:261)
Caused by: org.apache.geode.SerializationException: Could not create an instance of  org.apache.geode.internal.cache.execute.FunctionRemoteContext .
        at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2521)
        at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2958)
        at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)
        at org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.fromData(PartitionedRegionFunctionStreamingMessage.java:180)
        at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2477)
        ... 3 more
Caused by: org.apache.geode.SerializationException: Could not create an instance of  org.apache.geode.internal.cache.execute.FunctionRemoteContext .
        at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)
        at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2507)
        ... 7 more
Caused by: java.io.StreamCorruptedException: invalid type code: B1
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1563)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2567)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)
        at java.util.TreeMap.buildFromSorted(TreeMap.java:2508)
        at java.util.TreeMap.readTreeSet(TreeMap.java:2460)
        at java.util.TreeSet.readObject(TreeSet.java:533)
        at sun.reflect.GeneratedMethodAccessor743.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
        at java.util.ArrayList.readObject(ArrayList.java:791)
        at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
        at java.util.ArrayList.readObject(ArrayList.java:791)
        at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1933)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
        at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2992)
        at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)
        at org.apache.geode.internal.cache.execute.FunctionRemoteContext.fromData(FunctionRemoteContext.java:73)
        at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2479)
        ... 8 more


So could it be that these exceptions are not being sent back to the caller 
node, causing the caller thread to wait for a reply forever?
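
Until the root cause is understood, the caller could at least bound how long it waits for the result. Below is a minimal sketch using only the JDK; BoundedWait and callWithTimeout are hypothetical names, not Geode APIs, and the Callable stands in for the blocking getResult call seen in the trace above. Geode's ResultCollector interface also declares a getResult(long, TimeUnit) overload, but whether it would return in this situation has not been verified here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not a Geode class): runs a blocking call with an upper bound on the wait. */
public final class BoundedWait {

    private BoundedWait() {}

    /**
     * Runs blockingCall on a daemon worker thread and waits at most the given
     * timeout for its result. On timeout the caller gets a TimeoutException
     * instead of staying parked indefinitely.
     */
    public static <T> T callWithTimeout(Callable<T> blockingCall, long timeout, TimeUnit unit)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-wait-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        try {
            Future<T> future = worker.submit(blockingCall);
            try {
                return future.get(timeout, unit);
            } catch (TimeoutException e) {
                future.cancel(true); // best effort: interrupts the worker if it is interruptible
                throw e;
            }
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A call that completes in time returns its value.
        System.out.println(callWithTimeout(() -> "done", 1, TimeUnit.SECONDS)); // prints done

        // A call that blocks too long surfaces as a TimeoutException on the caller.
        try {
            callWithTimeout(() -> { Thread.sleep(60_000); return "never"; },
                    100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out"); // prints timed out
        }
    }
}
```

Note that the trace goes through waitForRepliesUninterruptibly, so cancelling would probably not release the worker thread itself; the sketch only frees the caller and lets it log or retry.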

Thanks,
Vahram.

