[ https://issues.apache.org/jira/browse/GEODE-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mario Ivanac updated GEODE-9075:
--------------------------------
    Description: 
A Geode cluster is deployed in a Kubernetes environment, and Istio/SideCars
are injected between cluster members. While traffic is running, if any
Istio/SideCar is restarted, a thread can get stuck while waiting for a reply
to a message it sent.

 

[warn 2021/03/25 21:04:47.282 CET server2 <ThreadsMonitor> tid=0x12] Thread 
<64> (0x40) that was executed at <25 Mar 2021 21:03:53 CET> has been stuck for 
<53.897 seconds> and number of thread monitor iteration <1> 
 Thread Name <Function Execution Processor2> state <TIMED_WAITING>
 Waiting on <java.util.concurrent.CountDownLatch$Sync@7c7f9898>
 Executor Group <FunctionExecutionPooledExecutor>
 Monitored metric <ResourceManagerStats.numThreadsStuck>
 Thread stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
 org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
 org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:736)
 org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:811)
 org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:784)
 org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:874)
 org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:811)
 org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:699)
 org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
 org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
 org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:520)

...

  was:
Geode cluster is deployed in kubernetes environment, and Istio is injected 
between cluster members. While running traffic, if any istion/sidecar is 
restarted, thread can get stuck, while waiting for reply on sent message.

 

[warn 2021/03/25 21:04:47.282 CET server2 <ThreadsMonitor> tid=0x12] Thread 
<64> (0x40) that was executed at <25 Mar 2021 21:03:53 CET> has been stuck for 
<53.897 seconds> and number of thread monitor iteration <1> 
Thread Name <Function Execution Processor2> state <TIMED_WAITING>
Waiting on <java.util.concurrent.CountDownLatch$Sync@7c7f9898>
Executor Group <FunctionExecutionPooledExecutor>
Monitored metric <ResourceManagerStats.numThreadsStuck>
Thread stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:736)
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:811)
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:784)
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:874)
org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:811)
org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:699)
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:520)

...


> Thread stuck indefinitely when using Istio/Sidecar
> --------------------------------------------------
>
>                 Key: GEODE-9075
>                 URL: https://issues.apache.org/jira/browse/GEODE-9075
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>
> A Geode cluster is deployed in a Kubernetes environment, and Istio/SideCars
> are injected between cluster members. While traffic is running, if any
> Istio/SideCar is restarted, a thread can get stuck while waiting for a reply
> to a message it sent.
>  
> [warn 2021/03/25 21:04:47.282 CET server2 <ThreadsMonitor> tid=0x12] Thread 
> <64> (0x40) that was executed at <25 Mar 2021 21:03:53 CET> has been stuck 
> for <53.897 seconds> and number of thread monitor iteration <1> 
>  Thread Name <Function Execution Processor2> state <TIMED_WAITING>
>  Waiting on <java.util.concurrent.CountDownLatch$Sync@7c7f9898>
>  Executor Group <FunctionExecutionPooledExecutor>
>  Monitored metric <ResourceManagerStats.numThreadsStuck>
>  Thread stack:
>  sun.misc.Unsafe.park(Native Method)
>  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>  java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>  java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>  java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>  org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>  org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:736)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:811)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:784)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:874)
>  org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:811)
>  org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:699)
>  org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>  org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>  org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:520)
> ...
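
For readers unfamiliar with the pattern in the stack trace, the following is a minimal sketch of a thread waiting for a reply on a java.util.concurrent.CountDownLatch using timed waits, which is what the frames CountDownLatch.await and tryAcquireSharedNanos and the TIMED_WAITING state indicate. The class ReplyWaitSketch, the method waitForReply, and the slice and retry values are hypothetical names chosen for illustration; this is not Geode's ReplyProcessor21 code, and the bounded retry count is an assumption of the sketch rather than a description of Geode's behavior.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Geode code base.
final class ReplyWaitSketch {

  /**
   * Waits for a reply in fixed-size time slices.
   * Returns true if the latch reached zero (the reply arrived),
   * false if all slices elapsed or the waiting thread was interrupted.
   */
  static boolean waitForReply(CountDownLatch replyLatch, long sliceMillis, int maxSlices) {
    try {
      for (int i = 0; i < maxSlices; i++) {
        // While parked in this call the thread is reported as TIMED_WAITING.
        if (replyLatch.await(sliceMillis, TimeUnit.MILLISECONDS)) {
          return true;
        }
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can observe the interruption.
      Thread.currentThread().interrupt();
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Case 1: the reply never arrives, so nothing counts the latch down.
    CountDownLatch lost = new CountDownLatch(1);
    System.out.println("lost reply received: " + waitForReply(lost, 50, 3));

    // Case 2: another thread delivers the reply by counting the latch down.
    CountDownLatch delivered = new CountDownLatch(1);
    Thread replier = new Thread(delivered::countDown);
    replier.start();
    System.out.println("delivered reply received: " + waitForReply(delivered, 50, 3));
    replier.join();
  }
}
```

If the loop had no upper bound on the number of slices, the first case would never return, which resembles the symptom reported by ThreadsMonitor above.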



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
