Hi,

We have a cluster with 5 server nodes, each running Ignite in a Docker
container. To trigger a fail-over we run "docker stop" on one node. Throughout
the stress test we keep sending operations to the cluster.

As soon as we stop one of the 5 nodes, all the other nodes stop processing
requests submitted through compute.call. A thread dump shows that each of them
has threads blocked in the state below: they are parked inside a syncOp (a
putIfAbsent) and never recover. We waited at least 10 minutes and they were
still blocked there, and once enough requests pile up the nodes even print
"Threads starvation" messages.
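
To make the symptom concrete, here is a minimal sketch that uses only
java.util.concurrent and no Ignite code. It assumes the starvation comes from
every worker of a fixed-size pool waiting, without a timeout, on a future that
never completes. The class name, method name, and pool size are hypothetical,
and the fixed pool is only a stand-in for the pool that runs compute jobs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvationSketch {
    /**
     * Fills a fixed pool with jobs that wait on {@code pending} without a
     * timeout, then submits one more job and reports whether that extra job
     * ran within {@code waitMs} milliseconds.
     */
    public static boolean probeRuns(int poolSize,
                                    CompletableFuture<Boolean> pending,
                                    long waitMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < poolSize; i++) {
                // Stand-in (assumption) for a job blocked in a synchronous
                // cache operation: an unbounded wait on a future.
                pool.submit(() -> pending.get());
            }
            // One more request arrives while all workers are occupied.
            Future<String> probe = pool.submit(() -> "done");
            try {
                probe.get(waitMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                // The probe did not finish in time; it is still queued
                // behind the waiting workers.
                return false;
            }
        } finally {
            // Interrupts the waiting workers so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean starved = !probeRuns(4, new CompletableFuture<>(), 200);
        boolean healthy = probeRuns(4, CompletableFuture.completedFuture(true), 2000);
        System.out.println("starved while the future is pending: " + starved);
        System.out.println("probe runs once the future is done: " + healthy);
    }
}
```

In this model the extra job only runs once the future completes, which matches
what we observe: new requests queue up behind the blocked threads.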

Please note that when we repeated the test with 6 nodes, we no longer saw this
issue.

Here is the stack of one of the blocked threads:

"pub-#4%glueGrid%" #21 prio=5 os_prio=0 tid=0x00007ff30c4a0800 nid=0x23 waiting on condition [0x00007ff2a86e4000]
   java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for <0x00000000f5529918> (a org.apache.ignite.internal.util.future.GridFutureAdapter$ChainFuture)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
 at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:155)
 at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:115)
 at org.apache.ignite.internal.processors.cache.GridCacheAdapter$36.op(GridCacheAdapter.java:2642)
 at org.apache.ignite.internal.processors.cache.GridCacheAdapter$36.op(GridCacheAdapter.java:2640)
 at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4440)
 at org.apache.ignite.internal.processors.cache.GridCacheAdapter.putIfAbsent(GridCacheAdapter.java:2640)
 at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.putIfAbsent(IgniteCacheProxy.java:1220)
 at comp.journeyprocessing.repository.Repository.persist(Repository.java:32)
 at comp.journeyprocessing.RequestProcessor.storeCorrelationIdToIndicateRequestIsHandled(RequestProcessor.java:61)
 at comp.journeyprocessing.RequestProcessor.process(RequestProcessor.java:51)
 at comp.journeyprocessing.RequestProcessor$$FastClassBySpringCGLIB$$3398aa.invoke(<generated>)
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
 at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
 at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281)
 at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
 at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
 at comp.journeyprocessing.RequestProcessor$$EnhancerBySpringCGLIB$$27ed152b.process(<generated>)
 at comp.journeyprocessing.RequestManagerImpl.processRequest(RequestManagerImpl.java:67)
 at comp.journeyprocessing.RequestManagerImpl.handle(RequestManagerImpl.java:59)
 at comp.journeyprocessingclient.api.TheGlueRequestCallable.execute(TheGlueRequestCallable.java:15)
 at comp.journeyprocessingclient.api.TheGlueRequestCallable.execute(TheGlueRequestCallable.java:5)
 at comp.journeyprocessingclient.TheGlueCallable.call(TheGlueCallable.java:14)
 at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2V2.execute(GridClosureProcessor.java:2004)
 at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
 at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6484)
 at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
 at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
 at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1167)
 at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1772)
 at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1058)
 at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:836)
 at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:104)
 at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:799)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

Thank you,
Nicu



Met vriendelijke groeten/Meilleures salutations/Best regards

Nicolae Marasoiu
Agile Developer