Hello!

How long do the threads block for?

When a client disconnects, the topology changes, which triggers a wait for
PME (partition map exchange). All further operations are delayed until PME
has finished.

Avoid having short-lived clients.
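
One way to keep clients long-lived is to start a single client per
application process and reuse it for every request, rather than starting and
stopping one per task. The holder below is only a sketch: SharedClient is a
hypothetical helper, not part of Ignite, and it uses nothing beyond the JDK.
With Ignite, the factory passed in would presumably be something like
() -> Ignition.start(cfg), with client mode enabled on cfg.

```java
import java.util.function.Supplier;

/**
 * Holds one lazily created, long-lived client and hands the same instance
 * to every caller. Hypothetical helper for illustration, not an Ignite class.
 */
final class SharedClient<T extends AutoCloseable> implements AutoCloseable {
    private final Supplier<T> factory;
    private T instance; // guarded by "this"

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared client, invoking the factory on first use only. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Closes the client if one was created; repeated calls are no-ops. */
    @Override
    public synchronized void close() throws Exception {
        if (instance != null) {
            T toClose = instance;
            instance = null;
            toClose.close();
        }
    }
}
```

Calling get() from request-handling code then never causes a topology change
after the first call. close() is meant to be called once, at application
shutdown.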

Regards,
-- 
Ilya Kasnacheev


Tue, Apr 23, 2019 at 03:40, Matt Nohelty <[email protected]>:

> I already posted this question to stack overflow here
> https://stackoverflow.com/questions/55801760/what-happens-in-apache-ignite-when-a-client-gets-disconnected
> but this mailing list is probably more appropriate.
>
> We use Apache Ignite for caching and are seeing some unexpected behavior
> across all of the clients of the cluster when one of the clients fails. The
> Ignite cluster itself has three servers and there are approximately 12
> servers connecting to that cluster as clients. The cluster has persistence
> disabled and many of the caches have near caching enabled.
>
> What we are seeing is that when one of the clients fails (out of memory,
> high CPU, network connectivity, etc.), threads on all the other clients
> block for a period of time. During these times, the Ignite servers
> themselves seem fine but I see things like the following in the logs:
>
> Topology snapshot [ver=123, servers=3, clients=11, CPUs=XXX, offheap=XX.XGB, heap=XXX.GB]
> Topology snapshot [ver=124, servers=3, clients=10, CPUs=XXX, offheap=XX.XGB, heap=XXX.GB]
>
> The topology itself is clearly changing when a client connects/disconnects
> but is there anything happening internally inside the cluster that could
> cause blocking on other clients? I would expect re-balancing of data when a
> server disconnects but not a client.
>
> From a thread dump, I see many threads stuck in the following state:
>
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000078a86ff18> (a java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at org.apache.ignite.internal.util.IgniteUtils.await(IgniteUtils.java:7452)
> at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.awaitAllReplies(GridReduceQueryExecutor.java:1056)
> at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:733)
> at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1339)
> at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
> at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$9.iterator(IgniteH2Indexing.java:1403)
> at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
> at java.lang.Iterable.forEach(Iterable.java:74)
> ...
>
> Any ideas, suggestions, or further avenues to investigate would be much
> appreciated.
>
