When a node joins the cluster needs to perform partition map exchange process. 
If this process takes too long, the cluster may become unresponsive.
Looks like this is what happened in your case. You can check how long an 
exchange took by looking for “Started exchange” and “Finished exchange” in the 
logs – I assume it’s around 20 seconds.

Debugging hanged partition map exchange issues may be pretty tricky.
My best guess so far is that the reduced timeouts you’ve set resulted in failed 
network IO (e.g. instead of waiting for a message for 10s and getting it on the 
first try, you retry every 3s until a fast enough delivery happens – which 
might be the 10th or 20th attempt).
Try changing timeouts back and see how long your exchanges take on a node join. 
Perhaps some value will be low enough to detect node failures and high enough 
to allow regular operations to pass. 
If that doesn’t help, please share the full logs from all nodes.

Thanks,
Stan 

From: Olexandr K
Sent: 11 июня 2018 г. 14:24
To: [email protected]
Subject: Re: Cache operations hanging for a minute when one of server 
nodesgoesdown

Hi Stan,

I tried to decrease network/failure timeouts and it worked fine when node 
stopped.
Unfortunately I got lot's of hanged calls when it started again.
At that time all cache calls got stuck for 25-30 seconds.
Is it expected? I thought rebalancing should occur in background and node 
should join the cluster when it is 100% ready, no?
See some log extracts below.

        <property name="networkSendRetryCount" value="1" /> <!-- default: 3 -->
        <property name="networkSendRetryDelay" value="500" /> <!-- default: 
1000 -->
        <property name="networkTimeout" value="1500" /> <!-- default: 5000 -->
        <property name="failureDetectionTimeout" value="3000" /> <!-- default: 
10000 -->
        <property name="clientFailureDetectionTimeout" value="10000" /> <!-- 
default: 30000 -->


-- CACHE CALL STARTED: cache.put()

83837380032 2018-06-11 13:51:54.777 [https-jsse-nio-8080-exec-4] INFO  
com.xxx.lk.backend.cache.impl.RefreshTokenCache - Store: 
d289a0a3-bca6-49a4-ae9d-9568517d656e

-- MANY WARNINGS BEFORE CALL COMPLETED

2018-06-11 13:51:55.058 [grid-timeout-worker-#4] DEBUG 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor - Timeout 
has occurred [obj=CancelableTask 
[id=fe9e57ee361-ba3899f9-a10b-4498-98e6-ed3b65dfc3f8, endTime=1528714315053, 
period=3000, cancel=false, 
task=org.apache.ignite.internal.processors.query.GridQueryProcessor$2@2649758c],
 process=true]
2018-06-11 13:51:55.355 [grid-timeout-worker-#4] DEBUG 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor - Timeout 
has occurred [obj=CancelableTask 
[id=ee9e57ee361-ba3899f9-a10b-4498-98e6-ed3b65dfc3f8, endTime=1528714315351, 
period=2000, cancel=false, 
task=org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$13@110ac52b],
 process=true]
2018-06-11 13:51:56.339 [nio-acceptor-#5] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Balancing data 
[min0=0, minIdx=0, max0=-1, maxIdx=-1]

2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Failed to wait for partition map 
exchange [topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], 
node=ba482197-64cc-4d84-81f7-2b58f0c66a0c]. Dumping pending objects that might 
be the cause: 
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Ready affinity version: 
AffinityTopologyVersion [topVer=5, minorTopVer=0]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Last exchange future: 
GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent 
[evtNode=TcpDiscoveryNode [id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, 
addrs=[10.2.0.163, 127.0.0.1, 30.251.106.199], 
sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, /127.0.0.1:47500, 
/30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313636, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=ba482197, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313636, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714314058], crd=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], exchId=GridDhtPartitionExchangeId 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], 
discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313636, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=ba482197, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313636, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714314058], nodeId=f5fbdfd8, 
evt=NODE_JOINED], added=true, initFut=GridFutureAdapter 
[ignoreInterrupts=false, state=DONE, res=true, hash=538678914], init=true, 
lastVer=null, partReleaseFut=null, exchActions=null, affChangeMsg=null, 
initTs=1528714314058, centralizedAff=false, changeGlobalStateE=null, 
done=false, state=CLIENT, evtLatch=0, 
remaining=[f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, 
34dc1ff0-45ec-4fee-995b-84dde8a451cb], super=GridFutureAdapter 
[ignoreInterrupts=false, state=INIT, res=null, hash=1716673894]]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager - 
First 10 pending exchange futures [total=0]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Last 10 exchange futures (total: 3):
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - >>> GridDhtPartitionsExchangeFuture 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], evt=NODE_JOINED, 
evtNode=TcpDiscoveryNode [id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, 
addrs=[10.2.0.163, 127.0.0.1, 30.251.106.199], 
sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, /127.0.0.1:47500, 
/30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313636, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], done=false]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - >>> GridDhtPartitionsExchangeFuture 
[topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], evt=NODE_LEFT, 
evtNode=TcpDiscoveryNode [id=c7e39052-0882-42f3-91aa-6a438c2ab745, 
addrs=[10.2.0.163, 127.0.0.1, 30.251.106.199], sockAddrs=[/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=1, intOrder=1, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], done=true]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - >>> GridDhtPartitionsExchangeFuture 
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], evt=NODE_JOINED, 
evtNode=TcpDiscoveryNode [id=ba482197-64cc-4d84-81f7-2b58f0c66a0c, 
addrs=[10.2.0.250, 127.0.0.1, 30.251.106.11], 
sockAddrs=[v-hp-lk-jbn02.9801.1678.ecs.hp.com/30.251.106.11:0, /10.2.0.250:0, 
/127.0.0.1:0], discPort=0, order=4, intOrder=0, lastExchangeTime=1528714098847, 
loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], done=true]
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending transactions:
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending explicit locks:
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending cache futures:
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending atomic cache futures:
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending data streamer futures:
 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN  
org.apache.ignite.internal.diagnostic - Pending transaction deadlock detection 
futures:
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Encrypted data 
[status=OK, handshakeStaus=NOT_HANDSHAKING, ses=GridSelectorNioSessionImpl 
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, 
bytesRcvd=11210, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, 
super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, 
finished=false, hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=853 lim=853 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=95956, bytesRcvd=11210, 
bytesSent0=0, bytesRcvd0=0, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714313367, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=false]]]
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Unwrapping 
received data: GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
[super=AbstractNioClientWorker [idx=1, bytesRcvd=17050, bytesSent=0, 
bytesRcvd0=5840, bytesSent0=0, select=true, super=GridWorker 
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, 
hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=5840 lim=5840 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=17050, 
bytesSent0=933, bytesRcvd0=5840, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], accepted=false]]
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Unwrapped raw 
data [status=BUFFER_UNDERFLOW, handshakeStatus=NOT_HANDSHAKING, 
ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
[super=AbstractNioClientWorker [idx=1, bytesRcvd=17050, bytesSent=0, 
bytesRcvd0=5840, bytesSent0=0, select=true, super=GridWorker 
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, 
hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=5840 lim=5840 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=17050, 
bytesSent0=933, bytesRcvd0=5840, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=false]]]
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Unwrapping 
received data: GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
[super=AbstractNioClientWorker [idx=1, bytesRcvd=20719, bytesSent=0, 
bytesRcvd0=9509, bytesSent0=0, select=true, super=GridWorker 
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, 
hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=3669 lim=3669 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=20719, 
bytesSent0=933, bytesRcvd0=9509, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], accepted=false]]
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Unwrapped raw 
data [status=OK, handshakeStatus=NOT_HANDSHAKING, 
ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
[super=AbstractNioClientWorker [idx=1, bytesRcvd=20719, bytesSent=0, 
bytesRcvd0=9509, bytesSent0=0, select=true, super=GridWorker 
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, 
hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=3669 lim=3669 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=20719, 
bytesSent0=933, bytesRcvd0=9509, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=false]]]
 2018-06-11 13:51:57.074 [grid-nio-worker-tcp-comm-1-#7] DEBUG 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi - Unwrapped raw 
data [status=BUFFER_UNDERFLOW, handshakeStatus=NOT_HANDSHAKING, 
ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
[super=AbstractNioClientWorker [idx=1, bytesRcvd=20719, bytesSent=0, 
bytesRcvd0=9509, bytesSent0=0, select=true, super=GridWorker 
[name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, finished=false, 
hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=3669 lim=3669 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=4, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=4, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=20719, 
bytesSent0=933, bytesRcvd0=9509, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=false]]]
 2018-06-11 13:51:57.089 [sys-#25] INFO  org.apache.ignite.internal.diagnostic 
- Exchange future waiting for coordinator response 
[crd=34dc1ff0-45ec-4fee-995b-84dde8a451cb, topVer=AffinityTopologyVersion 
[topVer=6, minorTopVer=0]]
Remote node information:
General node info [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, client=false, 
discoTopVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], 
time=13:51:56.992]
Partitions exchange info [readyVer=AffinityTopologyVersion [topVer=5, 
minorTopVer=0]]
Last initialized exchange future: GridDhtPartitionsExchangeFuture 
[firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=34dc1ff0, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714313966], crd=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
v-hp-lk-dcn02.9801.1678.ecs.hp.com/30.251.106.90:47500, /10.2.0.252:47500], 
discPort=47500, order=2, intOrder=2, lastExchangeTime=1528714316726, loc=true, 
ver=2.4.0#20180305-sha1:aa342270, isClient=false], 
exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=34dc1ff0, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714313966], nodeId=f5fbdfd8, 
evt=NODE_JOINED], added=true, initFut=GridFutureAdapter 
[ignoreInterrupts=false, state=DONE, res=true, hash=1994546166], init=true, 
lastVer=null, partReleaseFut=PartitionReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], 
futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion 
[topVer=6, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], futures=[]], 
DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], futures=[]]]], exchActions=null, affChangeMsg=null, 
initTs=1528714313966, centralizedAff=false, changeGlobalStateE=null, 
done=false, state=CRD, evtLatch=0, 
remaining=[f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f], super=GridFutureAdapter 
[ignoreInterrupts=false, state=INIT, res=null, hash=715729116]]
Communication SPI statistics [rmtNode=ba482197-64cc-4d84-81f7-2b58f0c66a0c]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=ba482197-64cc-4d84-81f7-2b58f0c66a0c, idx=0, 
connCnt=0], msgsSent=4, msgsAckedByRmt=0, msgsRcvd=453, lastAcked=448, 
reserveCnt=1, descIdHash=1249269712]
Communication SPI clients: 
    [node=ba482197-64cc-4d84-81f7-2b58f0c66a0c, 
client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl 
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=2, 
bytesRcvd=98649, bytesSent=0, bytesRcvd0=933, bytesSent0=0, select=true, 
super=GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=null, 
finished=false, hashCode=989360520, interrupted=false, 
runner=grid-nio-worker-tcp-comm-2-#27]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=453, 
sentCnt=4, reserved=true, lastAck=448, nodeLeft=false, node=TcpDiscoveryNode 
[id=ba482197-64cc-4d84-81f7-2b58f0c66a0c, addrs=[10.2.0.250, 127.0.0.1, 
30.251.106.11], sockAddrs=[/10.2.0.250:0, /30.251.106.11:0, /127.0.0.1:0], 
discPort=0, order=4, intOrder=4, lastExchangeTime=1528714105712, loc=false, 
ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, 
queueLimit=4096, reserveCnt=1, pairedConnections=false], 
outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=453, 
sentCnt=4, reserved=true, lastAck=448, nodeLeft=false, node=TcpDiscoveryNode 
[id=ba482197-64cc-4d84-81f7-2b58f0c66a0c, addrs=[10.2.0.250, 127.0.0.1, 
30.251.106.11], sockAddrs=[/10.2.0.250:0, /30.251.106.11:0, /127.0.0.1:0], 
discPort=0, order=4, intOrder=4, lastExchangeTime=1528714105712, loc=false, 
ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, 
queueLimit=4096, reserveCnt=1, pairedConnections=false], 
super=GridNioSessionImpl [locAddr=/10.2.0.252:47100, rmtAddr=/10.2.0.250:49581, 
createTime=1528714107367, closeTime=0, bytesSent=12857, bytesRcvd=98649, 
bytesSent0=0, bytesRcvd0=933, sndSchedTime=1528714107367, 
lastSndTime=1528714313279, lastRcvTime=1528714316992, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@5f146306, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=true]], super=GridAbstractCommunicationClient [lastUsed=1528714107415, 
closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=2, keysCnt=1, bytesRcvd=98649, bytesRcvd0=933, 
>> bytesSent=0, bytesSent0=0]
    Connection info [in=true, rmtAddr=/10.2.0.250:49581, 
locAddr=/10.2.0.252:47100, msgsSent=4, msgsAckedByRmt=0, descIdHash=1249269712, 
unackedMsgs=[GridNearSingleGetResponse, GridNearSingleGetResponse, 
GridDhtPartitionsFullMessage, GridDhtPartitionsFullMessage], msgsRcvd=453, 
lastAcked=448, descIdHash=1249269712, bytesRcvd=98649, bytesRcvd0=933, 
bytesSent=12857, bytesSent0=0, opQueueSize=0]
Exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent 
[evtNode=TcpDiscoveryNode [id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, 
addrs=[10.2.0.163, 127.0.0.1, 30.251.106.199], 
sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, /127.0.0.1:47500, 
/30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=34dc1ff0, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714313966], crd=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
v-hp-lk-dcn02.9801.1678.ecs.hp.com/30.251.106.90:47500, /10.2.0.252:47500], 
discPort=47500, order=2, intOrder=2, lastExchangeTime=1528714316726, loc=true, 
ver=2.4.0#20180305-sha1:aa342270, isClient=false], 
exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], topVer=6, nodeId8=34dc1ff0, msg=Node joined: TcpDiscoveryNode 
[id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1, 
30.251.106.199], sockAddrs=[v-hp-lk-dcn01.xxxgroup.tek.loc/10.2.0.163:47500, 
/127.0.0.1:47500, /30.251.106.199:47500], discPort=47500, order=6, intOrder=5, 
lastExchangeTime=1528714313497, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], type=NODE_JOINED, tstamp=1528714313966], nodeId=f5fbdfd8, 
evt=NODE_JOINED], added=true, initFut=GridFutureAdapter 
[ignoreInterrupts=false, state=DONE, res=true, hash=1994546166], init=true, 
lastVer=null, partReleaseFut=PartitionReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], 
futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion 
[topVer=6, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], futures=[]], 
DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=6, 
minorTopVer=0], futures=[]]]], exchActions=null, affChangeMsg=null, 
initTs=1528714313966, centralizedAff=false, changeGlobalStateE=null, 
done=false, state=CRD, evtLatch=0, 
remaining=[f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f], super=GridFutureAdapter 
[ignoreInterrupts=false, state=INIT, res=null, hash=715729116]]
Local communication statistics:
Communication SPI statistics [rmtNode=34dc1ff0-45ec-4fee-995b-84dde8a451cb]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=34dc1ff0-45ec-4fee-995b-84dde8a451cb, idx=0, 
connCnt=-1], msgsSent=453, msgsAckedByRmt=448, msgsRcvd=5, lastAcked=0, 
reserveCnt=1, descIdHash=105307846]
Communication SPI clients: 
    [node=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl 
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, 
bytesRcvd=20719, bytesSent=0, bytesRcvd0=9509, bytesSent0=0, select=true, 
super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=null, 
finished=false, hashCode=1835849342, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#7]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
inRecovery=GridNioRecoveryDescriptor [acked=448, resendCnt=0, rcvCnt=5, 
sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, addrs=[10.2.0.252, 127.0.0.1, 
30.251.106.90], sockAddrs=[/127.0.0.1:47500, /10.2.0.252:47500, 
/30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=448, 
resendCnt=0, rcvCnt=5, sentCnt=453, reserved=true, lastAck=0, nodeLeft=false, 
node=TcpDiscoveryNode [id=34dc1ff0-45ec-4fee-995b-84dde8a451cb, 
addrs=[10.2.0.252, 127.0.0.1, 30.251.106.90], sockAddrs=[/127.0.0.1:47500, 
/10.2.0.252:47500, /30.251.106.90:47500], discPort=47500, order=2, intOrder=2, 
lastExchangeTime=1528714105823, loc=false, ver=2.4.0#20180305-sha1:aa342270, 
isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, 
pairedConnections=false], super=GridNioSessionImpl [locAddr=/10.2.0.250:49581, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
createTime=1528714107495, closeTime=0, bytesSent=96889, bytesRcvd=20719, 
bytesSent0=933, bytesRcvd0=9509, sndSchedTime=1528714107495, 
lastSndTime=1528714317074, lastRcvTime=1528714317074, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@51afeee5, 
directMode=true], GridConnectionBytesVerifyFilter, SSL filter], 
accepted=false]], super=GridAbstractCommunicationClient 
[lastUsed=1528714107495, closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=1, keysCnt=1, bytesRcvd=20719, bytesRcvd0=9509, 
>> bytesSent=0, bytesSent0=0]
    Connection info [in=false, 
rmtAddr=v-hp-lk-dcn02.xxxgroup.tek.loc/10.2.0.252:47100, 
locAddr=/10.2.0.250:49581, msgsSent=453, msgsAckedByRmt=448, 
descIdHash=105307846, unackedMsgs=[GridNearAtomicFullUpdateRequest, 
GridNearAtomicFullUpdateRequest, GridNearAtomicFullUpdateRequest, 
GridDhtPartitionsSingleMessage, IgniteDiagnosticMessage], msgsRcvd=5, 
lastAcked=0, descIdHash=105307846, bytesRcvd=20719, bytesRcvd0=9509, 
bytesSent=96889, bytesSent0=933, opQueueSize=0]

2018-06-11 13:51:57.370 [grid-timeout-worker-#4] DEBUG 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor - Timeout 
has occurred [obj=CancelableTask 
[id=ee9e57ee361-ba3899f9-a10b-4498-98e6-ed3b65dfc3f8, endTime=1528714317355, 
period=2000, cancel=false, 
task=org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$13@110ac52b],
 process=true]
-- CALL COMPLETED

83837380032 2018-06-11 13:52:21.903 [https-jsse-nio-8080-exec-4] DEBUG 
com.xxx.lk.backend.cache.impl.RefreshTokenCache - Cache operation completed

On Mon, Jun 11, 2018 at 12:27 PM, Stanislav Lukyanov <[email protected]> 
wrote:
Hi,
 
Try to configure the existing timeouts:
- failureDetectionTimeout
- clientFailureDetectionTimeout
- networkTimeout
 
Not sure if the Future you mention really needs timeout. If there was a 
timeout, it would be either failureDetectionTimeout or networkTimeout anyway.
Currently the calling thread has to wait for the failed node to be kicked out 
of the cluster. The time that the cluster needs to kick out a node depends 
(primarily) on the failureDetectionTimeout, although isn’t strictly equal to it.
 
The problem I can see here is that the putAsync is not actually async since it 
performs substantial work in the calling thread. I’ll give it another look and 
file a bug.
For now, as a workaround (aside from configuring the timeouts) I’d suggest to 
use an ExecutorService to call put()/putAsync(), getting a cancelable Future 
from the start.
 
Thanks,
Stan
 
From: Olexandr K
Sent: 11 июня 2018 г. 2:44
To: [email protected]
Subject: Cache operations hanging for a minute when one of server nodes goesdown
 
Hi Igniters,
 
I'm testing our system for availability.
It uses Ignite as key/value persistent cache.
 
Here is my test:
1) start 2 server and 2 client nodes
2) run heavy load on client nodes (some application logic which cause cache 
calls)
3) stop 1 server node
 
Here I expect all in-progress cache operations targeted to server 1 node to 
fail fast.
What I don't want is to hang all my processing threads for significant time. 
Unfortunately it works exactly that way: I'm constantly getting my threads 
blocked for 20-80 seconds. 
Finally putAsync() completes successfully but I'd prefer cache operation to 
fail fast. I don't want to hang all processing threads for a minute because of 
cache.
 
It works the same for put() and putAsync() calls.
 
As I see in the code, it can be fixed by calling future.get(timeout) instead of 
future.get() in TcpCommunicationSpi.
Timeout should be configurable.
 
TcpCommunicationSpi (line: 2799)
  private GridCommunicationClient reserveClient(ClusterNode node, int connIdx) {
    ...
    client = fut.get();
 
Does it make sense from your point of view?
 
Here is my thread  dump:
 
threads=[
  {
    threadName=https-jsse-nio-8080-exec-20,
    threadId=102,
    blockedTime=-1,
    blockedCount=0,
    waitedTime=-1,
    waitedCount=5,
    lockName=null,
    lockOwnerId=-1,
    lockOwnerName=null,
    inNative=false,
    suspended=false,
    threadState=WAITING,
    stackTrace=[
      {
        methodName=park,
        fileName=Unsafe.java,
        lineNumber=-2,
        className=sun.misc.Unsafe,
        nativeMethod=true
      },
      {
        methodName=park,
        fileName=LockSupport.java,
        lineNumber=304,
        className=java.util.concurrent.locks.LockSupport,
        nativeMethod=false
      },
      {
        methodName=get0,
        fileName=GridFutureAdapter.java,
        lineNumber=177,
        className=org.apache.ignite.internal.util.future.GridFutureAdapter,
        nativeMethod=false
      },
      {
        methodName=get,
        fileName=GridFutureAdapter.java,
        lineNumber=140,
        className=org.apache.ignite.internal.util.future.GridFutureAdapter,
        nativeMethod=false
      },
      {
        methodName=reserveClient,
        fileName=TcpCommunicationSpi.java,
        lineNumber=2799,
        className=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi,
        nativeMethod=false
      },
....
      {
        methodName=putAsync,
        fileName=IgniteCacheProxyImpl.java,
        lineNumber=1035,
        
className=org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl,
        nativeMethod=false
      },
      {
        methodName=putAsync,
        fileName=GatewayProtectedCacheProxy.java,
        lineNumber=900,
        
className=org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy,
        nativeMethod=false
      },
 
Sample cache config:
 
<bean class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="RefreshToken"/>
                    <property name="dataRegionName" 
value="auth_durable_region"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="writeSynchronizationMode" 
value="FULL_ASYNC"/>
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="backups" value="1"/>
                    <property name="eagerTtl" value="true"/> 
                </bean>
 
Ignite version: 2.4.0
OS: Windows Server 2012 R2
 
BR, Oleksandr
 


Reply via email to