Vaibhav Shrivastava created IGNITE-23475:
--------------------------------------------

             Summary: Ignite node gets stuck on the IgniteCache.remove method 
for a long time
                 Key: IGNITE-23475
                 URL: https://issues.apache.org/jira/browse/IGNITE-23475
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.15
            Reporter: Vaibhav Shrivastava


Hello Team
We are facing one issue with ignite 2.15 

There is a node lets say X in the ignite cluster which established the outgoing 
communication connection with ignite server node Y at 4:35:18 

As per the logs of X below

 
{noformat}
04:35:18:703657 | class 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {INFO} Action 
{Established outgoing communication connection [locAddr=/x.x.x.x:58454, 
rmtAddr=Y/y.y.y.y:48102]} Thread 
{grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%}{noformat}
Now at 05:46:18, X tried to update the Ignite Cache synchronously using 
IgniteCache.remove method

 

But X receive warnings after around 70 seconds of IgniteCache.remove method 

 
{noformat}
05:47:29:776422|0153-00271:Tracer_Ignite 
[org.apache.ignite.internal.diagnostic]: {WARN} Action {First 10 long running 
transactions [total=1]} Thread {sys-#103%WebCluster%}
05:47:29:776576|6956-00271:Tracer_Ignite 
[org.apache.ignite.internal.diagnostic]: {WARN} Action {>>> Transaction 
[startTime=05:46:18,514, curTime=05:47:29,763, systemTime=71248, userTime=1, 
tx=GridNearTxLocal [mappings=IgniteTxMappingsSingleImpl 
[mapping=GridDistributedTxMapping [entries=LinkedHashSet [IgniteTxEntry 
[txKey=IgniteTxKey [key=com.host1.ignite........... thread=<failed to find 
active thread 47> 

05:47:29:786789|0851-00269:Tracer_Ignite 
[org.apache.ignite.internal.diagnostic]: {INFO} Action 
{GridNearOptimisticTxPrepareFuture waiting for local keys lock 
[node=60cc9b4f-ca79-4124-89b3-0dc8caa5e7f4, topVer=AffinityTopologyVersion 
[topVer=13, minorTopVer=0], allKeysAdded=true, keys=GridConcurrentHashSet 
[elements=KeySetView [IgniteTxKey [key=com.X.caches.key [idHash=401413886, 
hash=-1793151138, sessionComponentName=temp_name], cacheId=1968825002]]]]

{noformat}
X continued to get these above warning 

 
{noformat}
05:49:29:773714|0152-00202:Tracer_Ignite 
[org.apache.ignite.internal.diagnostic]: {WARN} Action {First 10 long running 
transactions [total=1]} Thread {sys-#83%WebCluster%}{noformat}
These warnings are coming after every 2mins

 

 

 
{noformat}
05:55:29:805464|0883-00119:Tracer_Ignite 
[org.apache.ignite.internal.diagnostic]: {INFO} Action 
{GridNearOptimisticTxPrepareFuture waiting for local keys lock 
[node=60cc9b4f-ca79-4124-89b3-0dc8caa5e7f4, topVer=AffinityTopologyVersion 
[topVer=13, minorTopVer=0], allKeysAdded=true, keys=GridConcurrentHashSet 
[elements=KeySetView [IgniteTxKey [key=com.X.caches.key [idHash=401413886, 
hash=-1793151138], cacheId=1968825002]]]]{noformat}
 

 

At 06:02:10 on the X we receive the exception of connection timed out

 

 
{noformat}
06:02:10:333953|2921-00117:Tracer_Ignite [class 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {ERROR} Action 
{Failed to process selector key [ses=GridSelectorNioSessionImpl 
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, 
bytesRcvd=180, bytesSent=32428, bytesRcvd0=0, bytesSent0=0, select=true, 
super=GridWorker [name=grid-nio-worker-tcp-comm-1, 
igniteInstanceName=TcpCommunicationSpi, finished=false, 
heartbeatTs=1729054929329, hashCode=909604085, interrupted=false, 
runner=grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%]]], 
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=null, 
outRecovery=GridNioRecoveryDescriptor [acked=150, resendCnt=0, rcvCnt=0, 
sentCnt=151, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
[id=d20abb30-8118-4066-a13e-01ad3abc1548, consistentId=host1,127.0.0.1:48500, 
addrs=ArrayList [X, 127.0.0.1], sockAddrs=null, discPort=48500, order=1, 
intOrder=1, lastExchangeTime=1729035688365, loc=false, 
ver=2.15.0#20230425-sha1:f98f7f35, isClient=false], connected=false, 
connectCnt=2, queueLimit=4096, reserveCnt=2, pairedConnections=true], 
closeSocket=true, 
outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1,
 super=GridNioSessionImpl [locAddr=/X:58454, rmtAddr=Y/y.y.y.y:48102, 
createTime=1729049718699, closeTime=0, bytesSent=4431, bytesRcvd=20, 
bytesSent0=0, bytesRcvd0=0, sndSchedTime=1729054918122, 
lastSndTime=1729053978514, lastRcvTime=1729054918122, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter 
[parser=o.a.i.i.util.nio.GridDirectParser@6536b6c, directMode=true], 
GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]]]} 
Thread {grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%}
java.io.IOException: Connection timed out
    at java.base/sun.nio.ch.SocketDispatcher.read0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
    at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:330)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:284)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:259)
    at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:417)
    at 
org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1351)
    at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2526)
    at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2281)
    at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
    at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
    at java.base/java.lang.Thread.run(Thread.java:840)

06:02:10:335005|0364-00117:Tracer_Ignite [class 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi]: {WARN} Action 
{Client disconnected abruptly due to network connection loss or because the 
connection was left open on application shutdown. [cls=class 
o.a.i.i.util.nio.GridNioException, msg=Connection timed out]} Thread 
{grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%} {noformat}
 

At ignite server side we received the connection closed error


{noformat}
06:02:10,377181 [53] DEBUG TcpCommunicationSpi(Ignite) - Session was closed but 
there are unacknowledged messages, will try to reconnect 
[rmtNode=60cc9b4f-ca79-4124-89b3-0dc8caa5e7f4]{noformat}

After this timeout a new outgoing connection got stablished and things started 
working fine 

 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to