Kirill Tkalenko created IGNITE-12392: ----------------------------------------
Summary: Faster transaction rolled back when one of backup node failed Key: IGNITE-12392 URL: https://issues.apache.org/jira/browse/IGNITE-12392 Project: Ignite Issue Type: Improvement Reporter: Kirill Tkalenko Assignee: Kirill Tkalenko Fix For: 2.8 In case of massive prepared transactions roll back, when node fail, have a linearizable behavior: {noformat}2019-09-26 18:48:21.034[ERROR][sys-stripe-16-#17%DPL_GRID%DplGridNodeName%[o.a.i.s.c.tcp.TcpCommunicationSpi] Failed to send message to remote node [node=TcpDiscoveryNode [id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201], sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500, order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false, ver=2.5.1#20190327-sha1:6edfea1b, isClient=false], msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridCacheIdMessage [cacheId=0]GridDistributedBaseMessage [ver=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1], committedVers=EmptyList [], rolledbackVers=EmptyList [], cnt=0, super=]GridDistributedTxFinishRequest [topVer=AffinityTopologyVersion [topVer=524, minorTopVer=2], futId=fb44a686e61-9a074a8c-dca4-4444-84fe-e9a93818fbd2, threadId=2098, commitVer=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1], org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to send message (node left topology): TcpDiscoveryNode [id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201], sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500, order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false, ver=2.5.1#20190327-sha1:6edfea1b, isClient=false] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3276) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2998) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2878) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2721) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2680) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1643) at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1715) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1177) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:462) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:291) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:495) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.rollbackDhtLocalAsync(GridDhtTxLocal.java:571) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1005) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:876) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:832) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:101) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:193) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:748)\{noformat} It will exacerbate _exchange init_, because is happening waiting transaction completions. We should not send request to failed node and was not fall deeply in stack. -- This message was sent by Atlassian Jira (v8.3.4#803005)