Kirill Tkalenko created IGNITE-12392:
----------------------------------------
Summary: Faster transaction rollback when one of the backup nodes
fails
Key: IGNITE-12392
URL: https://issues.apache.org/jira/browse/IGNITE-12392
Project: Ignite
Issue Type: Improvement
Reporter: Kirill Tkalenko
Assignee: Kirill Tkalenko
Fix For: 2.8
When a node fails, a mass rollback of prepared transactions behaves
linearly: each rollback still tries to send its finish request to the node
that has left and fails with an error such as:
{noformat}
2019-09-26 18:48:21.034[ERROR][sys-stripe-16-#17%DPL_GRID%DplGridNodeName%[o.a.i.s.c.tcp.TcpCommunicationSpi]
Failed to send message to remote node [node=TcpDiscoveryNode
[id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201],
sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500,
order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false,
ver=2.5.1#20190327-sha1:6edfea1b, isClient=false], msg=GridIoMessage [plc=2,
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false,
msg=GridCacheIdMessage [cacheId=0]GridDistributedBaseMessage
[ver=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1],
committedVers=EmptyList [], rolledbackVers=EmptyList [], cnt=0,
super=]GridDistributedTxFinishRequest [topVer=AffinityTopologyVersion
[topVer=524, minorTopVer=2],
futId=fb44a686e61-9a074a8c-dca4-4444-84fe-e9a93818fbd2, threadId=2098,
commitVer=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1],
org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to
send message (node left topology): TcpDiscoveryNode
[id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201],
sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500,
order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false,
ver=2.5.1#20190327-sha1:6edfea1b, isClient=false]
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3276)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2998)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2878)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2721)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2680)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1643)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1715)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1177)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:462)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:291)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:495)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.rollbackDhtLocalAsync(GridDhtTxLocal.java:571)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1005)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:876)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:832)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:101)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:193)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
    at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
This delays _exchange init_, because the exchange waits for the
transactions to complete.
We should not send requests to a failed node, and should avoid descending deep into the call stack only to fail there.
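
A minimal sketch of that idea, not Ignite code: before the finish (rollback) request is fanned out to the backups, each target is checked against a liveness predicate, and nodes already known to have left are skipped instead of failing deep inside the send path. The names `RollbackFanOut`, `Sender`, `Result` and `sendFinish` are hypothetical and exist only for this illustration; the real fix would have to use Ignite's own discovery and communication classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;

/** Hypothetical model of a finish-request fan-out; not part of Ignite. */
public class RollbackFanOut {
    /** Hypothetical stand-in for the communication layer. */
    interface Sender {
        void send(UUID nodeId, String msg) throws Exception;
    }

    /** Records which backups received the request and which were skipped. */
    static final class Result {
        final List<UUID> sent = new ArrayList<>();
        final List<UUID> skipped = new ArrayList<>();
    }

    /**
     * Sends a rollback request to every backup that is still alive.
     * Nodes that already left are skipped without touching the sender.
     */
    static Result sendFinish(List<UUID> backups, Predicate<UUID> alive, Sender sender) {
        Result res = new Result();

        for (UUID id : backups) {
            // Cheap check up front: no send attempt for a node known to be gone.
            if (!alive.test(id)) {
                res.skipped.add(id);
                continue;
            }

            try {
                sender.send(id, "rollback");
                res.sent.add(id);
            }
            catch (Exception e) {
                // The node can still leave between the check and the send.
                res.skipped.add(id);
            }
        }

        return res;
    }

    public static void main(String[] args) {
        UUID aliveNode = UUID.randomUUID();
        UUID failedNode = UUID.randomUUID();

        Set<UUID> topology = Collections.singleton(aliveNode);
        List<UUID> attempted = new ArrayList<>();

        Result res = sendFinish(
            Arrays.asList(aliveNode, failedNode),
            topology::contains,
            (id, msg) -> attempted.add(id));

        System.out.println("sent=" + res.sent + ", skipped=" + res.skipped
            + ", send attempts=" + attempted.size());
    }
}
```

The liveness check does not remove the need to handle send failures, since a node can leave right after the check; it only makes the common case after a node failure cheap.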