Semyon Danilov created IGNITE-14448:
---------------------------------------
Summary: Failure to connect to node leads to hanging connection future if paired connections are used
Key: IGNITE-14448
URL: https://issues.apache.org/jira/browse/IGNITE-14448
Project: Ignite
Issue Type: Bug
Components: networking
Affects Versions: 2.10
Reporter: Semyon Danilov
Assignee: Semyon Danilov
{{if ((CommunicationSpi<?>)spi instanceof TcpCommunicationSpi)
    getTcpCommunicationSpi().setConnectionRequestor(invConnHandler);
if (connRequestor != null) {
    ...
    if (isPairedConnection(node, tcpCommSpi))
        throw new IgniteSpiException("Inverse connection protocol doesn't support paired connections");}}
It turns out this exception is not handled properly, so the connection future is
never completed. Striped pool threads then wait forever in reserveClient() and
the cluster grinds to a halt.
This happens in versions that include communication-via-discovery, when
usePairedConnections=true.
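The hang described above can be illustrated with a minimal, self-contained sketch. It is not the Ignite code: the class HangingFutureSketch, the methods reserveBuggy, reserveFixed and requestInverseConnection, and the pending map are all hypothetical, and IllegalStateException stands in for IgniteSpiException so the example needs only the JDK. It assumes the failure mode is a registered future that is neither completed nor removed when the connection request throws, and shows one possible way to avoid that.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HangingFutureSketch {
    /** Hypothetical stand-in for the inverse connection request; rejects paired connections. */
    public static void requestInverseConnection(boolean paired) {
        if (paired)
            throw new IllegalStateException("Inverse connection protocol doesn't support paired connections");
    }

    /** Buggy shape: if the request throws, the registered future is neither completed nor removed. */
    public static String reserveBuggy(ConcurrentMap<String, CompletableFuture<String>> pending,
        String nodeId, boolean paired, long waitMs) throws Exception {
        CompletableFuture<String> fut = new CompletableFuture<>();
        CompletableFuture<String> old = pending.putIfAbsent(nodeId, fut);

        if (old != null) // Someone else is already connecting: wait for their result.
            return old.get(waitMs, TimeUnit.MILLISECONDS); // The timeout only keeps this demo finite.

        requestInverseConnection(paired); // Throws here, leaving fut incomplete in the map.
        fut.complete("client-" + nodeId);
        pending.remove(nodeId, fut);

        return fut.get();
    }

    /** Possible fix: always complete the future (exceptionally on failure) and unregister it. */
    public static String reserveFixed(ConcurrentMap<String, CompletableFuture<String>> pending,
        String nodeId, boolean paired, long waitMs) throws Exception {
        CompletableFuture<String> fut = new CompletableFuture<>();
        CompletableFuture<String> old = pending.putIfAbsent(nodeId, fut);

        if (old != null)
            return old.get(waitMs, TimeUnit.MILLISECONDS);

        try {
            requestInverseConnection(paired);
            fut.complete("client-" + nodeId);
        }
        catch (RuntimeException e) {
            fut.completeExceptionally(e); // Wakes up every waiter with the failure.
            throw e;
        }
        finally {
            pending.remove(nodeId, fut); // Later callers start a fresh attempt.
        }

        return fut.get();
    }

    public static void main(String[] args) throws Exception {
        ConcurrentMap<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

        try {
            reserveBuggy(pending, "node-1", true, 200);
        }
        catch (IllegalStateException e) {
            System.out.println("first caller failed: " + e.getMessage());
        }

        try {
            reserveBuggy(pending, "node-1", true, 200);
        }
        catch (TimeoutException e) {
            System.out.println("second caller would hang: future was never completed");
        }
    }
}
```

In the buggy variant only the first caller sees the exception; every later caller finds the stale future and blocks on it. The fixed variant propagates the failure to waiters and clears the entry, so a later call fails fast instead of waiting.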
{{[12:06:18,110][SEVERE][sys-stripe-0-#1][TcpCommunicationSpi] Failed to send
message to remote node [node=TcpDiscoveryNode
[id=54ddcf8b-3e41-4efe-bb9d-8a0369e7b893,
consistentId=54ddcf8b-3e41-4efe-bb9d-8a0369e7b893,
addrs=ArrayList [127.0.0.1, 172.22.229.21],
sockAddrs=HashSet [/127.0.0.1:0,
ip-172-22-229-21.ec2.internal/172.22.229.21:0], discPort=0, order=47,
intOrder=47, lastExchangeTime=1603983940522, loc=false,
ver=8.7.25#20200910-sha1:b580d9fd, isClient=true], msg=GridIoMessage [plc=2,
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicSingleUpdateRequest [key=KeyCacheObjectImpl [part=24,
val=23576, hasValBytes=true], val=com.dream11.ignite.model.GetRoundSummaryRes
[idHash=69226443, hash=580815760,roundId=23576, dataSource=MYSQL,
sparkJobStatus=COMPLETED], prevVal=null,
super=GridDhtAtomicAbstractUpdateRequest [onRes=false, nearNodeId=null,
nearFutId=0, flags=near]], connIdx=-1]]
class org.apache.ignite.spi.IgniteSpiException: Inverse connection protocol doesn't support paired connections
at org.apache.ignite.internal.managers.communication.GridIoManager$TcpCommunicationInverseConnectionHandler.request(GridIoManager.java:3564)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.handleUnreachableNodeException(ConnectionClientPool.java:365)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:256)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1132)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1083)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1814)
at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1930)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture}}