Re: Ignite 2.8.1 - Remote node does not observe current node in topology

Ilya Kasnacheev Wed, 10 Feb 2021 02:12:36 -0800

Hello!

These two client nodes were removed from the cluster because they did not
send metrics updates in time:
[15:07:16,330][WARNING][tcp-disco-msg-worker-[11cf0c06 10.212.120.71:57500
crd]-#2%hh_DynamicGrid_v2%][TcpDiscoverySpi] Failing client node due to not
receiving metrics updates from client node within
'IgniteConfiguration.clientFailureDetectionTimeout' (consider increasing
configuration property) [timeout=120000, node=TcpDiscoveryNode
[id=9dbcfb86-a60e-4382-904f-57bffbe18c5c,consistentId=73B5811B-9644-48FD-A533-B4609FDAD591,
addrs=ArrayList [10.212.120.190], sockAddrs=HashSet [
VWNV02AX07080.HH.com/10.212.120.190:0], discPort=0, order=488,
intOrder=248, lastExchangeTime=1612397142960, loc=false,
ver=2.8.1#20200521-sha1:86422096, isClient=true]]
[15:07:16,331][WARNING][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=9dbcfb86-a60e-4382-904f-57bffbe18c5c,
consistentId=73B5811B-9644-48FD-A533-B4609FDAD591, addrs=ArrayList
[10.212.120.190], sockAddrs=HashSet [VWNV02AX07080.HH.com/10.212.120.190:0],
discPort=0, order=488, intOrder=248, lastExchangeTime=1612397142960,
loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=true]
[15:07:16,332][INFO][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
Topology snapshot [ver=499, locNode=83fd7c70, servers=3, clients=14,
state=ACTIVE, CPUs=204, offheap=54.0GB, heap=120.0GB]
[15:07:16,332][INFO][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
  ^-- Baseline [id=0, size=3, online=3, offline=0]
[15:07:16,332][WARNING][tcp-disco-msg-worker-[11cf0c06 10.212.120.71:57500
crd]-#2%hh_DynamicGrid_v2%][TcpDiscoverySpi] Failing client node due to not
receiving metrics updates from client node within
'IgniteConfiguration.clientFailureDetectionTimeout' (consider increasing
configuration property) [timeout=120000, node=TcpDiscoveryNode
[id=8d51aa56-b67e-4d4c-a9ba-5c68699e6a47,consistentId=8FFFBE22-239A-4442-91FA-947EFE1207C0,
addrs=ArrayList [10.212.120.187], sockAddrs=HashSet [
VWNV02AX07077.HH.com/10.212.120.187:0], discPort=0, order=493,
intOrder=253, lastExchangeTime=1612397256739, loc=false,
ver=2.8.1#20200521-sha1:86422096, isClient=true]]
[15:07:16,333][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][time] Started
exchange init [topVer=AffinityTopologyVersion [topVer=499, minorTopVer=0],
crd=true, evt=NODE_FAILED, evtNode=9dbcfb86-a60e-4382-904f-57bffbe18c5c,
customEvt=null, allowMerge=true, exchangeFreeSwitch=false]
[15:07:16,334][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=499,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=499, minorTopVer=0],
err=null, rebalanced=true, wasRebalanced=true]
[15:07:16,336][WARNING][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=8d51aa56-b67e-4d4c-a9ba-5c68699e6a47,
consistentId=8FFFBE22-239A-4442-91FA-947EFE1207C0, addrs=ArrayList
[10.212.120.187], sockAddrs=HashSet [VWNV02AX07077.HH.com/10.212.120.187:0],
discPort=0, order=493, intOrder=253, lastExchangeTime=1612397256739,
loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=true]
[15:07:16,337][INFO][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
Topology snapshot [ver=500, locNode=83fd7c70, servers=3, clients=13,
state=ACTIVE, CPUs=192, offheap=54.0GB, heap=120.0GB]
[15:07:16,337][INFO][disco-event-worker-#62%hh_DynamicGrid_v2%][GridDiscoveryManager]
  ^-- Baseline [id=0, size=3, online=3, offline=0]
[15:07:16,341][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Completed partition exchange
[localNode=83fd7c70-839d-46ca-969f-bbb9661d6ab2,
exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=499, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode
[id=9dbcfb86-a60e-4382-904f-57bffbe18c5c,
consistentId=73B5811B-9644-48FD-A533-B4609FDAD591, addrs=ArrayList
[10.212.120.190], sockAddrs=HashSet [VWNV02AX07080.HH.com/10.212.120.190:0],
discPort=0, order=488, intOrder=248, lastExchangeTime=1612397142960,
loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=true], done=true,
newCrdFut=null], topVer=AffinityTopologyVersion [topVer=499,
minorTopVer=0]]
[15:07:16,341][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Exchange timings [startVer=AffinityTopologyVersion [topVer=499,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=499, minorTopVer=0],
stage="Waiting in exchange queue" (0 ms), stage="Exchange parameters
initialization" (0 ms), stage="Determine exchange type" (1 ms),
stage="Exchange done" (6 ms), stage="Total time" (7 ms)]
[15:07:16,341][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Exchange longest local stages [startVer=AffinityTopologyVersion
[topVer=499, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=499,
minorTopVer=0]]
[15:07:16,341][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][time] Finished
exchange init [topVer=AffinityTopologyVersion [topVer=499, minorTopVer=0],
crd=true]
[15:07:16,363][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=499, minorTopVer=0], force=false, evt=NODE_FAILED,
node=9dbcfb86-a60e-4382-904f-57bffbe18c5c]
[15:07:16,363][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][time] Started
exchange init [topVer=AffinityTopologyVersion [topVer=500, minorTopVer=0],
crd=true, evt=NODE_FAILED, evtNode=8d51aa56-b67e-4d4c-a9ba-5c68699e6a47,
customEvt=null, allowMerge=true, exchangeFreeSwitch=false]
[15:07:16,364][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=500,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=500, minorTopVer=0],
err=null, rebalanced=true, wasRebalanced=true]
[15:07:16,372][INFO][exchange-worker-#63%hh_DynamicGrid_v2%][GridDhtPartitionsExchangeFuture]
Completed partition exchange
[localNode=83fd7c70-839d-46ca-969f-bbb9661d6ab2,
exchange=GridDhtPartitionsExchangeFuture
[topVer=AffinityTopologyVersion [topVer=500, minorTopVer=0],
evt=NODE_FAILED, evtNode=TcpDiscoveryNode
[id=8d51aa56-b67e-4d4c-a9ba-5c68699e6a47,
consistentId=8FFFBE22-239A-4442-91FA-947EFE1207C0, addrs=ArrayList
[10.212.120.187], sockAddrs=HashSet [VWNV02AX07077.HH.com/10.212.120.187:0],
discPort=0, order=493, intOrder=253, lastExchangeTime=1612397256739,
loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=true], done=true,
newCrdFut=null], topVer=AffinityTopologyVersion [topVer=500, minorTopVer=0]]


I'm not sure why these nodes failed to detect that they were no longer in
the cluster. Network issues may have caused it. Normally this should not
happen: client nodes should detect the failure and try to reconnect.
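
To illustrate the kind of check behind the warning in the log, here is a
minimal sketch of a timeout-based watchdog: it remembers when each client last
sent a metrics update and fails any client whose last update is older than the
timeout. The class and method names are hypothetical and this is a simplified
model, not Ignite's actual implementation; only the 120000 ms value and the
node id prefix are taken from the log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of a timeout-based client failure check. Not Ignite code. */
final class ClientMetricsWatchdog {
    /** Maximum allowed gap between metrics updates, in milliseconds. */
    private final long timeoutMs;

    /** Time of the last metrics update seen from each tracked client. */
    private final Map<String, Long> lastUpdateMs = new HashMap<>();

    ClientMetricsWatchdog(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Records that a metrics update from the given client arrived at time nowMs. */
    void onMetricsUpdate(String clientId, long nowMs) {
        lastUpdateMs.put(clientId, nowMs);
    }

    /** Stops tracking and returns every client whose last update is older than the timeout. */
    List<String> failStaleClients(long nowMs) {
        List<String> failed = new ArrayList<>();
        lastUpdateMs.entrySet().removeIf(e -> {
            boolean stale = nowMs - e.getValue() > timeoutMs;
            if (stale) {
                failed.add(e.getKey());
            }
            return stale;
        });
        return failed;
    }

    /** Returns true while the client is still considered part of the topology. */
    boolean isTracked(String clientId) {
        return lastUpdateMs.containsKey(clientId);
    }

    public static void main(String[] args) {
        // timeout=120000, as reported in the warning
        ClientMetricsWatchdog watchdog = new ClientMetricsWatchdog(120_000L);

        watchdog.onMetricsUpdate("9dbcfb86", 0L);        // goes silent afterwards
        watchdog.onMetricsUpdate("healthy-client", 0L);
        watchdog.onMetricsUpdate("healthy-client", 100_000L);

        // At t=130000 ms only the silent client has exceeded the timeout.
        System.out.println(watchdog.failStaleClients(130_000L)); // prints [9dbcfb86]
    }
}
```

In this model, a client that is alive but cannot deliver its updates in time
(for example because of a network problem) is removed in the same way as a
crashed one, which is why the log message suggests increasing
'IgniteConfiguration.clientFailureDetectionTimeout'.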

Regards,
-- 
Ilya Kasnacheev


On Wed, 10 Feb 2021 at 09:17, Charlin S <[email protected]> wrote:

> Hi,
> Please find log files here as requested.
> Note: The client node is connected to two different grids (Replicated Mode
> and Partition Mode). We are facing issues with the Partition Mode grid.
>
> Thanks & Regards,
> Charlin
>
>
> On Fri, 5 Feb 2021 at 16:18, Ilya Kasnacheev <[email protected]>
> wrote:
>
>> Hello!
>>
>> Please share a complete log from both nodes. I expect that one of the
>> nodes was previously segmented.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Fri, 5 Feb 2021 at 12:41, Charlin S <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'm running an ASP.NET application with Ignite 2.8.1 and am seeing the
>>> error below in the Ignite client log; my web site stopped working.
>>> It went back to normal after I restarted the application pool.
>>>
>>> [15:07:42,689][SEVERE][Thread-219][TcpCommunicationSpi] Failed to send
>>> message to remote node [node=TcpDiscoveryNode
>>> [id=83fd7c70-839d-46ca-969f-bbb9661d6ab2, consistentId=127.1.1.1:57500,
>>> addrs=ArrayList [127.1.1.1], sockAddrs=HashSet [test.com/127.1.1.1:57500],
>>> discPort=57500, order=1, intOrder=1, lastExchangeTime=1612397256785,
>>> loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=false],
>>> msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false,
>>> timeout=0, skipOnTimeout=false, msg=GridNearAtomicFullUpdateRequest
>>> [keys=ArrayList [UserKeyCacheObjectImpl [part=292,
>>> val=TestModel:TEST|bbf4da4d-c3d7-4b46-98b6-0de70c30f668,
>>> hasValBytes=true]], conflictTtls=null, conflictExpireTimes=null,
>>> expiryPlc=org.apache.ignite.internal.processors.platform.cache.expiry.PlatformExpiryPolicy@3fb1b76e,
>>> initSize=1, filter=null, parent=GridNearAtomicAbstractUpdateRequest
>>> [res=null, flags=keepBinary]]]]
>>> class
>>> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Remote
>>> node does not observe current node in topology :
>>> 83fd7c70-839d-46ca-969f-bbb9661d6ab2
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3622)
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
>>> at
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
>>> at
>>> org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
>>> at
>>> org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
>>> at
>>> org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
>>> at
>>> org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:312)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:486)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:446)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:249)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1164)
>>> at
>>> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:624)
>>> at
>>> org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2580)
>>> at
>>> org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2557)
>>> at
>>> org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1299)
>>> at
>>> org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:856)
>>> at
>>> org.apache.ignite.internal.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:432)
>>> at
>>> org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:67)
>>>
>>> It works fine after the website is restarted.
>>>
>>> Thanks & Regards,
>>> Charlin
>>>
>>
