Hi shahidv,

There are definite signs of network issues between the cluster nodes. To find out where the issue lies, could you please post the complete server logs?
>Failed to send message to next node [msg=TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, addrs=[10.174.92.75, 127.0.0.1, 192.168.0.27], sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500, /192.168.0.27:47500]
>[15:54:48,234][WARNING][disco-event-worker-#42][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, addrs=[10.174.92.75, 127.0.0.1, 192.168.0.27], sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500, /192.168.0.27:47500], discPort=47500, order=4, intOrder=3, lastExchangeTime=1562754278190, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=false]

Are all the ports mentioned in the logs open in all directions? Can you ping every one of the addresses involved?

On Wed, Jul 10, 2019 at 13:31, shahidv <[email protected]> wrote:

> node seems to be joined and failed , anyone please help,
>
> [15:54:48,224][WARNING][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Failed to
> send message to next node [msg=TcpDiscoveryNodeAddedMessage
> [node=TcpDiscoveryNode [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82,
> addrs=[10.174.92.75, 127.0.0.1, 192.168.0.27],
> sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500, /192.168.0.27:47500],
> discPort=47500, order=0, intOrder=3, lastExchangeTime=1562754278190,
> loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=false],
> dataPacket=o.a.i.spi.discovery.tcp.internal.DiscoveryDataPacket@4f27e93,
> discardMsgId=null, discardCustomMsgId=null, top=null, clientTop=null,
> gridStartTime=1562753684554, super=TcpDiscoveryAbstractMessage
> [sndNodeId=null, id=8f77e5bdb61-2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0,
> verifierNodeId=2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0, topVer=0,
> pendingIdx=0,
> failedNodes=null, isClient=false]], next=TcpDiscoveryNode
> [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, addrs=[10.174.92.75, 127.0.0.1,
> 192.168.0.27], sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500,
> /192.168.0.27:47500], discPort=47500, order=0, intOrder=3,
> lastExchangeTime=1562754278190, loc=false,
> ver=2.7.5#20190603-sha1:be4f2a15,
> isClient=false], errMsg=Failed to send message to next node
> [msg=TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode
> [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, addrs=[10.174.92.75, 127.0.0.1,
> 192.168.0.27], sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500,
> /192.168.0.27:47500], discPort=47500, order=0, intOrder=3,
> lastExchangeTime=1562754278190, loc=false,
> ver=2.7.5#20190603-sha1:be4f2a15,
> isClient=false],
> dataPacket=o.a.i.spi.discovery.tcp.internal.DiscoveryDataPacket@4f27e93,
> discardMsgId=null, discardCustomMsgId=null, top=null, clientTop=null,
> gridStartTime=1562753684554, super=TcpDiscoveryAbstractMessage
> [sndNodeId=null, id=8f77e5bdb61-2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0,
> verifierNodeId=2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0, topVer=0,
> pendingIdx=0,
> failedNodes=null, isClient=false]], next=ClusterNode
> [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, order=0, addr=[10.174.92.75,
> 127.0.0.1, 192.168.0.27], daemon=false]]]
> [15:54:48,227][INFO][disco-event-worker-#42][GridDiscoveryManager] Added
> new
> node to topology: TcpDiscoveryNode
> [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82,
> addrs=[10.174.92.75, 127.0.0.1, 192.168.0.27],
> sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500, /192.168.0.27:47500],
> discPort=47500, order=4, intOrder=3, lastExchangeTime=1562754278190,
> loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=false]
> [15:54:48,232][INFO][disco-event-worker-#42][GridDiscoveryManager] Topology
> snapshot [ver=4, locNode=2e7c59eb, servers=2, clients=0, state=ACTIVE,
> CPUs=16, offheap=8.0GB, heap=2.0GB]
> [15:54:48,232][INFO][disco-event-worker-#42][GridDiscoveryManager] ^--
> Baseline [id=0, size=1, online=1, offline=0]
> [15:54:48,234][INFO][exchange-worker-#43][time] Started exchange init
> [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0],
> mvccCrd=MvccCoordinator [nodeId=2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0,
> crdVer=1562753684555, topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=0]], mvccCrdChange=false, crd=true, evt=NODE_JOINED,
> evtNode=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, customEvt=null,
> allowMerge=true]
> [15:54:48,234][WARNING][disco-event-worker-#42][GridDiscoveryManager] Node
> FAILED: TcpDiscoveryNode [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82,
> addrs=[10.174.92.75, 127.0.0.1, 192.168.0.27],
> sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500, /192.168.0.27:47500],
> discPort=47500, order=4, intOrder=3, lastExchangeTime=1562754278190,
> loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=false]
> [15:54:48,236][INFO][exchange-worker-#43][GridDhtPartitionsExchangeFuture]
> Finished waiting for partition release future
> [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], waitTime=0ms,
> futInfo=NA, mode=DISTRIBUTED]
> [15:54:48,238][INFO][exchange-worker-#43][GridDhtPartitionsExchangeFuture]
> Finished waiting for partitions release latch: ServerLatch [permits=0,
> pendingAcks=[], super=CompletableLatch [id=exchange,
> topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0]]]
> [15:54:48,238][INFO][disco-event-worker-#42][GridDiscoveryManager] Topology
> snapshot [ver=5, locNode=2e7c59eb, servers=1, clients=0, state=ACTIVE,
> CPUs=8, offheap=4.0GB, heap=1.0GB]
> [15:54:48,238][INFO][exchange-worker-#43][GridDhtPartitionsExchangeFuture]
> Finished waiting for partition release future
> [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], waitTime=0ms,
> futInfo=NA, mode=LOCAL]
> [15:54:48,238][INFO][disco-event-worker-#42][GridDiscoveryManager] ^--
> Baseline [id=0, size=1, online=1, offline=0]
> [15:54:48,239][INFO][exchange-worker-#43][GridCacheDatabaseSharedManager]
> Logical recovery performed in 1 ms.
> [15:54:48,239][INFO][exchange-worker-#43][time] Finished exchange init
> [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], crd=true]
> [15:54:48,245][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Coordinator
> received all messages, try merge [ver=AffinityTopologyVersion [topVer=4,
> minorTopVer=0]]
> [15:54:48,246][INFO][sys-#113][GridCachePartitionExchangeManager] Merge
> exchange future [curFut=AffinityTopologyVersion [topVer=4, minorTopVer=0],
> mergedFut=AffinityTopologyVersion [topVer=5, minorTopVer=0],
> evt=NODE_FAILED, evtNode=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82,
> evtNodeClient=false]
> [15:54:48,246][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Exchanges
> merging performed in 0 ms.
> [15:54:48,246][INFO][sys-#113][GridDhtPartitionsExchangeFuture]
> finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=4,
> minorTopVer=0], resVer=AffinityTopologyVersion [topVer=5, minorTopVer=0]]
> [15:54:48,248][INFO][sys-#113][CacheAffinitySharedManager] Affinity
> recalculation (on server left) performed in 1 ms.
> [15:54:48,250][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Affinity
> changes (coordinator) applied in 3 ms.
> [15:54:48,250][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Partitions
> validation performed in 0 ms.
> [15:54:48,252][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Partitions
> assignment performed in 2 ms.
> [15:54:48,252][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Detecting
> lost partitions performed in 0 ms.
> [15:54:48,258][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Preparing
> Full Message performed in 5 ms.
> [15:54:48,258][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Sending
> Full
> Message to all nodes performed in 0 ms.
> [15:54:48,258][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Finish
> exchange future [startVer=AffinityTopologyVersion [topVer=4,
> minorTopVer=0],
> resVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], err=null]
> [15:54:48,259][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Detecting
> lost partitions performed in 1 ms.
> [15:54:48,260][INFO][sys-#113][GridDhtPartitionsExchangeFuture] Completed
> partition exchange [localNode=2e7c59eb-7ed7-4f8c-8497-1e29b3a3c2a0,
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
> [topVer=4, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
> [id=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82, addrs=[10.174.92.75, 127.0.0.1,
> 192.168.0.27], sockAddrs=[/10.174.92.75:47500, /127.0.0.1:47500,
> /192.168.0.27:47500], discPort=47500, order=4, intOrder=3,
> lastExchangeTime=1562754278190, loc=false,
> ver=2.7.5#20190603-sha1:be4f2a15,
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=5,
> minorTopVer=0], durationFromInit=22]
>
> [15:54:48,261][INFO][exchange-worker-#43][GridCachePartitionExchangeManager]
> Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
> [topVer=5, minorTopVer=0], force=false, evt=NODE_JOINED,
> node=bf63af3c-348f-4c8f-a9a9-83f1ccd08a82]
> [15:55:44,745][INFO][grid-timeout-worker-#23][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=2e7c59eb, uptime=00:11:00.071]
> ^-- H/N/C [hosts=1, nodes=1, CPUs=8]
> ^-- CPU [cur=-100%, avg=-100%, GC=0%]
> ^-- PageMemory [pages=15]
> ^-- Heap [used=109MB, free=89.35%, comm=1024MB]
> ^-- Off-heap [used=0MB, free=100%, comm=4396MB]
> ^-- sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^-- default region [used=0MB, free=100%, comm=4096MB]
> ^-- metastoreMemPlc region [used=0MB, free=99.95%, comm=100MB]
> ^-- TxLog region [used=0MB, free=100%, comm=100MB]
> ^-- Ignite persistence [used=0MB]
> ^-- sysMemPlc region [used=0MB]
> ^-- default region [used=0MB]
> ^-- metastoreMemPlc region [used=unknown]
> ^-- TxLog region [used=0MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=2, qSize=0]
> ^-- System thread pool [active=0, idle=8, qSize=0]
> [15:56:44,753][INFO][grid-timeout-worker-#23][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=2e7c59eb, uptime=00:12:00.072]
> ^-- H/N/C [hosts=1, nodes=1, CPUs=8]
> ^-- CPU [cur=-100%, avg=-100%, GC=0%]
> ^-- PageMemory [pages=15]
> ^-- Heap [used=111MB, free=89.16%, comm=1024MB]
> ^-- Off-heap [used=0MB, free=100%, comm=4396MB]
> ^-- sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^-- default region [used=0MB, free=100%, comm=4096MB]
> ^-- metastoreMemPlc region [used=0MB, free=99.95%, comm=100MB]
> ^-- TxLog region [used=0MB, free=100%, comm=100MB]
> ^-- Ignite persistence [used=0MB]
> ^-- sysMemPlc region [used=0MB]
> ^-- default region [used=0MB]
> ^-- metastoreMemPlc region [used=unknown]
> ^-- TxLog region [used=0MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=6, qSize=0]
> [15:57:44,756][INFO][grid-timeout-worker-#23][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=2e7c59eb, uptime=00:13:00.077]
> ^-- H/N/C [hosts=1, nodes=1, CPUs=8]
> ^-- CPU [cur=-100%, avg=-100%, GC=0%]
> ^-- PageMemory [pages=15]
> ^-- Heap [used=24MB, free=97.62%, comm=1024MB]
> ^-- Off-heap [used=0MB, free=100%, comm=4396MB]
> ^-- sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^-- default region [used=0MB, free=100%, comm=4096MB]
> ^-- metastoreMemPlc region [used=0MB, free=99.95%, comm=100MB]
> ^-- TxLog region [used=0MB, free=100%, comm=100MB]
> ^-- Ignite persistence [used=0MB]
> ^-- sysMemPlc region [used=0MB]
> ^-- default region [used=0MB]
> ^-- metastoreMemPlc region [used=unknown]
> ^-- TxLog region [used=0MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=6, qSize=0]
> [15:58:44,763][INFO][grid-timeout-worker-#23][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=2e7c59eb, uptime=00:14:00.079]
> ^-- H/N/C [hosts=1, nodes=1, CPUs=8]
> ^-- CPU [cur=-100%, avg=-100%, GC=0%]
> ^-- PageMemory [pages=15]
> ^-- Heap [used=27MB, free=97.33%, comm=1024MB]
> ^-- Off-heap [used=0MB, free=100%, comm=4396MB]
> ^-- sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^-- default region [used=0MB, free=100%, comm=4096MB]
> ^-- metastoreMemPlc region [used=0MB, free=99.95%, comm=100MB]
> ^-- TxLog region [used=0MB, free=100%, comm=100MB]
> ^-- Ignite persistence [used=0MB]
> ^-- sysMemPlc region [used=0MB]
> ^-- default region [used=0MB]
> ^-- metastoreMemPlc region [used=unknown]
> ^-- TxLog region [used=0MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=6, qSize=0]
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
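
P.S. In case it helps with the port question, here is a minimal Python sketch that tries a plain TCP connection to the discovery port 47500 on the addresses the failed node advertises in the log. It is not part of Ignite, and the function names `can_connect` and `check_endpoints` are made up for illustration. It only tests TCP reachability (it is not an ICMP ping), and 127.0.0.1 is left out of the example list because it always points at the local machine. The one-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_endpoints(endpoints, timeout=1.0):
    """Map each 'host:port' string to the result of can_connect."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in endpoints
    }


if __name__ == "__main__":
    # Addresses and discovery port taken from the log above.
    endpoints = [("10.174.92.75", 47500), ("192.168.0.27", 47500)]
    for endpoint, ok in check_endpoints(endpoints).items():
        print(f"{endpoint}: {'reachable' if ok else 'NOT reachable'}")
```

Running it on each server against the other server's addresses would show whether the discovery port is reachable in both directions. If one of the advertised addresses is unreachable from the peer, that would be consistent with the "Failed to send message to next node" warning.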
