[ https://issues.apache.org/jira/browse/IGNITE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Pavlov updated IGNITE-8785:
-----------------------------------
Fix Version/s: 2.7 (was: 2.6)
> Node may hang indefinitely in CONNECTING state during cluster segmentation
> --------------------------------------------------------------------------
>
> Key: IGNITE-8785
> URL: https://issues.apache.org/jira/browse/IGNITE-8785
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.5
> Reporter: Pavel Kovalenko
> Priority: Major
> Fix For: 2.7
>
>
> Affected test:
> org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest#testTopologyValidatorWithCacheGroup
> The node hangs with the following stack trace:
> {noformat}
> "grid-starter-testTopologyValidatorWithCacheGroup-22" #117619 prio=5 os_prio=0 tid=0x00007f17dd19b800 nid=0x304a in Object.wait() [0x00007f16b19df000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:931)
> - locked <0x0000000705ee4a60> (a java.lang.Object)
> at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373)
> at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948)
> at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915)
> at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1739)
> at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1046)
> at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
> at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
> - locked <0x0000000705995ec0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:649)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:882)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:845)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:833)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:799)
> at org.apache.ignite.testframework.junits.GridAbstractTest$3.call(GridAbstractTest.java:742)
> at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
> {noformat}
> It seems that the node never receives an acknowledgment from the coordinator.
> The following warning was logged before the hang:
> {noformat}
> [org.apache.ignite:ignite-core] [2018-06-10 04:59:18,876][WARN ][grid-starter-testTopologyValidatorWithCacheGroup-22][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
> {noformat}
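To illustrate the pattern described above, here is a minimal, self-contained sketch of a join loop that waits on a monitor in rounds (as the TIMED_WAITING frame in joinTopology suggests) but stops once an overall deadline passes. This is not the Ignite source: the class BoundedJoinSketch, the onJoinAck hook, and the joinTimeoutMs parameter are hypothetical names chosen for illustration. Only the networkTimeout value comes from the log above.

```java
final class BoundedJoinSketch {
    /** Simplified join states; the names follow the issue title, not Ignite's internals. */
    enum State { CONNECTING, CONNECTED }

    private final Object mux = new Object();
    private State state = State.CONNECTING;

    /** Hypothetical hook: invoked when the coordinator's acknowledgment arrives. */
    void onJoinAck() {
        synchronized (mux) {
            state = State.CONNECTED;
            mux.notifyAll();
        }
    }

    /**
     * Waits for the acknowledgment in rounds of at most networkTimeoutMs,
     * giving up once joinTimeoutMs has elapsed overall.
     *
     * @return true if the node reached CONNECTED, false on timeout or interrupt.
     */
    boolean join(long networkTimeoutMs, long joinTimeoutMs) {
        if (networkTimeoutMs <= 0 || joinTimeoutMs <= 0)
            throw new IllegalArgumentException("Timeouts must be positive");

        long deadline = System.nanoTime() + joinTimeoutMs * 1_000_000L;

        synchronized (mux) {
            while (state == State.CONNECTING) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;

                if (remainingMs <= 0)
                    return false; // Overall deadline hit: fail instead of retrying forever.

                try {
                    // A real node would resend its join request at the start of each round.
                    mux.wait(Math.min(networkTimeoutMs, remainingMs));
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }

            return true;
        }
    }

    public static void main(String[] args) {
        BoundedJoinSketch node = new BoundedJoinSketch();

        // No acknowledgment is ever delivered, so the call gives up after about 200 ms.
        System.out.println("joined=" + node.join(50, 200));
    }
}
```

Without the deadline check, a loop of this shape would repeat the wait indefinitely when the acknowledgment is lost, which matches the symptom reported here. Whether a bounded join is the right fix for this issue is an open question; the sketch only shows the difference between the two behaviours.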