[ https://issues.apache.org/jira/browse/IGNITE-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638436#comment-14638436 ]
Denis Magda commented on IGNITE-882:
------------------------------------

The error arose because a node that has started shutting down stops forwarding messages to its next node at some point.
Fixed, reviewed by Yakov, and merged to the main development branch.

> Node can join twice with the same ID
> ------------------------------------
>
> Key: IGNITE-882
> URL: https://issues.apache.org/jira/browse/IGNITE-882
> Project: Ignite
> Issue Type: Bug
> Components: general
> Reporter: Semen Boikov
> Assignee: Dmitriy Setrakyan
> Priority: Critical
> Fix For: sprint-7
>
> Attachments: 882.patch
>
>
> Observed in the test
> 'GridCacheColocatedFailoverSelfTest.testOptimisticRepeatableReadTxConstantTopologyChange':
> Node joined:
> {noformat}
> [15:53:24,163][INFO
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
> Added new node to topology: TcpDiscoveryNode
> [id=10cf7906-50af-4f46-9c31-baf419539001, addrs=[127.0.0.1],
> sockAddrs=[/127.0.0.1:47525], discPort=47525, order=400, intOrder=202,
> loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Node failed:
> {noformat}
> [15:53:24,171][WARN
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=10cf7906-50af-4f46-9c31-baf419539001,
> addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47525], discPort=47525, order=400,
> intOrder=202, loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Then I see this message from the thread starting the new node:
> {noformat}
> [15:53:29,047][WARN ][topology-change-thread-1][TcpDiscoverySpi] Node has not
> been connected to topology and will repeat join process. Check remote nodes
> logs for possible error messages. Note that large topology may require
> significant time to start. Increase 'TcpDiscoverySpi.networkTimeout'
> configuration property if getting this message on the starting nodes
> [networkTimeout=5000]
> {noformat}
> Node joined again with the same ID:
> {noformat}
> [15:53:29,212][INFO
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
> Added new node to topology: TcpDiscoveryNode
> [id=10cf7906-50af-4f46-9c31-baf419539001, addrs=[127.0.0.1],
> sockAddrs=[/127.0.0.1:47525], discPort=47525, order=404, intOrder=205,
> loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Then the test hangs (in the log I see that the future mapped on the node
> '10cf7906-50af-4f46-9c31-baf419539001' did not finish).
> The same issue is observed in tests extending
> GridCacheAbstractNodeRestartSelfTest.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)