[ 
https://issues.apache.org/jira/browse/IGNITE-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638436#comment-14638436
 ] 

Denis Magda commented on IGNITE-882:
------------------------------------

The error arose because a node that has started shutting down stops 
forwarding messages to its next node at some point. 

Fixed, reviewed by Yakov, merged to the main development branch.
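
For readers unfamiliar with ring-based discovery, the failure mode can be illustrated with a toy model. This is only a sketch under stated assumptions: the RingNode type, the stopping flag and the deliver method are hypothetical names invented for illustration, not Ignite classes, and the actual fix is the one in the attached patch. The sketch only shows that a message passed along a ring never completes the circle once one member stops forwarding, which is presumably why the joining node timed out and repeated the join.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Ignite classes.
final class RingForwardingSketch {
    /** Hypothetical ring member. */
    static final class RingNode {
        final String name;
        boolean stopping; // set once the node begins shutting down
        final List<String> seen = new ArrayList<>();

        RingNode(String name) {
            this.name = name;
        }
    }

    /** Passes msg along the ring in order; returns how many nodes received it. */
    static int deliver(List<RingNode> ring, String msg) {
        int received = 0;
        for (RingNode node : ring) {
            node.seen.add(msg);
            received++;
            if (node.stopping) {
                break; // a stopping node no longer forwards to its next node
            }
        }
        return received;
    }

    public static void main(String[] args) {
        List<RingNode> ring = new ArrayList<>();
        ring.add(new RingNode("A"));
        ring.add(new RingNode("B"));
        ring.add(new RingNode("C"));

        // Healthy ring: the message reaches every node.
        System.out.println(deliver(ring, "join-1") == ring.size());

        // B starts shutting down: C never sees the second message.
        ring.get(1).stopping = true;
        System.out.println(deliver(ring, "join-2") == ring.size());
    }
}
```

In this model the first delivery reaches all three nodes, while the second stops at B, so C's view of the ring diverges from A's.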

> Node can join twice with the same ID
> ------------------------------------
>
>                 Key: IGNITE-882
>                 URL: https://issues.apache.org/jira/browse/IGNITE-882
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>            Reporter: Semen Boikov
>            Assignee: Dmitriy Setrakyan
>            Priority: Critical
>             Fix For: sprint-7
>
>         Attachments: 882.patch
>
>
> Observed in the test 
> 'GridCacheColocatedFailoverSelfTest.testOptimisticRepeatableReadTxConstantTopologyChange':
> Node joined:
> {noformat}
> [15:53:24,163][INFO 
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
>  Added new node to topology: TcpDiscoveryNode 
> [id=10cf7906-50af-4f46-9c31-baf419539001, addrs=[127.0.0.1], 
> sockAddrs=[/127.0.0.1:47525], discPort=47525, order=400, intOrder=202, 
> loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Node failed:
> {noformat}
> [15:53:24,171][WARN 
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
>  Node FAILED: TcpDiscoveryNode [id=10cf7906-50af-4f46-9c31-baf419539001, 
> addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47525], discPort=47525, order=400, 
> intOrder=202, loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Then I see this message from the thread starting the new node:
> {noformat}
> [15:53:29,047][WARN ][topology-change-thread-1][TcpDiscoverySpi] Node has not 
> been connected to topology and will repeat join process. Check remote nodes 
> logs for possible error messages. Note that large topology may require 
> significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' 
> configuration property if getting this message on the starting nodes 
> [networkTimeout=5000]
> {noformat}
> Node joined again with the same ID:
> {noformat}
> [15:53:29,212][INFO 
> ][disco-event-worker-#121%dht.GridCacheColocatedFailoverSelfTest0%][GridDiscoveryManager]
>  Added new node to topology: TcpDiscoveryNode 
> [id=10cf7906-50af-4f46-9c31-baf419539001, addrs=[127.0.0.1], 
> sockAddrs=[/127.0.0.1:47525], discPort=47525, order=404, intOrder=205, 
> loc=false, ver=1.0.3#19700101-sha1:00000000, isClient=false]
> {noformat}
> Then the test hangs (in the log I see that the future mapped to node 
> '10cf7906-50af-4f46-9c31-baf419539001' did not finish).
> The same issue is observed in tests extending 
> GridCacheAbstractNodeRestartSelfTest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
