[ 
https://issues.apache.org/jira/browse/IGNITE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandr Kuramshin reassigned IGNITE-6700:
------------------------------------------

    Assignee: Alexandr Kuramshin  (was: Semen Boikov)

> Node considered as failed can cause failure of other nodes
> ----------------------------------------------------------
>
>                 Key: IGNITE-6700
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6700
>             Project: Ignite
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>          Components: general
>            Reporter: Semen Boikov
>            Assignee: Alexandr Kuramshin
>            Priority: Critical
>
> A node that is considered failed can cause other nodes in the cluster to fail.
> There is an issue in TcpDiscoveryAbstractMessage.failedNodes processing: if a
> message is received from a node that is considered failed, its failedNodes
> should be ignored.
> Possible scenario:
> - there are 4 nodes (1 -> 2 -> 3 -> 4)
> - node 3 temporarily loses connection with the others
> - node 2 considers 3 as failed; a node failed event is fired for 3
> - node 3 considers 4 as failed and adds 4 to nodeFailedList; it then restores
> its connection with 1, and currently 1 will process the nodeFailedList from 3
> (even though 3 is already considered failed)
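
The rule proposed in the description (ignore failedNodes when the sender is itself considered failed) can be sketched roughly as below. This is only an illustration, not Ignite code: the class FailedNodesFilter, the Msg type, and all method names are hypothetical, node IDs are plain integers, and it assumes node 1 has already recorded node 3 as failed by the time the message from 3 arrives.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of one node's view of failed nodes (not the Ignite API). */
public class FailedNodesFilter {
    /** Hypothetical stand-in for a discovery message that carries a failedNodes list. */
    static final class Msg {
        final int senderId;
        final Set<Integer> failedNodes;

        Msg(int senderId, Set<Integer> failedNodes) {
            this.senderId = senderId;
            this.failedNodes = failedNodes;
        }
    }

    /** IDs of nodes that the local node already considers failed. */
    private final Set<Integer> locallyFailed = new HashSet<>();

    /** Records that the local node considers the given node failed. */
    void markFailed(int nodeId) {
        locallyFailed.add(nodeId);
    }

    boolean isFailed(int nodeId) {
        return locallyFailed.contains(nodeId);
    }

    /** Returns the failed-node IDs to accept: none if the sender is considered failed. */
    Set<Integer> acceptedFailedNodes(Msg msg) {
        if (locallyFailed.contains(msg.senderId))
            return Collections.emptySet(); // Sender is considered failed: ignore its list.

        return msg.failedNodes;
    }

    /** Applies a message, merging only the accepted failed-node IDs. */
    void process(Msg msg) {
        locallyFailed.addAll(acceptedFailedNodes(msg));
    }
}
```

In terms of the scenario, node 1 would call markFailed(3) when the node failed event for 3 is fired; a later message from 3 that lists 4 as failed would then leave 4 untouched, while the same list from a healthy sender such as 2 would still be applied.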



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
