[ 
https://issues.apache.org/jira/browse/IGNITE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184131#comment-16184131
 ] 

Vladimir Ozerov commented on IGNITE-5569:
-----------------------------------------

Moved to 2.4 due to inactivity.

> TCP Discovery SPI allows multiple NODE_JOINED / NODE_FAILED leading to a 
> cluster DDoS
> -------------------------------------------------------------------------------------
>
>                 Key: IGNITE-5569
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5569
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.7
>            Reporter: Alexey Goncharuk
>            Assignee: Dmitry Karachentsev
>             Fix For: 2.4
>
>
> A firewall configuration issue may effectively lead to a cluster DDoS. The 
> scheme is as follows:
> 1) A node G joins the cluster, and a firewall rule forbids incoming 
> connection from cluster to this node
> 2) Cluster successfully processes NodeAddedMesage and fires a discovery 
> NODE_JOINED event (not sure why?)
> 4) The last node in the ring fails to connect to the newly joined node and 
> generates NODE_FAILED event
> 5) Coordinator drops the connection, joining node attempts to connect again
> The issues I see here:
> 1) Neither coordinator nor joining node print out the reason why the joining 
> node failed / did not join. A slight hint (failed to send message to the next 
> node) is printed on the node with the largest order (the one that attempted 
> to close the ring), but the root cause (connection refused) is also not 
> printed
> 2) The joining node attempts to connect to the cluster with the same node ID. 
> This violates an invariant we heavily rely on that once a node ID leaves a 
> cluster, this ID never comes back again
> 3) Each discovery event leads to a partition exchange which blocks all cache 
> operations for a time interval equal at least to the full ring latency time. 
> If several nodes are started on a malicious host, this may lead to almost 
> full cluster degradation



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to