[jira] [Updated] (IGNITE-13014) Remove double checking of node availability.

Vladimir Steshin (Jira) Thu, 28 May 2020 08:25:33 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vladimir Steshin updated IGNITE-13014:
--------------------------------------
    Ignite Flags: Release Notes Required  (was: Docs Required, Release Notes Required)

> Remove double checking of node availability. 
> ---------------------------------------------
>
>                 Key: IGNITE-13014
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13014
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladimir Steshin
>            Assignee: Vladimir Steshin
>            Priority: Major
>              Labels: iep-45
>         Attachments: FailureDetectionResearch.txt, NodeFailureResearch.patch, 
> WostCaseStepByStep.txt
>
>
> Proposal:
> Do not check a failed node a second time. Double node checking prolongs node 
> failure detection and gives no additional benefit. The routine is also messy 
> and contains hardcoded values.
> At present, node availability is checked twice. Suppose node 2 stops 
> answering. Node 1 becomes unable to ping node 2 and asks node 3 to establish 
> a permanent connection in place of node 2. Node 3 may then check node 2 as 
> well, or it may not.
> As a result, node failure detection can take up to 
> ServerImpl.CON_CHECK_INTERVAL + 2 * IgniteConfiguration.failureDetectionTimeout 
> + 300ms.
> See:
> * 'NodeFailureResearch.patch'. It creates the test 'FailureDetectionResearch', 
> which emulates slow responses from a failed node and measures failure 
> detection delays.
> * 'FailureDetectionResearch.txt' - results of the test.
> * 'WostCaseStepByStep.txt' - a step-by-step description of how the worst case happens.
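
The upper bound quoted in the description can be worked through as plain arithmetic. The sketch below is only an illustration of that formula: the class and method names are hypothetical, and the sample inputs (500 ms for the connection check interval, 10,000 ms for failureDetectionTimeout) are assumed values chosen for the example, not figures taken from this issue.

```java
// Hypothetical helper, not part of Ignite: evaluates the worst-case bound
// stated in the issue description.
final class WorstCaseDetection {
    /** Fixed extra delay from the issue text, in milliseconds. */
    static final long EXTRA_DELAY_MS = 300;

    /**
     * Worst case = CON_CHECK_INTERVAL + 2 * failureDetectionTimeout + 300 ms.
     * The factor 2 reflects the double check of the failed node.
     */
    static long worstCaseMillis(long conCheckIntervalMs, long failureDetectionTimeoutMs) {
        return conCheckIntervalMs + 2 * failureDetectionTimeoutMs + EXTRA_DELAY_MS;
    }

    public static void main(String[] args) {
        long conCheckIntervalMs = 500;            // assumed sample value
        long failureDetectionTimeoutMs = 10_000;  // assumed sample value

        long worst = worstCaseMillis(conCheckIntervalMs, failureDetectionTimeoutMs);
        System.out.println("Worst case: " + worst + " ms"); // prints "Worst case: 20800 ms"
    }
}
```

With these assumed inputs the second check alone accounts for 10,000 ms of the 20,800 ms total, which is the part of the delay this proposal targets.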



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
