[ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--------------------------------------
    Description: 
The current TcpDiscoverySpi can delay failure detection for a node that has 
several IP addresses. This happens because most timeouts, such as 
failureDetectionTimeout, sockTimeout and ackTimeout, apply per address. The 
actual failure detection delay can reach 
failureDetectionTimeout*addressesNumber, because the node's addresses are 
tried one after another. This effect on failure detection should be noted in 
the documentation.

The suggestion is to describe this behavior in 
https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:

"You should assign multiple addresses to a node only if they represent real 
physical connections that provide additional reliability. Providing several 
addresses can prolong failure detection of the current node. The timeouts and 
settings for network operations (_failureDetectionTimeout, sockTimeout, 
ackTimeout, maxAckTimeout, reconCnt_) apply per connection/address. The 
exception is _connRecoveryTimeout_. Node addresses are tried sequentially.
     Example: if you use _failureDetectionTimeout_ and have set 3 IP addresses 
for this node, the previous node in the ring can take up to 
'failureDetectionTimeout * 3' to detect the failure of the current node."
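
To make the arithmetic in the example concrete, here is a minimal sketch of 
the worst-case estimate. It assumes only what is stated above: each address 
gets its own failureDetectionTimeout budget and addresses are tried one after 
another. The class and method names are hypothetical and are not part of the 
Ignite API; the 10000 ms value is purely illustrative.

```java
// Hypothetical helper, not part of Apache Ignite.
final class WorstCaseDetection {
    /**
     * Upper bound on the failure detection delay, assuming the timeout
     * applies to each address and addresses are tried sequentially.
     */
    static long worstCaseDetectionMillis(long failureDetectionTimeoutMillis, int addressCount) {
        if (failureDetectionTimeoutMillis < 0 || addressCount < 1)
            throw new IllegalArgumentException("timeout must be >= 0 and addressCount >= 1");

        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(failureDetectionTimeoutMillis, (long) addressCount);
    }

    public static void main(String[] args) {
        long timeoutMillis = 10_000L; // illustrative value only

        for (int addresses = 1; addresses <= 3; addresses++) {
            System.out.println(addresses + " address(es): up to "
                + worstCaseDetectionMillis(timeoutMillis, addresses) + " ms");
        }
    }
}
```

Under these assumptions a node with a single address is bounded by one 
timeout, while a node with three addresses is bounded by three, which matches 
the 'failureDetectionTimeout * 3' figure in the proposed text.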



  was:
The current TcpDiscoverySpi can delay failure detection for a node that has 
several IP addresses. This happens because most timeouts, such as 
failureDetectionTimeout, sockTimeout and ackTimeout, apply per address. The 
actual failure detection delay can reach 
failureDetectionTimeout*addressesNumber (1), because the node's addresses are 
tried one after another. This effect on failure detection should be noted in 
the documentation.

*1: addressesNumber is the number of addresses of the next node in the ring.

The suggestion is to describe this behavior in 
https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:

"You should assign multiple addresses to a node only if they represent real 
physical connections that provide additional reliability. Providing several 
addresses can prolong failure detection of the current node. The timeouts and 
settings for network operations (_failureDetectionTimeout, sockTimeout, 
ackTimeout, maxAckTimeout, reconCnt_) apply per connection/address. The 
exception is _connRecoveryTimeout_. Node addresses are tried sequentially.
     Example: if you use _failureDetectionTimeout_ and have set 3 IP addresses 
for this node, the previous node in the ring can take up to 
'failureDetectionTimeout * 3' to detect the failure of the current node."




> Represent in the documentation the effect of several node addresses on 
> failure detection.
> ---------------------------------------------------------------------------------------
>
>                 Key: IGNITE-13206
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13206
>             Project: Ignite
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Vladimir Steshin
>            Assignee: Vladimir Steshin
>            Priority: Minor
>              Labels: iep-45



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
