[jira] [Commented] (IGNITE-7704) Document IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts and their relations

Alexey Popov (JIRA) Wed, 14 Feb 2018 04:48:46 -0800

    [ 
https://issues.apache.org/jira/browse/IGNITE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363879#comment-16363879
 ]


Alexey Popov commented on IGNITE-7704:
--------------------------------------

Sample description:

IgniteConfiguration.setNetworkTimeout:
It is a global timeout for high-level operations that involve the network. 
For instance, IgniteMessaging delivery and the DiscoverySpi handshake use this 
timeout.
It can be overridden by more specific SPI settings, for example, 
TcpDiscoverySpi.setNetworkTimeout.
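
As a rough illustration of the override, the sketch below sets the global 
value and then a discovery-specific one. The millisecond values are arbitrary 
assumptions picked for the example, not recommendations, and the class name 
NetworkTimeoutConfigSketch is hypothetical. It needs ignite-core on the 
classpath.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

final class NetworkTimeoutConfigSketch {
    /** Builds a configuration; all timeout values are illustrative assumptions. */
    static IgniteConfiguration build() {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Global timeout (ms) for high-level network operations.
        cfg.setNetworkTimeout(10_000);

        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();

        // More specific setting (ms) that applies to discovery only.
        discoSpi.setNetworkTimeout(3_000);

        cfg.setDiscoverySpi(discoSpi);

        return cfg;
    }
}
```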

IgniteConfiguration.setFailureDetectionTimeout:
It is a global timeout for detecting failures in IgniteSpi implementations 
(including DiscoverySpi and CommunicationSpi).
The failure detection algorithm bounds the total time spent on the simple 
network operations that make up a single logical operation (for instance, 
reliable delivery of a DiscoverySpi message within a cluster).
The failure detection timeout is cumulative: it covers the socket connection, 
sending and receiving data bytes, and all possible socket retries (if some 
failure happens). 
This timeout is intended to simplify the failure detection condition from a 
user perspective. 
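
To make "cumulative" concrete, here is a minimal model of a single time 
budget shared by every step of one logical operation. It is only a sketch of 
the idea, not Ignite's actual implementation; the class 
CumulativeTimeoutSketch and its methods are hypothetical, and the numbers are 
arbitrary.

```java
final class CumulativeTimeoutSketch {
    /** Time (ms) left for the whole logical operation. */
    private long remainingMs;

    CumulativeTimeoutSketch(long totalMs) {
        remainingMs = totalMs;
    }

    /** Time (ms) still available for the next connect, write, or read step. */
    long remaining() {
        return remainingMs;
    }

    /**
     * Records the time one step took, whether it succeeded or failed.
     * Returns true while some budget is left, false once it is used up.
     */
    boolean spend(long elapsedMs) {
        remainingMs = Math.max(0, remainingMs - elapsedMs);
        return remainingMs > 0;
    }

    public static void main(String[] args) {
        CumulativeTimeoutSketch budget = new CumulativeTimeoutSketch(10_000);

        // A connect, a failed write, and a retried write all draw on one budget.
        System.out.println(budget.spend(3_000) + " " + budget.remaining());
        System.out.println(budget.spend(4_000) + " " + budget.remaining());
        System.out.println(budget.spend(5_000) + " " + budget.remaining());
    }
}
```

In this model the third step exhausts the budget, which is the point at which 
the operation would be treated as failed.
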
If you need more control over the failure detection algorithm, you can 
explicitly set the following low-level options (doing so disables the 
failureDetectionTimeout logic):

1. TcpDiscoverySpi.setConnectTimeout - socket connection timeout. It is 
automatically doubled on each successive retry (up to getReconnectCount 
retries) related to a single logical operation.
2. TcpDiscoverySpi.setMaxConnectTimeout - maximum connection timeout, the 
upper limit for getConnectTimeout as it is doubled across retries.
3. TcpDiscoverySpi.setReconnectCount - number of reconnect attempts used when 
establishing a connection with a remote node and sending messages to it.
4. TcpDiscoverySpi.setSocketTimeout - socket write timeout. If a write 
operation exceeds this timeout, it is repeated up to getReconnectCount() times.
5. TcpDiscoverySpi.setAckTimeout - message acknowledgment timeout. If a message 
acknowledgment is not received within this timeout, the send is considered 
failed and the SPI tries to repeat the send operation. The timeout is 
automatically doubled on each successive retry, up to the getMaxAckTimeout 
value.
6. TcpDiscoverySpi.setMaxAckTimeout - maximum acknowledgment timeout. Once 
getAckTimeout reaches getMaxAckTimeout, the SPI gives up retrying the send.
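
The doubling described in items 1, 2, 5 and 6 can be worked through with a 
small stand-alone calculation. This is a sketch of the arithmetic only, under 
the assumption that the timeout starts at the configured value, doubles on 
each retry, and is capped at the configured maximum; the class 
TimeoutBackoffSketch, its methods, and the sample values are hypothetical and 
do not come from Ignite.

```java
import java.util.ArrayList;
import java.util.List;

final class TimeoutBackoffSketch {
    /**
     * Returns the timeout (ms) applied on each attempt: it starts at
     * initialMs, doubles after every attempt, and never exceeds maxMs.
     */
    static List<Long> schedule(long initialMs, long maxMs, int attempts) {
        List<Long> timeouts = new ArrayList<>();
        long current = Math.min(initialMs, maxMs);

        for (int i = 0; i < attempts; i++) {
            timeouts.add(current);
            // Compare against maxMs / 2 first so the doubling cannot overflow.
            current = current > maxMs / 2 ? maxMs : current * 2;
        }

        return timeouts;
    }

    /** Worst-case total wait (ms) if every attempt runs into its timeout. */
    static long worstCaseTotal(List<Long> timeouts) {
        long sum = 0;
        for (long t : timeouts)
            sum += t;
        return sum;
    }

    public static void main(String[] args) {
        List<Long> timeouts = schedule(1_000, 5_000, 5);

        System.out.println(timeouts);                 // [1000, 2000, 4000, 5000, 5000]
        System.out.println(worstCaseTotal(timeouts)); // 17000
    }
}
```

The worst-case sum shows why a small initial timeout combined with a large 
maximum and retry count can still add up to a long detection delay.
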

Other important TcpDiscoverySpi timeouts:

TcpDiscoverySpi.setJoinTimeout - timeout for the join process when a new or 
restarted node joins a cluster. The node tries to connect to all available 
IP addresses provided by the ipFinder within this timeout.
If the timeout is exceeded, the node gives up and throws an exception from 
Ignition.start().
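
A minimal sketch of how this might look in code follows. The address, the 
30-second value, and the class name JoinTimeoutSketch are assumptions made for 
the example. Catching IgniteException is also an assumption: the description 
above only says that an exception is thrown, and IgniteException is the type 
Ignition.start() declares. It needs ignite-core on the classpath.

```java
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteException;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

final class JoinTimeoutSketch {
    public static void main(String[] args) {
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();

        // Placeholder address; replace with the real cluster addresses.
        ipFinder.setAddresses(Collections.singletonList("127.0.0.1:47500..47509"));

        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
        discoSpi.setIpFinder(ipFinder);

        // Stop trying to join after 30 seconds (illustrative value).
        discoSpi.setJoinTimeout(30_000);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discoSpi);

        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Joined as " + ignite.cluster().localNode().id());
        }
        catch (IgniteException e) {
            // Reached when the node fails to start, e.g. the join timed out.
            System.err.println("Node failed to start: " + e.getMessage());
        }
    }
}
```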

> Document IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts 
> and their relations
> -----------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-7704
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7704
>             Project: Ignite
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 2.3
>            Reporter: Alexey Popov
>            Priority: Major
>
> We often see similar questions related to IgniteConfiguration, 
> TcpDiscoverySpi, TcpCommunicationSpi timeouts and their relations. And we see 
> several side-effects after incorrect timeout configuration.
> It looks like this question is not well documented.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
