Re: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts

Valentin Kulichenko Thu, 01 Mar 2018 17:12:07 -0800

+1. The low-level timeouts that we still have in discovery and communication
are very hard to explain, and I doubt there is anyone who fully understands
how they currently work. They bring a lot of complexity and almost zero
value. Let's deprecate them and leave only failureDetectionTimeout plus the
other high-level settings that Alexey mentioned.


-Val

On Thu, Mar 1, 2018 at 6:06 AM, Ilya Kasnacheev <[email protected]>
wrote:

> I agree with you.
>
> I think we could restrict usage of e.g. setConnectTimeout/setSocketTimeout
> to people extending SPIs, since different implementations may need
> different values.
>
> However, for user configurations we should only expose timeouts we can
> explain; everything else should have reasonable default values.
>
> --
> Ilya Kasnacheev
>
> 2018-03-01 17:01 GMT+03:00 Alexey Popov <[email protected]>:
>
> > Hi Igniters,
> >
> > We often see similar questions from users and customers about the
> > IgniteConfiguration, TcpDiscoverySpi and TcpCommunicationSpi timeouts
> > and how they relate to each other. We also see several side effects of
> > incorrect timeout configuration.
> >
> > I tried to briefly describe these timeout settings (please see below) and
> > found that most of them do not make sense in terms of cluster
> > functions/operations and cannot be explained to users.
> >
> > I propose to deprecate most of them and leave only the timeouts we can
> > explain in common terms (setFailureDetectionTimeout, setNetworkTimeout,
> > setJoinTimeout and some others).
> >
> > Please let me know your thoughts.
> >
> > Thanks,
> > Alexey
> >
> > GLOBAL:
> >
> > IgniteConfiguration.setNetworkTimeout:
> > It is a global timeout for high-level operations that involve the
> > network. For instance, IgniteMessaging delivery and the DiscoverySpi
> > handshake use this timeout.
> >
> > IgniteConfiguration.setFailureDetectionTimeout:
> > It is a global timeout for detecting failures in IgniteSpi
> > implementations (including DiscoverySpi and CommunicationSpi).
> > The failure detection algorithm bounds the series of simple network
> > operations that make up a single logical operation (for instance, the
> > reliable delivery of a DiscoverySpi message within a cluster).
> > The failure detection timeout is cumulative: it covers establishing the
> > socket connection, sending and receiving data, and all socket retries
> > (if some failure happens).
> > This timeout is intended to simplify the failure detection condition
> > from a user perspective.
> >
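
The "cumulative timeout" idea described above can be pictured as one time
budget shared by every low-level step of a logical operation. What follows
is only a minimal sketch under that assumption: the class
FailureDetectionBudgetSketch and its methods are hypothetical names made up
for illustration and are not part of the Ignite API.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one time budget shared by all steps of a logical operation. */
final class FailureDetectionBudgetSketch {
    /** Absolute deadline, on the same clock as the timestamps passed in by the caller. */
    private final long deadlineNanos;

    FailureDetectionBudgetSketch(long failureDetectionTimeoutMs, long startNanos) {
        this.deadlineNanos = startNanos + TimeUnit.MILLISECONDS.toNanos(failureDetectionTimeoutMs);
    }

    /** Time left for the next connect, write, read or retry; 0 means the budget is spent. */
    long remainingMs(long nowNanos) {
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(deadlineNanos - nowNanos));
    }

    /** True once the whole logical operation has used up its budget. */
    boolean expired(long nowNanos) {
        return remainingMs(nowNanos) == 0L;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // 10 seconds is an arbitrary value chosen for the example.
        FailureDetectionBudgetSketch budget = new FailureDetectionBudgetSketch(10_000, start);
        // Each low-level step would use whatever is left as its own timeout.
        System.out.println("next step may take up to " + budget.remainingMs(System.nanoTime()) + " ms");
    }
}
```

With such a budget a slow connect leaves less time for the following write
and for any retries, so the user only has to reason about a single number.
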
> > IgniteConfiguration.setClientFailureDetectionTimeout:
> > A special case of the failure detection timeout that applies to client
> > nodes in DiscoverySpi.
> >
> > TCP DISCOVERY SPI:
> >
> > If you need more control over the failure detection algorithm for
> > TcpDiscoverySpi, you can explicitly set the following low-level options
> > (doing so disables the failureDetectionTimeout logic):
> >
> > 1. TcpDiscoverySpi.setConnectTimeout - socket connection timeout
> > 2. TcpDiscoverySpi.setReconnectCount - number of reconnect attempts used
> > when establishing a connection with the remote node and sending messages
> > to it
> > 3. TcpDiscoverySpi.setSocketTimeout - socket write timeout. The write
> > operation will be repeated getReconnectCount() times if it exceeds this
> > timeout
> > 4. TcpDiscoverySpi.setAckTimeout - message acknowledgment timeout. If a
> > message acknowledgment is not received within this timeout, the send is
> > considered failed and the SPI will try to repeat the send operation. The
> > timeout is automatically doubled on successive retries, up to the
> > getMaxAckTimeout value.
> > 5. TcpDiscoverySpi.setMaxAckTimeout - maximum acknowledgment timeout. If
> > getAckTimeout reaches getMaxAckTimeout, the SPI gives up retrying the send
> >
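
A minimal sketch of the doubling described in items 4 and 5 above. It
assumes the wait starts at ackTimeout, doubles on every retry, and that the
sender gives up as soon as the next doubling would exceed maxAckTimeout; the
exact boundary behaviour of the real SPI is not confirmed here. The class
AckTimeoutSketch and the numbers used are hypothetical, chosen only for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the ack wait used on each send attempt. */
final class AckTimeoutSketch {
    /**
     * Returns the acknowledgment wait for every attempt: the first attempt
     * waits ackTimeoutMs, each retry waits twice as long, and the list ends
     * when another doubling would exceed maxAckTimeoutMs (assumed give-up point).
     */
    static List<Long> waits(long ackTimeoutMs, long maxAckTimeoutMs) {
        if (ackTimeoutMs <= 0 || maxAckTimeoutMs < ackTimeoutMs) {
            throw new IllegalArgumentException("need 0 < ackTimeout <= maxAckTimeout");
        }
        List<Long> waits = new ArrayList<>();
        long wait = ackTimeoutMs;
        while (true) {
            waits.add(wait);
            // Compare against max / 2 so the doubling itself can never overflow.
            if (wait > maxAckTimeoutMs / 2) {
                break;
            }
            wait *= 2;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Illustrative values only: 100 ms initial wait, 500 ms ceiling.
        System.out.println(waits(100, 500)); // prints [100, 200, 400]
    }
}
```

Summing the returned values gives the worst-case time spent waiting for
acknowledgments before the send is abandoned, which is the number a user
would actually care about.
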
> > Other important TcpDiscoverySpi timeouts:
> >
> > TcpDiscoverySpi.setJoinTimeout - timeout for the join process when a
> > new or restarted node joins a cluster. The node tries to connect to all
> > available IP addresses provided by ipFinder within this timeout.
> > If the timeout is exceeded, the node gives up and throws an exception
> > from Ignition.start().
> >
> > TcpDiscoverySpi.setNetworkTimeout - timeout for high-level operations
> > like handshake. It looks like it should be deprecated and the
> > IgniteConfiguration.getNetworkTimeout should be used here.
> >
> > TCP COMMUNICATION SPI:
> >
> > If you need more control over the failure detection algorithm for
> > TcpCommunicationSpi, you can explicitly set the following low-level
> > options (doing so disables the failureDetectionTimeout logic):
> >
> > 1. TcpCommunicationSpi.setConnectTimeout - socket connection timeout. It
> > is automatically doubled on successive retries (up to getReconnectCount)
> > that belong to a single logical operation
> > 2. TcpCommunicationSpi.setMaxConnectTimeout - maximum connection timeout,
> > the upper limit for getConnectTimeout after it has been doubled up to
> > getReconnectCount times
> > 3. TcpCommunicationSpi.setReconnectCount - number of reconnect attempts
> > used when establishing a connection with the remote node and sending
> > messages to it
> >
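
The interaction of the three options above can be sketched as follows. The
sketch assumes that attempt number i uses connectTimeout doubled i times,
capped at maxConnectTimeout, and that there are reconnectCount attempts in
total; whether the real SPI caps the value or aborts when the ceiling is hit
is not confirmed here. The class ConnectTimeoutSketch and the numbers are
hypothetical.

```java
import java.util.Arrays;

/** Hypothetical model of per-attempt connect timeouts. */
final class ConnectTimeoutSketch {
    /** Timeout of each attempt: doubled per retry, never above maxConnectTimeoutMs. */
    static long[] timeouts(long connectTimeoutMs, long maxConnectTimeoutMs, int reconnectCount) {
        if (connectTimeoutMs <= 0 || maxConnectTimeoutMs < connectTimeoutMs || reconnectCount < 1) {
            throw new IllegalArgumentException("invalid timeout settings");
        }
        long[] result = new long[reconnectCount];
        long timeout = connectTimeoutMs;
        for (int attempt = 0; attempt < reconnectCount; attempt++) {
            result[attempt] = timeout;
            // Once doubling would pass the ceiling, stay at the ceiling (also avoids overflow).
            timeout = timeout > maxConnectTimeoutMs / 2 ? maxConnectTimeoutMs : timeout * 2;
        }
        return result;
    }

    /** Worst case: every attempt runs into its timeout. */
    static long worstCaseMs(long connectTimeoutMs, long maxConnectTimeoutMs, int reconnectCount) {
        return Arrays.stream(timeouts(connectTimeoutMs, maxConnectTimeoutMs, reconnectCount)).sum();
    }

    public static void main(String[] args) {
        // Illustrative values only: 1 s initial timeout, 5 s ceiling, 5 attempts.
        System.out.println(Arrays.toString(timeouts(1_000, 5_000, 5)));
        System.out.println(worstCaseMs(1_000, 5_000, 5) + " ms in the worst case");
    }
}
```

The worst-case sum shows why these settings are hard to explain: the time
until a node is declared unreachable depends on all three values at once,
whereas failureDetectionTimeout states that bound directly.
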
> > Other important TcpCommunicationSpi timeouts:
> >
> > TcpCommunicationSpi.setSocketWriteTimeout - timeout to send a message.
> > TcpCommunicationSpi.setIdleConnectionTimeout - maximum idle connection
> > timeout, after which a connection will be closed.
> >
> >
> >
> >
> > --
> > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> >
>
