Stan,

What is the purpose of clientFailureDetectionTimeout? Why can't we just
always use failureDetectionTimeout? Is there any difference between these
two timeouts?

-Val



On Wed, Jul 4, 2018 at 7:00 AM Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Hi,
>
> I’ve updated the proposed documentation update with a description of
> metricsUpdateFrequency and a detailed description of
> failureDetectionTimeout and clientFailureDetectionTimeout relations. The
> draft is attached to https://issues.apache.org/jira/browse/IGNITE-7704.
>
> It seems that the relation between failureDetectionTimeout and
> clientFailureDetectionTimeout is currently too tricky and should also be
> changed in the future.
> The problem is that in a server-client connection the server will use
> clientFailureDetectionTimeout but the client will use
> failureDetectionTimeout. In other words, clients ignore
> clientFailureDetectionTimeout and just use failureDetectionTimeout.
> Because of that, one has to provide different values of
> failureDetectionTimeout in server and client configs, which seems
> confusing and inconvenient.
> So I’d like to add one more point to my earlier proposal:
>
> 5. Always use clientFailureDetectionTimeout on clients instead of
> failureDetectionTimeout
> *What*: change code to use clientFailureDetectionTimeout on clients
> *When*: update code and readme.io docs in 2.7
>
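To make the asymmetry described above concrete, here is a minimal sketch in plain Java. It is a toy model, not Ignite code: the class TimeoutSelection and its method names are hypothetical, and the values in main are illustrative only. It shows which setting each side of a server-client connection consults today and what point 5 would change.

```java
/** Toy model of the timeout selection described above; not Ignite code. */
final class TimeoutSelection {
    /** Server side: the timeout a server applies to a connected client. */
    static long serverSide(long failureDetectionTimeout, long clientFailureDetectionTimeout) {
        return clientFailureDetectionTimeout;
    }

    /** Client side today: clientFailureDetectionTimeout is ignored. */
    static long clientSideCurrent(long failureDetectionTimeout, long clientFailureDetectionTimeout) {
        return failureDetectionTimeout;
    }

    /** Client side under point 5: the client uses the same setting as the server. */
    static long clientSideProposed(long failureDetectionTimeout, long clientFailureDetectionTimeout) {
        return clientFailureDetectionTimeout;
    }

    public static void main(String[] args) {
        // Illustrative values in milliseconds, identical in both node configs.
        long fdt = 10_000;
        long clientFdt = 30_000;

        System.out.println("server side:            " + serverSide(fdt, clientFdt));
        System.out.println("client side (current):  " + clientSideCurrent(fdt, clientFdt));
        System.out.println("client side (proposed): " + clientSideProposed(fdt, clientFdt));
    }
}
```

With identical configs on both nodes, the two ends of the connection disagree in the current model and agree in the proposed one.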
> Thanks,
> Stan
>
> From: Valentin Kulichenko
> Sent: May 30, 2018, 19:09
> To: dev@ignite.apache.org
> Subject: Re: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi
> timeouts
>
> Stan,
>
> Looks like you suggest to only change the default. If so, it's OK. But
> let's not change the behavior of these timeouts for the case they are
> explicitly set in config.
>
> Thanks,
> Val
>
> On Wed, May 30, 2018 at 1:06 AM, Stanislav Lukyanov <
> stanlukya...@gmail.com>
> wrote:
>
> > On networkTimeout: no, we don’t have anything like that in
> > TcpCommunicationSpi.
> >
> > On socketWriteTimeout:
> > First, its semantics are very close to TcpDiscoverySpi.socketTimeout (with
> > the exception that communication uses NIO), and the latter defaults to
> > failureDetectionTimeout,
> > so I think it would help to avoid confusion.
> > Second, I think we can’t deprecate something without an alternative that
> > would work for most users.
> > On the other hand, if we do default socketWriteTimeout to
> > failureDetectionTimeout then we reach a pretty decent API state
> > where one only needs two properties in IgniteConfiguration neither of
> > which we’re considering for deprecation and removal in 3.0.
> >
> > Stan
> >
> > From: Valentin Kulichenko
> > Sent: May 29, 2018, 22:17
> > To: dev@ignite.apache.org
> > Subject: Re: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi
> > timeouts
> >
> > Stan,
> >
> > OK, I got confused a little :)
> >
> > I do agree that TcpDiscoverySpi.networkTimeout should inherit from
> > IgniteConfiguration.networkTimeout if not set explicitly. Do we have the
> > same setting for TcpCommunicationSpi, BTW? If yes, behavior should be
> > consistent.
> >
> > As for TcpCommunicationSpi.socketWriteTimeout, I'm not sure why you want
> > to change its behavior. Can we just deprecate it and eventually remove
> > it, just as we plan to do for all timeouts from #2?
> >
> > -Val
> >
> > On Tue, May 29, 2018 at 3:50 AM, Stanislav Lukyanov <
> > stanlukya...@gmail.com>
> > wrote:
> >
> > > Val,
> > >
> > > Which timeouts do you mean?
> > >
> > > In #2 I don’t propose to change behavior.
> > >
> > > I propose to change behavior for a couple of settings in #3 though.
> > > I believe the correct approach here would be to target the behavior
> > > change for 2.6,
> > > but keep in mind that we’ll need to carefully analyze the impact before
> > > actually making the changes.
> > >
> > > Thanks,
> > > Stan
> > >
> > > From: Valentin Kulichenko
> > > Sent: May 29, 2018, 0:57
> > > To: dev@ignite.apache.org
> > > Subject: Re: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi
> > > timeouts
> > >
> > > Hi Stan,
> > >
> > > I'm 100% for this activity, however I don't think we should change the
> > > behavior of timeouts you listed in #2 - this can lead to unexpected
> > > behavior for users who already use them. I would just deprecate them
> > > and eventually remove them.
> > >
> > > -Val
> > >
> > > On Mon, May 28, 2018 at 1:29 PM, Stanislav Lukyanov <
> > > stanlukya...@gmail.com>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > It looks like we stopped half-way with this activity. I’d like to
> > > > pick it up.
> > > >
> > > > All seem to agree that we should simplify the timeout settings.
> > > > Here are the specific actions I’d like to propose:
> > > >
> > > > 1. Promote the use of global timeouts as the best practice
> > > > *What*: update the docs to encourage users to rely on the following
> > > > timeouts for their “network stability” settings
> > > > IgniteConfiguration.failureDetectionTimeout
> > > > IgniteConfiguration.clientFailureDetectionTimeout
> > > > IgniteConfiguration.networkTimeout
> > > > *When*: update readme.io docs for 2.5 and Javadoc for 2.6
> > > >
> > > > 2. Discourage the use of finer timeouts
> > > > *What*:
> > > > - update the docs to discourage users from using the following
> > > > timeouts and announce their upcoming deprecation and removal
> > > > TcpDiscoverySpi.socketTimeout
> > > > TcpDiscoverySpi.ackTimeout
> > > > TcpDiscoverySpi.maxAckTimeout
> > > > TcpDiscoverySpi.reconnectCount
> > > > TcpCommunicationSpi.connectTimeout
> > > > TcpCommunicationSpi.maxConnectTimeout
> > > > TcpCommunicationSpi.reconnectCount
> > > > - deprecate the properties in code
> > > > - remove the properties in code
> > > > *When*:
> > > > - readme.io update with deprecation announcement for 2.5
> > > > - @Deprecated in code + Javadoc update + respective readme.io
> > > > rewording for 2.6
> > > > - properties removal in 3.0
> > > >
> > > > 3. Make “orphan” timeouts rely on global timeouts, then deprecate and
> > > > remove
> > > > *What*:
> > > > Two settings currently don’t default to the global equivalents,
> > > > although they should:
> > > > - TcpCommunicationSpi.socketWriteTimeout should default to
> > > > failureDetectionTimeout
> > > > - TcpDiscoverySpi.networkTimeout should default to
> > > > IgniteConfiguration.networkTimeout
> > > > So the course of action would be:
> > > > - update the docs to explain that these timeouts have to be used for
> > > > now, but announce their upcoming deprecation and removal
> > > > - change the properties to default to their global counterparts and
> > > > deprecate them in code
> > > > - remove the properties in code
> > > > *When*:
> > > > - readme.io update with deprecation announcement for 2.5
> > > > - changing defaults + @Deprecated in code + Javadoc update +
> > > > respective readme.io rewording for 2.6
> > > > - properties removal in 3.0
> > > >
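The defaulting proposed in item 3 can be sketched as follows. This is plain Java rather than Ignite code: the class TimeoutDefaults and the method effective are hypothetical names, the values are illustrative, and a null argument stands for "not set in the config", which is an assumption of this sketch.

```java
/** Sketch of "explicit value wins, otherwise inherit the global timeout". */
final class TimeoutDefaults {
    /**
     * @param explicit value set on the SPI, or null if the user did not set it
     * @param global   the corresponding IgniteConfiguration-level timeout
     * @return the timeout that would actually be applied, in milliseconds
     */
    static long effective(Long explicit, long global) {
        return explicit != null ? explicit : global;
    }

    public static void main(String[] args) {
        long failureDetectionTimeout = 10_000; // illustrative value, ms
        long networkTimeout = 5_000;           // illustrative value, ms

        // socketWriteTimeout not set: it follows failureDetectionTimeout.
        System.out.println(effective(null, failureDetectionTimeout));

        // TcpDiscoverySpi.networkTimeout set explicitly: the explicit value is kept.
        System.out.println(effective(3_000L, networkTimeout));
    }
}
```

Only the unset case changes under this scheme; a value set explicitly in the config keeps its effect.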
> > > > 4. Don’t touch other timeouts
> > > > Other timeouts, like TcpDiscoverySpi.joinTimeout or
> > > > TcpCommunicationSpi.idleConnectionTimeout,
> > > > are orthogonal to the whole
> > > > “network stability” theme discussed above, and don’t have to be
> > > > changed.
> > > >
> > > > Finally, I’ve prepared a draft of the docs page that may be used as a
> > > > base for the readme.io update.
> > > > This email is pretty long already, so please find the draft attached
> > > > to the JIRA issue
> > > > https://issues.apache.org/jira/browse/IGNITE-7704.
> > > >
> > > > Please share your thoughts.
> > > >
> > > > Thanks,
> > > > Stan
> > > >
> > > > From: Alexey Popov
> > > > Sent: March 1, 2018, 17:01
> > > > To: dev@ignite.apache.org
> > > > Subject: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi
> > > > timeouts
> > > >
> > > > Hi Igniters,
> > > >
> > > > We often see similar questions from users and customers related to
> > > > IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts
> > > > and their relations. And we see several side effects of incorrect
> > > > timeout configuration.
> > > >
> > > > I tried to briefly describe these timeout settings (please see below)
> > > > and found out that most of them do not make sense in terms of cluster
> > > > functions/operations and cannot be explained to users.
> > > >
> > > > I propose to deprecate most of them and leave only the timeouts we
> > > > can explain in common terms (setFailureDetectionTimeout,
> > > > setNetworkTimeout, setJoinTimeout and some others).
> > > >
> > > > Please let me know your thoughts.
> > > >
> > > > Thanks,
> > > > Alexey
> > > >
> > > > GLOBAL:
> > > >
> > > > IgniteConfiguration.setNetworkTimeout:
> > > > It is a global timeout for high-level operations where a network is
> > > > involved. For instance, IgniteMessaging delivery and the DiscoverySpi
> > > > handshake use this timeout.
> > > >
> > > > IgniteConfiguration.setFailureDetectionTimeout:
> > > > It is a global timeout for detecting failures at IgniteSpi
> > > > implementations (including DiscoverySpi and CommunicationSpi).
> > > > The failure detection algorithm actually limits a range of simple
> > > > network operations related to a single logical operation (for
> > > > instance, a reliable delivery of some DiscoverySpi message within a
> > > > cluster).
> > > > Failure detection timeout is a cumulative timeout for a socket
> > > > connection, sending and receiving data bytes and all possible socket
> > > > retries (if some failure happens).
> > > > This timeout is intended to simplify the failure detection condition
> > > > from a user perspective.
> > > >
> > > > IgniteConfiguration.setClientFailureDetectionTimeout:
> > > > It is a special case of the failure detection timeout that
> > > > DiscoverySpi applies to client nodes.
> > > >
> > > > TCP DISCOVERY SPI:
> > > >
> > > > If you need more control over the failure detection algorithm for
> > > > TcpDiscoverySpi, you can explicitly use the following low-level
> > > > options (that will disable failureDetectionTimeout logic):
> > > >
> > > > 1. TcpDiscoverySpi.setConnectTimeout - socket connection timeout
> > > > 2. TcpDiscoverySpi.setReconnectCount - number of reconnect attempts
> > > > used when establishing connection with the remote node and sending
> > > > messages to it
> > > > 3. TcpDiscoverySpi.setSocketTimeout - socket write timeout. The write
> > > > operation will be repeated getReconnectCount() times if it exceeds
> > > > this timeout
> > > > 4. TcpDiscoverySpi.setAckTimeout - message acknowledgment timeout.
> > > > If a message acknowledgment is not received within this timeout,
> > > > sending is considered failed and the SPI will try to repeat the send
> > > > operation. It is automatically doubled for successive retries up to
> > > > the getMaxAckTimeout value.
> > > > 5. TcpDiscoverySpi.setMaxAckTimeout - maximum acknowledgment timeout;
> > > > if getAckTimeout reaches getMaxAckTimeout, the SPI gives up
> > > > retrying
> > > >
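The doubling described in items 4 and 5 can be illustrated with a small model. This is plain Java, not the TcpDiscoverySpi implementation: the class AckTimeoutSchedule and its methods are hypothetical names, the values are illustrative, and the sketch assumes that the timeout doubles after every unacknowledged attempt and that retries stop once the next doubled value would exceed the maximum.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an ack timeout that doubles per retry up to a maximum. */
final class AckTimeoutSchedule {
    /** Returns the ack timeout used for each send attempt, in milliseconds. */
    static List<Long> schedule(long ackTimeout, long maxAckTimeout) {
        if (ackTimeout <= 0 || maxAckTimeout < ackTimeout)
            throw new IllegalArgumentException("expected 0 < ackTimeout <= maxAckTimeout");

        List<Long> timeouts = new ArrayList<>();
        long current = ackTimeout;

        while (true) {
            timeouts.add(current);

            // Stop when doubling would exceed the maximum (also avoids overflow).
            if (current > maxAckTimeout / 2)
                break;

            current *= 2;
        }

        return timeouts;
    }

    /** Worst-case total time spent waiting for acknowledgments before giving up. */
    static long totalWait(long ackTimeout, long maxAckTimeout) {
        long total = 0;

        for (long timeout : schedule(ackTimeout, maxAckTimeout))
            total += timeout;

        return total;
    }

    public static void main(String[] args) {
        // Illustrative values: start at 500 ms, cap at 4000 ms.
        System.out.println(schedule(500, 4_000));
        System.out.println(totalWait(500, 4_000));
    }
}
```

In this model the worst-case detection delay is the sum of the whole schedule rather than a single configured value, which is part of why the low-level settings are hard to reason about compared with one cumulative failureDetectionTimeout.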
> > > > Other important TcpDiscoverySpi timeouts:
> > > >
> > > > TcpDiscoverySpi.setJoinTimeout - It is a timeout for the join process
> > > > when a new/restarted node joins a cluster. The node tries to connect
> > > > to all available IP addresses provided by ipFinder within this
> > > > timeout.
> > > > If the timeout is exceeded, the node will give up and throw an
> > > > exception from Ignition.start().
> > > >
> > > > TcpDiscoverySpi.setNetworkTimeout - timeout for high-level operations
> > > > like handshake. It looks like it should be deprecated and the
> > > > IgniteConfiguration.getNetworkTimeout should be used here.
> > > >
> > > > TCP COMMUNICATION SPI:
> > > >
> > > > If you need more control over the failure detection algorithm for
> > > > TcpCommunicationSpi, you can explicitly use the following low-level
> > > > options (that will disable failureDetectionTimeout logic):
> > > >
> > > > 1. TcpCommunicationSpi.setConnectTimeout - socket connection timeout,
> > > > will be automatically doubled for successive retries (up to
> > > > getReconnectCount) related to a single logical operation
> > > > 2. TcpCommunicationSpi.setMaxConnectTimeout - maximum connection
> > > > timeout, the upper limit of getReconnectCount-times doubled
> > > > getConnectTimeout
> > > > 3. TcpCommunicationSpi.setReconnectCount - number of reconnect
> > > > attempts used when establishing connection with the remote node and
> > > > sending messages to it
> > > >
> > > > Other important TcpCommunicationSpi timeouts:
> > > >
> > > > TcpCommunicationSpi.setSocketWriteTimeout - timeout to send a message.
> > > > TcpCommunicationSpi.setIdleConnectionTimeout - maximum idle connection
> > > > timeout upon which a connection will be closed.
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > >
> > > >
> > >
> > >
> >
> >
>
>
