Dear Rajini,
Thanks for the feedback.
1)
Because "request.timeout.ms" only affects in-flight requests, after the API
NetworkClient.ready() is invoked, the connection won't get closed after
"request.timeout.ms” hits. Before
a) the SocketChannel is connected
b) ssl handshake finished
c) authentication has finished (sasl)
clients cannot invoke NetworkClient.send() to send any request, which means no
in-flight request targeting to the connection will be added.
2)
I think a default value of 127 seconds make sense, which meets the timeout
indirectly specified by the default value of “tcp.syn.retries”. I’ve added this
into the KIP proposal.
3)
Every time the timeout hits, the timeout value of the next connection try will
increase.
The timeout will hit iff a connection stays at the `connecting` state longer
than the timeout value, as indicated by
ClusterConnectionStates.NodeConnectionState. The connection state of a node
will change iff `SelectionKey.OP_CONNECT` is detected by
`nioSelector.Select()`. The connection state may transit from `connecting` to
a) `disconnected` when SocketChannel.finishConnect() throws IOException.
b) `connected` when SocketChannel.finishConnect() return TRUE.
In other words, the timeout will hit and increase iff the interested
SelectionKey.OP_CONNECT doesn't trigger before the timeout arrives, which
means, for example, network congestion, failure of the ARP request, packet
filtering, routing error, or a silent discard may happen. (I didn’t read the
Java NIO source code. Please correct me the case when OP_CONNECT won’t get
triggered if I’m wrong)
4)
A) Connection timeout dominates both request timeout and API timeout
When connection timeout hits, the connection will be closed. The client will be
notified either by the responses constructed by NetworkClient or the callbacks
attached to the request. As a result, the request failure will be handled
before either connection timeout or API timeout arrives.
B) Neither request timeout or API timeout dominates connection timeout
i) Request timeout: Because request timeout only affects in-flight requests,
after the API NetworkClient.ready() is invoked, the connection won't get closed
after "request.timeout.ms” hits. Before
1. the SocketChannel is connected
2. SSL handshake finished
3. authentication has finished (SASL)
, clients won't be able to invoke NetworkClient.send() to send any request,
which means no in-flight request targeting to the connection will be added.
ii) API timeout: In AdminClient, API timeout acts by putting a smaller and
smaller timeout value to the chain of requests in a same API. After the API
timeout hits, the retry logic won't close any connection. In consumer, API
timeout acts as a whole by putting a limit to the code block executing time.
The retry logic won't close any connection as well.
Conclusion:
Thanks again for the long feedback and I’m always enjoying them. I’ve
supplement the above discussion into the KIP proposal. Please let me know what
you think.
Best, - Cheng Tan
> On Jun 2, 2020, at 3:01 AM, Rajini Sivaram <[email protected]> wrote:
>
> Hi Cheng,
>
> Not sure if the discussion should move back to the DISCUSS thread. I have a
> few questions:
>
> 1) The KIP motivation says that in some cases `request.timeout.ms` doesn't
> timeout connections properly and as a result it takes 127s to detect a
> connection failure. This sounds like a bug rather than a limitation of the
> current approach. Can you explain the scenarios where this occurs?
>
> 2) I think the current proposal is to use non-exponential 10s connection
> timeout as default with the option to use exponential timeout. So
> connection timeouts for every connection attempt will be between 8s and 12s
> by default. Is that correct? Should we use a default max timeout to enable
> exponential timeout by default since 8s seems rather small?
>
> 3) What is the scope of `failures` used to determine connection timeout
> with exponential timeouts? Will we always use 10s followed by 20s every
> time a connection is attempted?
>
> 4) It will be good if we can include two flows with the relationship
> between various timeouts in the KIP. One with a fixed node like a typical
> produce/consume request to the leader and another that uses
> `leastLoadedNode` like a metadata request. Having the comparison between
> the current and proposed behaviour w.r.t all configurable timeouts (the two
> new connection timeouts, request timeout, api timeout etc.) will be useful.
>
> Regards,
>
> Rajini
>