[ 
https://issues.apache.org/jira/browse/CASSANDRA-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18087639#comment-18087639
 ] 

Dipankar Achinta commented on CASSANDRA-21190:
----------------------------------------------

[~chrisjmiller] The ~12-second timeout appears to be governed by Cassandra's 
native transport deadline ({*}native_transport_timeout{*}, default: 
{*}12000ms{*}) rather than {*}read_request_timeout{*}.

Because request deadlines are computed as the minimum of the *operation/verb 
timeout* and the {*}native transport timeout{*}, increasing 
_read_request_timeout_ alone does not affect this specific timeout path.

Cassandra's internal *native_transport_timeout* participates in the request 
deadline computation via {_}Dispatcher.computeDeadline(){_}. It considers both:
 * The operation-specific timeout (e.g. {_}read_request_timeout{_})
 * A client/native transport timeout ({_}native_transport_timeout{_})


The effective deadline is finally computed as the earlier of the two:
{code:java}
return Math.min(verbDeadline, clientDeadline);
{code}
 
The client timeout/deadline represents how long the sender of a request is 
prepared to wait for a response and is always measured from the point when a 
request is received and enqueued.

The client deadline is therefore computed as (see {*}Dispatcher.java#clientDeadline{*}):
{code:java}
return enqueuedAtNanos() +
       DatabaseDescriptor.getNativeTransportTimeout(TimeUnit.NANOSECONDS);
{code}

Because the timeout is measured from when the request was received, the 
coordinator's total lifetime budget for a request is bounded by the native 
transport timeout (12s by default). Time spent waiting in queues counts against 
that budget.

A request that spends too long in the coordinator, whether queued or executing, 
will hit the deadline at {*}12 seconds{*}, even if _read_request_timeout_ has 
been increased to {*}45 seconds{*}.
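
To make the arithmetic concrete, here is a minimal standalone sketch of the "earlier of the two deadlines" rule described above. It is not the Cassandra source: the class name _DeadlineSketch_ and the method _effectiveDeadlineNanos_ are hypothetical, and it assumes for simplicity that both timeouts are measured from the same enqueue timestamp.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Cassandra's Dispatcher code.
final class DeadlineSketch {

    // Returns the earlier of the verb deadline and the client deadline.
    // Assumption: both are measured from the moment the request was enqueued.
    public static long effectiveDeadlineNanos(long enqueuedAtNanos,
                                              long verbTimeoutMillis,
                                              long nativeTransportTimeoutMillis) {
        long verbDeadline = enqueuedAtNanos + TimeUnit.MILLISECONDS.toNanos(verbTimeoutMillis);
        long clientDeadline = enqueuedAtNanos + TimeUnit.MILLISECONDS.toNanos(nativeTransportTimeoutMillis);
        return Math.min(verbDeadline, clientDeadline);
    }

    public static void main(String[] args) {
        long enqueuedAt = 0L;
        // read_request_timeout raised to 45s, native transport timeout left at 12s
        long deadline = effectiveDeadlineNanos(enqueuedAt, 45_000, 12_000);
        // prints "12000 ms": the native transport timeout wins
        System.out.println(TimeUnit.NANOSECONDS.toMillis(deadline - enqueuedAt) + " ms");
    }
}
```

With these inputs the 45-second verb timeout never comes into play, which matches the ~12-second timeout reported in the ticket.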

In Cassandra 4.1.6, *_native_transport_timeout_* does not appear to be 
configurable via {_}cassandra.yaml{_}.

Cassandra 4.1.6 does expose a JMX operation 
({*}StorageServiceMBean#setNativeTransportTimeoutMillis{*}) that updates the 
value at runtime.
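
For reference, a rough sketch of invoking that operation from a standalone JMX client follows. The class and method names (_NativeTransportTimeoutJmx_, _setTimeoutMillis_) are hypothetical. It assumes the node's JMX endpoint is reachable on the default port 7199 without authentication, and that the operation takes a single _long_ argument; please verify the signature against the 4.1.6 _StorageServiceMBean_ before relying on it.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper; a sketch, not a supported tool.
final class NativeTransportTimeoutJmx {

    // Standard RMI-based JMX service URL for the given host and port.
    public static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Calls StorageServiceMBean#setNativeTransportTimeoutMillis on one node.
    public static void setTimeoutMillis(String host, int port, long millis) throws Exception {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService = new ObjectName("org.apache.cassandra.db:type=StorageService");
            // Assumption: the operation accepts one primitive long (milliseconds).
            mbs.invoke(storageService,
                       "setNativeTransportTimeoutMillis",
                       new Object[] { millis },
                       new String[] { "long" });
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: NativeTransportTimeoutJmx <host> <timeoutMillis>");
            return;
        }
        setTimeoutMillis(args[0], 7199, Long.parseLong(args[1]));
    }
}
```

Since this changes the value only at runtime, it would presumably need to be applied on each coordinator node and reapplied after a restart.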

However, if there is a cap on the timeout in the client driver itself, then 
increasing the server-side timeout won't help much imo. As per the code 
comments, the 3.0 Cassandra driver has its "read" timeout set to 12 seconds, and 
the *native transport timeout* is meant to match that. Since the client's 
_ReadTimeoutHandler_ times out after _readTimeoutMillis_ (12s by default), any 
coordinator work beyond that point is unlikely to be useful because the 
client has already stopped waiting.

 

> Seeing timeout 11999 msec/cross-node but how to change the associated 
> parameter
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21190
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Chris Miller
>            Priority: Urgent
>
> Hi folks, 
> We're seeing timeouts such as the following on our production system and need 
> to increase the associated parameter (but there doesn't seem to be an 
> equivalent parameter). Any ideas how to increase the 11999 msec timeout.
> <SELECT ..., total time 12003 msec, timeout 11999 msec/cross-node
> Cassandra version: 4.1.6
> Thanks, 
> Chris.



