João Reis created CASSGO-125:
--------------------------------
Summary: Many "Pool connection error" with small Session.Timeout
Key: CASSGO-125
URL: https://issues.apache.org/jira/browse/CASSGO-125
Project: Apache Cassandra Go driver
Issue Type: Bug
Components: Core
Reporter: João Reis
{quote}Version of driver: *v2.0.0* (latest)
Go version: *1.24*
Cassandra version: *5.0.4*
If the global {{Session}} {{Timeout}} is set to {{1*time.Second}} or less, the
logs fill up with messages like this:
{{"message":"Pool connection error.","addr":"10.239.171.143:9042","err":"read
tcp 10.221.55.21:53299->10.229.171.143:9042: i/o timeout"}}
The same thing happens with long timeouts, but less often.
I tracked the full error flow:
{{Conn.serve}} -> {{Conn.recv}} -> {{Conn.processFrame}} -> {{readHeader}} ->
{{connReader.Read}} -> {{Conn.serve}} -> {{Conn.closeWithError}}
-> {{hostConnPool.HandleError}}
So the driver keeps hitting read timeout errors, closes the connection with an
error, establishes a new one, and then repeats the cycle indefinitely.
The most surprising thing is that Cassandra's response time doesn't exceed 40ms
at p99.9, and the Go client isn't under any load.
I also found that the heartbeat connection is established with the general
timeout rather than the connect timeout, which doesn't seem entirely correct.
However, debugging showed that this isn't why the connection is closed.
Constant client-side reconnects put a strain on Cassandra.
{quote}
User report in [https://github.com/apache/cassandra-gocql-driver/issues/1919]
We should get rid of the connection read deadline altogether. Maybe in a
follow-up ticket we can do a more comprehensive rework of the timeout settings
(for example, adding a separate read timeout setting for anyone who really
wants a read deadline).