[ 
https://issues.apache.org/jira/browse/CASSANDRA-21165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18059093#comment-18059093
 ] 

Harsh Desai commented on CASSANDRA-21165:
-----------------------------------------

To reiterate, the TRACE output (attached, from a separate run of the same CQL 
query) clearly indicates that a LARGE_MESSAGE is being transmitted from the 
replica node. This observation matches the load testing logs shared earlier. 
Taken together, these findings appear to support the reported anomaly.

> Query read timeout potentially due to altered query on server side
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-21165
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21165
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Harsh Desai
>            Priority: Urgent
>         Attachments: CQL_TRACING_Output.txt
>
>
> During load testing of a Cassandra 5.0.6 cluster, we came across an unusual 
> issue wherein a lightweight CQL query times out.
> Upon further analysis, it was found that the query being executed on the 
> server side does not seem to be the same as the one sent by the driver.
>  
> {+}Client side code{+}:
> this.statement = session.prepare(SimpleStatement.newInstance(
>     "SELECT column1 from \"kspace\".\"tsTable\" WHERE key = ? AND key2 = ? "
>     + "ORDER BY column1 DESC LIMIT 1").setIdempotent(true));
>  
> {+}Cassandra server audit logs{+}:
> FileAuditLogger.java:51 - 
> ...|type:REQUEST_FAILURE|category:ERROR|ks:kspace|scope:tsTable|operation:SELECT
>  column1 from "kspace"."tsTable" WHERE key = ? AND key2 = ? ORDER BY column1 
> DESC LIMIT 1; Operation timed out - received only 1 responses.
>  
> {+}Cassandra server logs{+}:
> NoSpamLogger.java:104 - ...ReadTimeoutException "Operation timed out - 
> received only 1 responses." while executing SELECT {color:#ff0000}*{color} 
> FROM "kspace"."tsTable" WHERE key = c001c5c2-f0a7-1046-115d-edb4b67ab0d9 AND 
> key2 = '2026-02' ORDER BY column1 DESC, {color:#ff0000}*column2 ASC, column3 
> DESC, column4 DESC*{color} LIMIT 1 {color:#ff0000}*ALLOW FILTERING*{color}
>  
> {+}Replica node logs{+}:
> .. [WARN ] [ReadStage-68] cluster_id=1 ip_address=1.1.1.1  
> NoSpamLogger.java:107 - /2.2.2.2:7000->/3.3.3.3:7000-LARGE_MESSAGES-2acb4e9d 
> overloaded; dropping 1.779MiB message (queue: 131.653MiB local, 127.653MiB 
> endpoint, 127.653MiB global)
>  
> {+}Table Schema{+}:
>  
> ||Column||Type||Key type||
> |key|TIMEUUID|Partition Key|
> |key2|TEXT|Partition Key|
> |column1|BIGINT|Clustering Column ASC|
> |column2|TIMEUUID|Clustering Column DESC|
> |column3|BOOLEAN|Clustering Column ASC|
> |column4|TEXT|Clustering Column ASC|
> |value|BLOB| |
>  
> Attached is the CQL TRACING output for the query (executed separately), which 
> shows that the message being transmitted from the replica node is a large one.
> Evidently, the query sent by the driver is quite lightweight while the one 
> executed on the server is not, as it tries to fetch all the columns, including 
> the blob, which was not requested. This might be supported by the fact that 
> the message happens to be a large one and is hence dropped. Besides, the query 
> unexpectedly runs with “ALLOW FILTERING”, which is detrimental to query 
> performance.
>  
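When comparing the schema with the query printed in the server log, one small check may be useful: the extra ORDER BY terms in the logged query look like the table's clustering order with every direction flipped. The sketch below only encodes the two orderings as they appear in this report and compares them. The class and method names are hypothetical and no Cassandra API is involved; it does not explain the `SELECT *` or the `ALLOW FILTERING` in the log line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusteringOrderCheck {

    /** Flips ASC to DESC (and DESC to ASC) for every column, keeping column order. */
    public static Map<String, String> reversed(Map<String, String> clusteringOrder) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : clusteringOrder.entrySet()) {
            result.put(e.getKey(), e.getValue().equals("ASC") ? "DESC" : "ASC");
        }
        return result;
    }

    /** Clustering order as listed under "Table Schema" in the report. */
    public static Map<String, String> schemaOrder() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("column1", "ASC");
        m.put("column2", "DESC");
        m.put("column3", "ASC");
        m.put("column4", "ASC");
        return m;
    }

    /** ORDER BY terms as printed in the ReadTimeoutException log line. */
    public static Map<String, String> loggedOrder() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("column1", "DESC");
        m.put("column2", "ASC");
        m.put("column3", "DESC");
        m.put("column4", "DESC");
        return m;
    }

    public static void main(String[] args) {
        boolean match = reversed(schemaOrder()).equals(loggedOrder());
        System.out.println("logged ORDER BY equals reversed clustering order: " + match);
    }
}
```

With the values copied from this report, the comparison evaluates to true, so the ORDER BY part of the logged query is at least consistent with a reversed read over the declared clustering columns.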



