Daouda Sarr created CASSANDRA-21450:
---------------------------------------

             Summary: Performance regression (read quorum) after upgrade from 
Cassandra 4.1.3 to 4.1.11
                 Key: CASSANDRA-21450
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21450
             Project: Apache Cassandra
          Issue Type: Bug
          Components: Consistency/Coordination, Consistency/Repair
            Reporter: Daouda Sarr


 Apache Cassandra version
- Baseline version: 4.1.3
- Regressed version: 4.1.11
 
Cluster topology
 
The cluster is deployed across three datacenters:
 
Datacenter | Nodes | Replication Factor | Location
-----------|-------|--------------------|---------
DC1        | 10    | RF=2               | Southern site
DC2        | 10    | RF=2               | Southern site (close proximity to DC1)
DC3        | 5     | RF=1               | Northern site (significantly farther from DC1/DC2)
 
Keyspace replication strategy:
- DC1 = 2
- DC2 = 2
- DC3 = 1
 
Total replication factor = 5.
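
For reference, the quorum arithmetic implied by this topology can be sketched as follows. This is a minimal illustration, assuming the usual definition of QUORUM as a strict majority of all replicas (floor(RF / 2) + 1); the names `replication`, `quorum`, and `near_replicas` are hypothetical and not taken from the cluster configuration.

```python
# Replication factors per datacenter, as listed in this report.
replication = {"DC1": 2, "DC2": 2, "DC3": 1}


def quorum(total_rf: int) -> int:
    """Replicas required for QUORUM: a strict majority of all replicas."""
    return total_rf // 2 + 1


total_rf = sum(replication.values())
required = quorum(total_rf)

# Replicas located in the two nearby datacenters (DC1 and DC2).
near_replicas = replication["DC1"] + replication["DC2"]

print(f"total RF = {total_rf}, QUORUM needs {required} replicas")
print(f"DC1 + DC2 hold {near_replicas} replicas; "
      f"enough for QUORUM without DC3: {near_replicas >= required}")
```

Under that assumption, the replicas in DC1 and DC2 alone are sufficient to satisfy a QUORUM read, so a response from DC3 should not be required.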
 
Applications are deployed only in DC1 and DC2.  
Client drivers use contact points exclusively from DC1 and DC2. No application 
traffic originates from DC3.
 
The issue is observed from application workloads running in DC1/DC2.
 
The cluster topology, replication settings, application deployment, and client 
configuration remained unchanged throughout all tests.
 
The only variable changed between tests is the Cassandra version:
- 4.1.3 → normal performance
- 4.1.11 → performance degradation
- rollback to 4.1.3 → performance restored
- upgrade again to 4.1.11 → degradation reproduced
 
Description:
After upgrading from Cassandra 4.1.3 to 4.1.11, we observed a significant 
increase in latency for read queries.
 
The affected queries are generated by a batch/scanning workload with CL=QUORUM.
 
After upgrading to 4.1.11, Cassandra frequently logs slow cross-node reads. 
It seems that Cassandra waits for something in DC3 even though QUORUM can be 
achieved with the 4 copies in DC1/DC2:
 
The log entries report "slow timeout 500 msec/cross-node", with observed 
latencies typically between 530 ms and 780 ms.
 
Example: "was slow 2 times: avg/min/max 761/741/780 msec"
 
Thousands of similar messages are generated:
... (6149 were dropped)
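
To quantify these messages across nodes, the summary fragment quoted above could be extracted from the logs with a small script. The following is only a sketch: the regular expression matches just the fragment shown in this report, the layout of the rest of the log line is not assumed, and the names `SLOW_RE`, `SlowRead`, and `parse_slow_read` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Matches only the fragment quoted in this report, e.g.
# "was slow 2 times: avg/min/max 761/741/780 msec".
SLOW_RE = re.compile(r"was slow (\d+) times: avg/min/max (\d+)/(\d+)/(\d+) msec")

# Slow-query threshold reported in the log entries (500 msec, cross-node).
THRESHOLD_MS = 500


class SlowRead(NamedTuple):
    count: int
    avg_ms: int
    min_ms: int
    max_ms: int


def parse_slow_read(line: str) -> Optional[SlowRead]:
    """Return the parsed timings, or None if the line has no such fragment."""
    match = SLOW_RE.search(line)
    if match is None:
        return None
    return SlowRead(*(int(group) for group in match.groups()))


sample = "was slow 2 times: avg/min/max 761/741/780 msec"
parsed = parse_slow_read(sample)
print(parsed)
```

Aggregating `count` and `max_ms` per node and per hour would make it easier to compare the 4.1.3 and 4.1.11 runs.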
 
 
The table is very small:
- Estimated partitions: 814
- SSTable count: 3
- Live size: ~112 KB
- No compaction pressure
- No dropped mutations
- Very low memory footprint
 
Observed metrics on Cassandra 4.1.11
 
nodetool proxyhistograms:
- Read latency P99: ~8 ms
- Range latency P99: ~183 ms
- Maximum range latency: ~263 ms
 
nodetool tablestats:
- Local read latency: ~0.150 ms
- Local write latency: ~0.015 ms
- Bloom filter false ratio: 0.00000
- SSTable count: 3
 
Additional observations
 
- The issue does not appear related to table size, compaction, SSTables, or 
tombstones.
- Local read performance remains excellent.
- The regression is observed specifically after upgrading to 4.1.11.
- The issue disappears immediately after rolling back to 4.1.3.
 
Performance comparison
 
We have collected performance measurements and monitoring graphs for both 
versions under comparable production workloads.
 
Observations:
- 4.1.3: stable baseline, no slow query alerts
- 4.1.11: significant increase in token range scan latency and slow query logs
- rollback to 4.1.3: performance returns to baseline
- re-upgrade to 4.1.11: degradation reproduced
 
We also have monitoring graphs showing:
- latency (P50 / P95 / P99)
- rate of slow queries
- throughput (ops/sec)
- clear correlation with Cassandra version changes
 
The degradation is reproducible and follows the Cassandra version changes 
exactly. Returning to 4.1.3 restores the original performance without any other 
infrastructure, application, schema, or configuration changes.
 
Expected behavior:
 
Performance of read queries should remain consistent between Cassandra 4.1.3 
and 4.1.11 for identical workload and cluster topology.
 
Questions
 
1. Are there known changes between 4.1.3 and 4.1.11 affecting:
   - token range scan execution
   - cross-node read coordination
   - QUORUM read behavior in multi-DC setups
   - speculative retry interactions with range reads
 
2. Is this a known regression?
 
3. Are there specific diagnostics (tracing, JMX metrics) that could help 
identify the root cause?
 
We can provide:
- full logs (before/after upgrade)
- tracing outputs
- performance graphs
- cluster configuration snapshots



--
This message was sent by Atlassian Jira
(v8.20.10#820010)