[
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821200#comment-17821200
]
Stefan Miklosovic commented on CASSANDRA-19429:
-----------------------------------------------
Well, I do see some speedup, just not 2x / 3x. I think the effect of this
change is amplified the more powerful the machine a node runs on.
Without this patch:
{noformat}
Op rate : 89,392 op/s [READ: 80,439 op/s, WRITE: 8,953 op/s]
Partition rate : 89,392 pk/s [READ: 80,439 pk/s, WRITE: 8,953 pk/s]
Row rate : 89,392 row/s [READ: 80,439 row/s, WRITE: 8,953 row/s]
Latency mean : 2.2 ms [READ: 2.2 ms, WRITE: 2.2 ms]
Latency median : 1.6 ms [READ: 1.6 ms, WRITE: 1.6 ms]
Latency 95th percentile : 5.8 ms [READ: 5.8 ms, WRITE: 5.9 ms]
Latency 99th percentile : 10.6 ms [READ: 10.6 ms, WRITE: 10.7 ms]
Latency 99.9th percentile : 23.5 ms [READ: 23.4 ms, WRITE: 23.8 ms]
Latency max : 180.5 ms [READ: 180.5 ms, WRITE: 122.7 ms]
Total partitions : 5,408,128 [READ: 4,866,473, WRITE: 541,655]
Total errors : 0 [READ: 0, WRITE: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:01:00
{noformat}
With this patch, two independent runs:
{noformat}
Op rate : 119,782 op/s [READ: 107,849 op/s, WRITE: 11,933 op/s]
Partition rate : 119,782 pk/s [READ: 107,849 pk/s, WRITE: 11,933 pk/s]
Row rate : 119,782 row/s [READ: 107,849 row/s, WRITE: 11,933 row/s]
Latency mean : 1.7 ms [READ: 1.6 ms, WRITE: 1.7 ms]
Latency median : 1.3 ms [READ: 1.3 ms, WRITE: 1.4 ms]
Latency 95th percentile : 3.8 ms [READ: 3.8 ms, WRITE: 4.0 ms]
Latency 99th percentile : 7.7 ms [READ: 7.7 ms, WRITE: 8.0 ms]
Latency 99.9th percentile : 13.7 ms [READ: 13.7 ms, WRITE: 14.1 ms]
Latency max : 114.6 ms [READ: 61.5 ms, WRITE: 114.6 ms]
Total partitions : 7,188,152 [READ: 6,472,051, WRITE: 716,101]
Total errors : 0 [READ: 0, WRITE: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:01:00
{noformat}
{noformat}
Results:
Op rate : 104,456 op/s [READ: 94,016 op/s, WRITE: 10,440 op/s]
Partition rate : 104,456 pk/s [READ: 94,016 pk/s, WRITE: 10,440 pk/s]
Row rate : 104,456 row/s [READ: 94,016 row/s, WRITE: 10,440 row/s]
Latency mean : 1.9 ms [READ: 1.9 ms, WRITE: 2.0 ms]
Latency median : 1.5 ms [READ: 1.4 ms, WRITE: 1.5 ms]
Latency 95th percentile : 4.7 ms [READ: 4.6 ms, WRITE: 4.8 ms]
Latency 99th percentile : 8.6 ms [READ: 8.6 ms, WRITE: 8.8 ms]
Latency 99.9th percentile : 13.9 ms [READ: 13.8 ms, WRITE: 14.1 ms]
Latency max : 85.4 ms [READ: 77.2 ms, WRITE: 85.4 ms]
Total partitions : 6,268,822 [READ: 5,642,258, WRITE: 626,564]
Total errors : 0 [READ: 0, WRITE: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:01:00
{noformat}
So the speedup is around 20%, which is quite nice already.
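For reference, the relative gain can be worked out directly from the "Op rate" lines of the three summaries. This is only a minimal sketch of that arithmetic: the constant and function names are made up for illustration, and the numbers are copied by hand from the output in this comment, not parsed from the cassandra-stress logs.

```python
# Op rates (op/s) copied by hand from the cassandra-stress summaries above.
BASELINE_OPS = 89_392             # run without the patch
PATCHED_OPS = [119_782, 104_456]  # two independent runs with the patch


def speedup_percent(baseline, patched):
    """Relative throughput gain of `patched` over `baseline`, in percent."""
    return (patched / baseline - 1.0) * 100.0


# Gain of each patched run, and of the mean of the two patched runs.
gains = [speedup_percent(BASELINE_OPS, ops) for ops in PATCHED_OPS]
mean_gain = speedup_percent(BASELINE_OPS, sum(PATCHED_OPS) / len(PATCHED_OPS))

for ops, gain in zip(PATCHED_OPS, gains):
    print(f"{ops:,} op/s vs {BASELINE_OPS:,} op/s: +{gain:.1f}%")
print(f"mean of patched runs: +{mean_gain:.1f}%")
```

The two patched runs differ noticeably from each other, and each run lasted only one minute, so a single percentage should be read as a rough estimate.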
> Remove lock contention generated by getCapacity function in SSTableReader
> -------------------------------------------------------------------------
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
> Issue Type: Bug
> Components: Local/SSTable
> Reporter: Dipietro Salvatore
> Assignee: Dipietro Salvatore
> Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png,
> asprof_cass4.1.3__lock_20240216052912lock.html
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock
> acquires is measured in the `getCapacity` function from
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04),
> this limits the CPU utilization of the system to under 50% when testing at
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing
> the call to `getCapacity` with `size` achieves up to 2.95x increase in
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && \
> CASSANDRA_USE_JDK11=true ant stress-build && rm -rf data && bin/cassandra -f -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write \
> n=10000000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log \
> -graph file=cload.html && \
> bin/nodetool compact keyspace1 && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m \
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph \
> file=graph.html
> {code}