[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

Stefan Miklosovic (Jira) Tue, 27 Feb 2024 05:50:09 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821245#comment-17821245
 ]


Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 1:43 PM:
------------------------------------------------------------------------

[~dipiets] I think the mistake you are making is that you run "nodetool compact" 
after you write the data, and only then run the mixed workload against it.

nodetool compact will make just 1 SSTable instead of whatever number of them 
you have there.

I tested this locally too and, indeed, if you have just one SSTable, that will 
be way more performance-friendly than having multiple of them, because ... 
there is just one SSTable. So Cassandra does not need to look into any others, 
which is more performant by definition.

If you do not run nodetool compact after the writes, also run nodetool 
disableautocompaction, then run the tests. After they finish, compact 
everything into one SSTable and run them again. You will see that the first 
(uncompacted) run is slower, and I do not think that looking into capacity etc. 
has anything to do with it.
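
A minimal sketch of that comparison, assuming it is run from a Cassandra source checkout against a running local node. The stress parameters are copied from the reproduction steps in the ticket; the `run` helper, the `DRY_RUN` switch, the function names, and the log file names are hypothetical and only there for illustration. By default it just prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: compare the same mixed workload on many SSTables vs. one SSTable.
# DRY_RUN=1 (the default here) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Mixed read/write load with the parameters from the ticket; $1 is the log file.
mixed_workload() {
    run tools/bin/cassandra-stress mixed 'ratio(write=10,read=90)' duration=10m \
        cl=ONE -rate threads=406 -node localhost -log "file=$1"
}

compare_sstable_counts() {
    # Load the data, but do NOT run a major compaction afterwards.
    run tools/bin/cassandra-stress write n=10000000 cl=ONE -rate threads=384 \
        -node 127.0.0.1
    # Keep background compaction from merging SSTables during the first run.
    run bin/nodetool disableautocompaction keyspace1
    # Run 1: whatever number of SSTables the writes produced.
    mixed_workload many_sstables.log
    # Major compaction: everything ends up in one SSTable.
    run bin/nodetool compact keyspace1
    # Run 2: same data, a single SSTable.
    mixed_workload one_sstable.log
}

compare_sstable_counts
```

Comparing the two logs on the same build (patched or unpatched) shows how much of the difference comes from the number of SSTables alone, independent of the `getCapacity` change.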

I mean ... sure, we see less locking etc. and we might add that change, but in 
general I do not think that its effect is 2x ... hardly.



> Remove lock contention generated by getCapacity function in SSTableReader
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19429
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Dipietro Salvatore
>            Assignee: Dipietro Salvatore
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x
>
>         Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When profiling Cassandra 4.1.3 on large AWS instances, we measured a high 
> number of lock acquisitions in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquisitions per 
> 60 seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 
> 22.04), this limits the CPU utilization of the system to under 50% when 
> testing at full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to a 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && \
> CASSANDRA_USE_JDK11=true ant jar && \
> CASSANDRA_USE_JDK11=true ant stress-build && \
> rm -rf data && bin/cassandra -f -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && \
> tools/bin/cassandra-stress write n=10000000 cl=ONE -rate threads=384 \
>   -node 127.0.0.1 -log file=cload.log -graph file=cload.html && \
> bin/nodetool compact keyspace1 && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m \
>   cl=ONE -rate threads=406 -node localhost -log file=result.log \
>   -graph file=graph.html
> {code}



