[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-05-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843588#comment-17843588
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 5/5/24 8:51 PM:
---

[~rustyrazorblade] there are 63 instances in the production codebase where we do 
not check the tracing level and just log. We should take a more holistic approach 
here and fix this everywhere, which would probably be better suited to a new 
ticket.

Anyway, I think we should still deliver this with the changes the OP suggested 
(which I improved on top of). This seems like a fairly innocent change and I do 
not see where it might go wrong. We should double-check that our understanding of 
getSize vs getCapacity is correct, though. 
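
As a side note on the getSize vs getCapacity question, below is a minimal, 
hypothetical sketch (my own illustration, not the real 
org.apache.cassandra.cache.InstrumentingCache) of why a capacity read that goes 
through lock-guarded eviction state can contend while a counter-backed size read 
does not, and why the two checks are not strictly equivalent.
{code:java}
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; names and locking are assumptions, not the
// actual Cassandra implementation.
final class IllustrativeCache<K, V>
{
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Object evictionPolicyLock = new Object();
    private long capacityBytes;

    IllustrativeCache(long capacityBytes)
    {
        this.capacityBytes = capacityBytes;
    }

    // If capacity is read through state guarded by a shared lock, every reader
    // serializes on that lock - the kind of contention the profile showed.
    long getCapacity()
    {
        synchronized (evictionPolicyLock)
        {
            return capacityBytes;
        }
    }

    // A size read backed by a concurrent map needs no shared lock.
    long size()
    {
        return map.size();
    }

    // Caller-side checks as in the proposed change. Note they are not identical:
    // capacity > 0 means "cache enabled" even when it is empty, while size > 0
    // only means "cache currently holds entries" - exactly the semantic
    // difference worth double-checking before shipping the patch.
    boolean enabledViaCapacity() { return getCapacity() > 0; }
    boolean nonEmptyViaSize()    { return size() > 0; }

    void put(K key, V value)
    {
        map.put(key, value);
    }
}
{code}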


was (Author: smiklosovic):
[~rustyrazorblade] there is 63 instances in the production codebase where we do 
not check tracing level and just log. We should take more holistic approach 
here and just fix this everywhere which would be probably better suited for a 
new ticket.

Anyway, I think we should still deliver this with the changes OP suggested (and 
myself improved on top of that). This seems like a fairy innocent change and I 
do not see where it might go wrong. We should double check that our 
understanding of getSize vs getCapacity is correct though. 

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, Screenshot 2024-03-19 at 15.22.50.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html, 
> image-2024-03-08-15-51-30-439.png, image-2024-03-08-15-52-07-902.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-05-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843588#comment-17843588
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 5/5/24 8:50 PM:
---

[~rustyrazorblade] there are 63 instances in the production codebase where we do 
not check the tracing level and just log. We should take a more holistic approach 
here and fix this everywhere, which would probably be better suited to a new 
ticket.

Anyway, I think we should still deliver this with the changes the OP suggested 
(which I improved on top of). This seems like a fairly innocent change and I do 
not see where it might go wrong. We should double-check that our understanding of 
getSize vs getCapacity is correct, though. 


was (Author: smiklosovic):
[~rustyrazorblade] there is 63 instances in the production codebase where we do 
not check tracing level and just log. We should take more holistic approach 
here and just fix this everywhere which would be probably better suited of a 
new ticket.

Anyway, I think we should still deliver this with the changes OP suggested (and 
myself improved on top of that). This seems like a fairy innocent change and I 
do not see where it might go wrong. We should double check that our 
understanding of getSize vs getCapacity is correct though. 

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, Screenshot 2024-03-19 at 15.22.50.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html, 
> image-2024-03-08-15-51-30-439.png, image-2024-03-08-15-52-07-902.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-03-11 Thread Dipietro Salvatore (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825442#comment-17825442
 ] 

Dipietro Salvatore edited comment on CASSANDRA-19429 at 3/11/24 8:57 PM:
-

> I'm working with a single node, with compaction disabled, and I reduced my 
> memtable space to 16MB in order to constantly flush.  I wrote 10m rows and 
> have 1928 SStables.  These boxes have 72 CPU.  I'm using G1GC with a 24GB 
> heap.  I've tested concurrent_reads at 64 and 128 since there's enough cores 
> on here to handle and we don't need to bottleneck on reads.

[~rustyrazorblade] Can you provide full, detailed, step-by-step instructions on 
how you tested, so I can try to reproduce it as well on the r8g and r7i instance 
types?
BTW, which Cassandra version did you use? AFAIK, Cassandra 4 uses 
UseConcMarkSweepGC and not G1GC. Am I right?


was (Author: JIRAUSER304377):
> I'm working with a single node, with compaction disabled, and I reduced my 
> memtable space to 16MB in order to constantly flush.  I wrote 10m rows and 
> have 1928 SStables.  These boxes have 72 CPU.  I'm using G1GC with a 24GB 
> heap.  I've tested concurrent_reads at 64 and 128 since there's enough cores 
> on here to handle and we don't need to bottleneck on reads.

[~rustyrazorblade] Can you provided full detailed and step-by-step instructions 
on how you tested? So I will try to reproduce as well.
BTW which Cassandra version did you use?  AFAIK, Cassandra 4 uses 
UseConcMarkSweepGC and not G1GC. Am I right?

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, asprof_cass4.1.3__lock_20240216052912lock.html, 
> image-2024-03-08-15-51-30-439.png, image-2024-03-08-15-52-07-902.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-03-08 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824885#comment-17824885
 ] 

Jon Haddad edited comment on CASSANDRA-19429 at 3/8/24 11:37 PM:
-

When I try to spin up those instance types in us-west-2 I get an error that 
they're invalid, so I'm running a test with c5.18xlarge.

I'm working with a single node, with compaction disabled, and I reduced my 
memtable space to 16MB in order to flush constantly. I wrote 10M rows and have 
1,928 SSTables. These boxes have 72 CPUs. I'm using G1GC with a 24GB heap. I've 
tested concurrent_reads at 64 and 128, since there are enough cores here to 
handle it and we don't want to bottleneck on reads.

So, right off the bat, I'm not able to duplicate the original observation about 
CPU not going over 50% utilization.  4.1 has reached 90+% CPU utilization 
consistently:
{noformat}
23:29:57     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
23:29:58     all   90.11    0.00    1.07    0.00    0.00    0.24    0.01    0.00    0.00    8.57
23:29:59     all   90.12    0.00    0.84    0.00    0.00    0.27    0.00    0.00    0.00    8.77
23:30:00     all   89.82    0.00    0.83    0.03    0.00    0.38    0.01    0.00    0.00    8.93
23:30:01     all   90.13    0.03    1.08    0.00    0.00    0.30    0.00    0.00    0.00    8.47
23:30:02     all   89.95    0.00    0.89    0.00    0.00    0.34    0.01    0.00    0.00    8.82
23:30:03     all   89.86    0.00    1.08    0.00    0.00    0.24    0.00    0.00    0.00    8.83
23:30:04     all   87.90    0.00    0.97    0.00    0.00    0.24    0.01    0.00    0.00   10.88 {noformat}
Using a variety of easy-cass-stress KeyValue workloads with different settings 
for --rate, I'm unable to see any meaningful difference, performance-wise.
{noformat}
easy-cass-stress run KeyValue -d 20m --rate 20k -p 10m -t 16 -r 1{noformat}
For each workload I've run, I've seen virtually identical results.  Both are 
pushing C* to use 90% CPU and achieve roughly 25K reads / second.


was (Author: rustyrazorblade):
When I try to spin up those instance types in us-west-2 I get an error that 
they're invalid, so I'm running a test with c5.18xlarge.

I'm working with a single node, with compaction disabled, and I reduced my 
memtable space to 16MB in order to constantly flush.  I wrote 10m rows and have 
1928 SStables.  These boxes have 72 CPU.  I'm using G1GC with a 24GB heap.  
I've tested concurrent_reads at 64 and 128 since there's enough cores on here 
to handle and we don't need to bottleneck on reads.

So, right off the bat, I'm not able to duplicate the original observation about 
CPU not going over 50% utilization.  4.1 has reached 90+% CPU utilization:
{noformat}
23:29:57     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  
%guest  %gnice   %idle
23:29:58     all   90.11    0.00    1.07    0.00    0.00    0.24    0.01    
0.00    0.00    8.57
23:29:59     all   90.12    0.00    0.84    0.00    0.00    0.27    0.00    
0.00    0.00    8.77
23:30:00     all   89.82    0.00    0.83    0.03    0.00    0.38    0.01    
0.00    0.00    8.93
23:30:01     all   90.13    0.03    1.08    0.00    0.00    0.30    0.00    
0.00    0.00    8.47
23:30:02     all   89.95    0.00    0.89    0.00    0.00    0.34    0.01    
0.00    0.00    8.82
23:30:03     all   89.86    0.00    1.08    0.00    0.00    0.24    0.00    
0.00    0.00    8.83
23:30:04     all   87.90    0.00    0.97    0.00    0.00    0.24    0.01    
0.00    0.00   10.88 {noformat}
Using a variety of easy-cass-stress KeyValue workloads with different settings 
for --rate, I'm unable to see any meaningful difference, performance-wise.
{noformat}
easy-cass-stress run KeyValue -d 20m --rate 20k -p 10m -t 16 -r 1{noformat}
For each workload I've run, I've seen virtually identical results.  Both are 
pushing C* to use 90% CPU and achieve roughly 25K reads / second.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based 

[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821437#comment-17821437
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 10:22 PM:
-

How is that so? We are making changes in SSTableReader, which _reads_, right? So 
the read path should be affected.

Do we see the performance differ only in mixed workloads? It seems like the read 
and write paths try to acquire the same lock, and this is what the patch works 
around?

What happens when you use 50:50 reads:writes? I would expect the overall number 
of operations to be disproportionately smaller.


was (Author: smiklosovic):
How is it so? Because we are making changes in SSTableReader which _reads_ 
right? So reading path should be affected.

We see the performance differs only in mixed workloads? So it seems like read 
and write path try to acquire same lock and this is what the patch workarounds?

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821437#comment-17821437
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 10:21 PM:
-

How is that so? We are making changes in SSTableReader, which _reads_, right? So 
the read path should be affected.

Do we see the performance differ only in mixed workloads? It seems like the read 
and write paths try to acquire the same lock, and this is what the patch works 
around?


was (Author: smiklosovic):
How is it so? Because we are making changes in SSTableReader which _reads_ 
right? So reading path should be affected.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821245#comment-17821245
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 1:44 PM:


[~dipiets] I think the mistake you are making is that you run "nodetool compact" 
after you write the data and then run the mixed workload against that.

nodetool compact will leave just 1 SSTable instead of however many you had 
before.

I tested this locally too and, indeed, if you have just one SSTable, that will be 
way more performance-friendly than having multiple of them, because ... there is 
just one SSTable, so Cassandra does not need to look into any others - which is 
more performant by definition. 

Instead of running nodetool compact after the writes, also run nodetool 
disableautocompaction, then run the tests. After that finishes, compact 
everything into one SSTable and run it again. You will see that it is slower, and 
I do not think that looking into capacity etc. has anything to do with it.

Also, don't run a mixed workload; do just reads.

I mean ... sure, we see less locking etc. and we might add that change, but in 
general I do not think the effect is anywhere near 2x.


was (Author: smiklosovic):
[~dipiets] I think the mistake you do is that you do "nodetool compact" after 
you write the data and then you run mixed workload against that.

nodetool compact will make just 1 SSTable instead of whatever number of them 
you have there.

I tested this locally too and, indeed, if you have just one table, that will be 
way more performance-friendly than having multiple of them, because ... there 
is just one table. So Cassandra does not need to look into any other - which is 
more performant by definition. 

If you do not run nodetool compact after writes, try to also do nodetool 
disableautocompaction, then run the tests. After it is finished, compact it all 
into one SSTable and run it again. You will see that it is slower and I do not 
think that looking into capacity etc has anything to do with it.

I mean ... sure, we see less locking etc and we might add that change there, 
but in general I do not think that the effect it has is 2x  hardly.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821245#comment-17821245
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 1:43 PM:


[~dipiets] I think the mistake you are making is that you run "nodetool compact" 
after you write the data and then run the mixed workload against that.

nodetool compact will leave just 1 SSTable instead of however many you had 
before.

I tested this locally too and, indeed, if you have just one SSTable, that will be 
way more performance-friendly than having multiple of them, because ... there is 
just one SSTable, so Cassandra does not need to look into any others - which is 
more performant by definition. 

Instead of running nodetool compact after the writes, also run nodetool 
disableautocompaction, then run the tests. After that finishes, compact 
everything into one SSTable and run it again. You will see that it is slower, and 
I do not think that looking into capacity etc. has anything to do with it.

I mean ... sure, we see less locking etc. and we might add that change, but in 
general I do not think the effect is anywhere near 2x.


was (Author: smiklosovic):
[~dipiets] I think the mistake you do is that you do "nodetool compact" after 
you write the data and they you run mixed workload against that.

nodetool compact will make just 1 SSTable instead of whatever number of them 
you have there.

I tested this locally too and, indeed, if you have just one table, that will be 
way more performance-friendly than having multiple of them, because ... there 
is just one table. So Cassandra does not need to look into any other - which is 
more performant by definition. 

If you do not run nodetool compact after writes, try to also do nodetool 
disableautocompaction, then run the tests. After it is finished, compact it all 
into one SSTable and run it again. You will see that it is slower and I do not 
think that looking into capacity etc has anything to do with it.

I mean ... sure, we see less locking etc and we might add that change there, 
but in general I do not think that the effect it has is 2x  hardly.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821196#comment-17821196
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 12:02 PM:
-

I don't see the speedup. I was just testing the same commands on my local PC; the 
before/after numbers are basically the same, definitely not 2x or 3x.

I used just 100 threads.

But yeah ... I am not running this on r8g.24xlarge or r7i.24xlarge.


was (Author: smiklosovic):
I dont see the speedup, I was just testing same commands on my local PC, before 
/ after numbers are basically more or less same, definitely not 2x or 3x. 

I used just 100 threads.

But yeah ... I am not running this on r8g.24xlarge or r7i.24xlarge but I would 
expect that I would also see some speedup already, no? Even on some Ryzen 7 and 
running it in Docker.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821200#comment-17821200
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 12:02 PM:
-

Well, I do see some speedup, just not 2x/3x. I think the effect of this change is 
amplified on more powerful machines.

Without this patch:

{noformat}
Op rate                   :   89,392 op/s  [READ: 80,439 op/s, WRITE: 8,953 op/s]
Partition rate            :   89,392 pk/s  [READ: 80,439 pk/s, WRITE: 8,953 pk/s]
Row rate                  :   89,392 row/s [READ: 80,439 row/s, WRITE: 8,953 row/s]
Latency mean              :    2.2 ms [READ: 2.2 ms, WRITE: 2.2 ms]
Latency median            :    1.6 ms [READ: 1.6 ms, WRITE: 1.6 ms]
Latency 95th percentile   :    5.8 ms [READ: 5.8 ms, WRITE: 5.9 ms]
Latency 99th percentile   :   10.6 ms [READ: 10.6 ms, WRITE: 10.7 ms]
Latency 99.9th percentile :   23.5 ms [READ: 23.4 ms, WRITE: 23.8 ms]
Latency max               :  180.5 ms [READ: 180.5 ms, WRITE: 122.7 ms]
Total partitions          :  5,408,128 [READ: 4,866,473, WRITE: 541,655]
Total errors              :  0 [READ: 0, WRITE: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:01:00
{noformat}

with this patch, two independent runs:

{noformat}
Op rate                   :  119,782 op/s  [READ: 107,849 op/s, WRITE: 11,933 op/s]
Partition rate            :  119,782 pk/s  [READ: 107,849 pk/s, WRITE: 11,933 pk/s]
Row rate                  :  119,782 row/s [READ: 107,849 row/s, WRITE: 11,933 row/s]
Latency mean              :    1.7 ms [READ: 1.6 ms, WRITE: 1.7 ms]
Latency median            :    1.3 ms [READ: 1.3 ms, WRITE: 1.4 ms]
Latency 95th percentile   :    3.8 ms [READ: 3.8 ms, WRITE: 4.0 ms]
Latency 99th percentile   :    7.7 ms [READ: 7.7 ms, WRITE: 8.0 ms]
Latency 99.9th percentile :   13.7 ms [READ: 13.7 ms, WRITE: 14.1 ms]
Latency max               :  114.6 ms [READ: 61.5 ms, WRITE: 114.6 ms]
Total partitions          :  7,188,152 [READ: 6,472,051, WRITE: 716,101]
Total errors              :  0 [READ: 0, WRITE: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:01:00
{noformat}

{noformat}
Results:
Op rate                   :  104,456 op/s  [READ: 94,016 op/s, WRITE: 10,440 op/s]
Partition rate            :  104,456 pk/s  [READ: 94,016 pk/s, WRITE: 10,440 pk/s]
Row rate                  :  104,456 row/s [READ: 94,016 row/s, WRITE: 10,440 row/s]
Latency mean              :    1.9 ms [READ: 1.9 ms, WRITE: 2.0 ms]
Latency median            :    1.5 ms [READ: 1.4 ms, WRITE: 1.5 ms]
Latency 95th percentile   :    4.7 ms [READ: 4.6 ms, WRITE: 4.8 ms]
Latency 99th percentile   :    8.6 ms [READ: 8.6 ms, WRITE: 8.8 ms]
Latency 99.9th percentile :   13.9 ms [READ: 13.8 ms, WRITE: 14.1 ms]
Latency max               :   85.4 ms [READ: 77.2 ms, WRITE: 85.4 ms]
Total partitions          :  6,268,822 [READ: 5,642,258, WRITE: 626,564]
Total errors              :  0 [READ: 0, WRITE: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:01:00
{noformat}

So the speedup is something like 20%, which is already quite nice.


was (Author: smiklosovic):
Well, I do see some speedup, just not 2x / 3x, I think this change is amplified 
more powerful machine a node runs on

without this patch

{noformat}
Op rate   :   89,392 op/s  [READ: 80,439 op/s, WRITE: 8,953 
op/s]
Partition rate:   89,392 pk/s  [READ: 80,439 pk/s, WRITE: 8,953 
pk/s]
Row rate  :   89,392 row/s [READ: 80,439 row/s, WRITE: 8,953 
row/s]
Latency mean  :2.2 ms [READ: 2.2 ms, WRITE: 2.2 ms]
Latency median:1.6 ms [READ: 1.6 ms, WRITE: 1.6 ms]
Latency 95th percentile   :5.8 ms [READ: 5.8 ms, WRITE: 5.9 ms]
Latency 99th percentile   :   10.6 ms [READ: 10.6 ms, WRITE: 10.7 ms]
Latency 99.9th percentile :   23.5 ms [READ: 23.4 ms, WRITE: 23.8 ms]
Latency max   :  180.5 ms [READ: 180.5 ms, WRITE: 122.7 ms]
Total partitions  :  5,408,128 [READ: 4,866,473, WRITE: 541,655]
Total errors  :  0 [READ: 0, WRITE: 0]
Total GC count: 0
Total GC memory   : 0.000 KiB
Total GC time :0.0 seconds
Avg GC time   :NaN ms
StdDev GC time:0.0 ms
Total operation time  : 00:01:00
{noformat}

with this patch, two independent runs:

{noformat}
Op rate   :  119,782 

[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821196#comment-17821196
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 11:52 AM:
-

I don't see the speedup. I was just testing the same commands on my local PC; the 
before/after numbers are basically the same, definitely not 2x or 3x.

I used just 100 threads.

But yeah ... I am not running this on r8g.24xlarge or r7i.24xlarge, but I would 
expect to also see some speedup already, no? Even on some Ryzen 7, running it in 
Docker.


was (Author: smiklosovic):
I dont see the speedup, I was just testing same commands on my local PC, before 
/ after numbers are basically more or less same, definitely not 2x or 3x. 

I used just 100 threads.

But yeah ... I am not running this on r8g.24xlarge or r7i.24xlarge but I would 
expect that I would also seem some speedup already, no? Even on some Ryzen 7 
and running it in Docker.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-27 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821196#comment-17821196
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/27/24 11:51 AM:
-

I don't see the speedup. I was just testing the same commands on my local PC; the 
before/after numbers are basically the same, definitely not 2x or 3x.

I used just 100 threads.

But yeah ... I am not running this on r8g.24xlarge or r7i.24xlarge, but I would 
expect to also see some speedup already, no? Even on some Ryzen 7, running it in 
Docker.


was (Author: smiklosovic):
I dont see the speedup, I was just testing same commands on my local PC, before 
/ after numbers are basically more or less same, definitely not 2x or 3x. 

I used just 100 threads.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-26 Thread Dipietro Salvatore (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820812#comment-17820812
 ] 

Dipietro Salvatore edited comment on CASSANDRA-19429 at 2/26/24 6:52 PM:
-

My performance benchmark results look good to me:
r8g.24xl: 458k op/s (2.72x)
r7i.24xl: 292k op/s (1.9x)

Tested with these commands:
{code:java}
cd 
git clone https://github.com/instaclustr/cassandra.git cassandra-instaclustr
cd cassandra-instaclustr
git checkout CASSANDRA-19429-4.1
git log -10 --oneline

CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f -R

bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && bin/cqlsh -e 'drop 
keyspace if exists keyspace1;' && bin/nodetool clearsnapshot --all && 
tools/bin/cassandra-stress write n=1000 cl=ONE -rate threads=384 -node 
127.0.0.1 -log file=cload.log -graph file=cload.html && bin/nodetool compact 
keyspace1   && sleep 30s && tools/bin/cassandra-stress mixed 
ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost 
-log file=result.log -graph file=graph.html |& tee stress.txt{code}


was (Author: JIRAUSER304377):
My performance benchmarks results looks good to me:
r8g.24xl:  458k op/s  (2.72x)
r7i.24xl:  292k op/s    (1.9x)

Tested with commands:
{code:java}
cd 
git clone https://github.com/instaclustr/cassandra.git cassandra-instaclustr
cd cassandra-instaclustr
git checkout CASSANDRA-19429-4.1git log -10 --oneline

CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f -R

bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && bin/cqlsh -e 'drop 
keyspace if exists keyspace1;' && bin/nodetool clearsnapshot --all && 
tools/bin/cassandra-stress write n=1000 cl=ONE -rate threads=384 -node 
127.0.0.1 -log file=cload.log -graph file=cload.html && bin/nodetool compact 
keyspace1   && sleep 30s && tools/bin/cassandra-stress mixed 
ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost 
-log file=result.log -graph file=graph.html |& tee stress.txt{code}

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-26 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820696#comment-17820696
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/26/24 12:25 PM:
-

Regarding checking if (tracing) ...: there are more cases like that in the 
affected classes, so we are just aligning with what is already there, and it 
definitely does not hurt.

Also, I think that if we base that logic on DatabaseDescriptor, we should also 
cover the MBean aspect of it.

Also, the logic of "if capacity is zero then we don't cache" is already used 
here (1):

https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L2354
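
As a quick illustration of that "capacity == 0 means we do not cache" guard, here 
is a minimal, hypothetical sketch; the names are mine, and the authoritative 
logic is in the ColumnFamilyStore code linked above.
{code:java}
import java.util.function.Function;

// Hypothetical sketch only - not the linked ColumnFamilyStore code.
final class CapacityGuardSketch
{
    interface HypotheticalCache<K, V>
    {
        long capacity();
        V getIfPresent(K key);
        void put(K key, V value);
    }

    static <K, V> V readThrough(HypotheticalCache<K, V> cache, K key, Function<K, V> loader)
    {
        // A zero (or negative) capacity is treated as "cache disabled":
        // bypass the cache entirely so callers never pay for its bookkeeping.
        if (cache == null || cache.capacity() <= 0)
            return loader.apply(key);

        V cached = cache.getIfPresent(key);
        if (cached != null)
            return cached;

        V loaded = loader.apply(key);
        cache.put(key, loaded);
        return loaded;
    }
}
{code}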


was (Author: smiklosovic):
When one checks if (tracing) ..., there are more cases like that in the 
affected classes so we just align it to what is there and it definitely does 
not hurt.

Also, I think that if we base that logic on DatabaseDescriptor, we should also 
cover MBean aspect of it.

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: asprof_cass4.1.3__lock_20240216052912lock.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-26 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820628#comment-17820628
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/26/24 8:27 AM:


I created a branch where I incorporated the idea above, with more changes related 
to it (1). For example, we should not call getCapacity() in trace log messages; 
we should first check whether we are going to log at trace level and only then 
construct the log message. If we are not, we would be calling getCapacity() only 
to throw the result away. I think that in practice we are not logging at trace 
level at all, so this is just redundant work.

As for checking DatabaseDescriptor.getKeyCacheSizeInMiB(): if we change the 
capacity of these caches via CacheServiceMBean, the cache would have a non-zero 
capacity but we would never reflect that in DatabaseDescriptor. I covered this 
too - DatabaseDescriptor values are now updated on CacheServiceMBean method calls 
as well.

(1) https://github.com/apache/cassandra/pull/3133/files

[~dipiets] [~brandon.williams] does this make sense to you? [~dipiets] Could you 
please run your tests again with the (1) patch and check the numbers?

was (Author: smiklosovic):
I created this branch where I incorporated the idea above with more changes 
related to it (1). For example, we should no call getCapacity() in tracing log 
messages, we should firstly check if we are going to log on trace level and 
then construct log message. If we are not, then we call getCapacity() but we 
just throw it away. I think that in practice we are not logging on trace at all 
so this is just redundant.

When we check DatabaseDescriptor.getKeyCacheSizeInMiB(), if we change capacity 
of these caches via CacheServiceMBean, then it would have non-zero capacity but 
we never set it in DatabaseDescriptor. I covered this too, DatabaseDescriptor 
values are updated on CacheServiceMBean method calls too.

(1) https://github.com/apache/cassandra/pull/3133/files

[~dipiets] [~brandon.williams] does this make sense to you? [~dipiets] Could 
you please run your tests again with the (1) patch and check the numbers?







[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:44 AM:


I think it is <= 0 when key_cache_size: 0MiB is set in cassandra.yaml (so you 
effectively turn the key cache off).

Just set a breakpoint in the CacheService constructor and follow the execution 
to the point where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)", I think 
checking "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 0)" would have the 
same effect (but be WAY more performant).

I am not sure you can check size instead of capacity like that.

"capacity" is the maximum size of all entries (in mebibytes if I am not 
mistaken) you can put into that cache.

"size" is the current number of entries in that cache.

So that "if (maybeKeyCache.getCapacity > 0)" basically checks if keyCache is 
enabled or not. I think that checking it via "maybeKeyCache.size() > 0" is a 
little bit misleading. I think that checking it via DatabaseDescriptor's 
property is better.

You can have a cache with positive capacity but that cache might hold 0 entries 
so in that case you would evaluate that "hey, we dont have cache" which is not 
technically true.
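
To make the distinction concrete, here is a toy, self-contained example (not 
the real InstrumentingCache) showing why a size() check is not a substitute 
for a capacity check:
{code:java}
import java.util.HashMap;
import java.util.Map;

// Toy cache for illustration only: capacity is the configured maximum,
// size is how many entries are currently stored.
public class ToyCache
{
    private final long capacity;                        // > 0 means the cache is enabled
    private final Map<String, String> entries = new HashMap<>();

    public ToyCache(long capacity) { this.capacity = capacity; }

    public long getCapacity() { return capacity; }
    public int size()         { return entries.size(); }

    public static void main(String[] args)
    {
        ToyCache keyCache = new ToyCache(100);
        System.out.println(keyCache.getCapacity() > 0); // true: cache is enabled
        System.out.println(keyCache.size() > 0);        // false: enabled but still empty
    }
}
{code}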


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)

I am not sure you can check size instead of capacity like that.

"capacity" is the maximum number of entries you can put into that cache.

"size" is the current number of entries in that cache.

So that "if (maybeKeyCache.getCapacity > 0)" basically checks if keyCache is 
enabled or not. I think that checking it via "maybeKeyCache.size() > 0" is a 
little bit misleading. I think that checking it via DatabaseDescriptor's 
property is better.

You can have a cache with positive capacity but that cache might hold 0 entries 
so in that case you would evaluate that "hey, we dont have cache" which is not 
technically true.





[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:43 AM:


I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)

I am not sure you can check size instead of capacity like that.

"capacity" is the maximum number of entries you can put into that cache.

"size" is the current number of entries in that cache.

So that "if (maybeKeyCache.getCapacity > 0)" basically checks if keyCache is 
enabled or not. I think that checking it via "maybeKeyCache.size() > 0" is a 
little bit misleading. I think that checking it via DatabaseDescriptor's 
property is better.

You can have a cache with positive capacity but that cache might hold 0 entries 
so in that case you would evaluate that "hey, we don't have a cache" which is not 
technically true.


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)

I am not sure you can check size instead of capacity like that.

"capacity" is the maximum number of entries you can put into that cache.

"size" is the current number of entries in that cache.

I think that it always holds size <= capacity. So that "if 
(maybeKeyCache.getCapacity > 0)" basically checks if keyCache is enabled or 
not. I think that checking it via "maybeKeyCache.size() > 0" is a little bit 
misleading. I think that checking it via DatabaseDescriptor's property is 
better.

You can have a cache with positive capacity but that cache might hold 0 entries 
so in that case you would evaluate that "hey, we dont have cache" which is not 
technically true.





[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:41 AM:


I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)

I am not sure you can check size instead of capacity like that.

"capacity" is the maximum number of entries you can put into that cache.

"size" is the current number of entries in that cache.

I think that it always holds size <= capacity. So that "if 
(maybeKeyCache.getCapacity > 0)" basically checks if keyCache is enabled or 
not. I think that checking it via "maybeKeyCache.size() > 0" is a little bit 
misleading. I think that checking it via DatabaseDescriptor's property is 
better.

You can have a cache with positive capacity but that cache might hold 0 entries 
so in that case you would evaluate that "hey, we don't have a cache" which is not 
technically true.


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)







[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:36 AM:


I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same (but WAY more performant)


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same.







[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:35 AM:


I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same.


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml

Just set a breakpoint in CacheService constructor and follow the execution its 
constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same.







[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820442#comment-17820442
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 2/25/24 8:35 AM:


I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution in 
its constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same.


was (Author: smiklosovic):
I think it is <= 0 when key_cache_size: 0MiB in cassandra.yaml (so you 
effectively turn the key cache off)

Just set a breakpoint in CacheService constructor and follow the execution its 
constructor where keyCache = initKeyCache(); is called.

If we do not want to check "if (maybeKeyCache.getCapacity() > 0)" I think the 
same effect would be to check "if (DatabaseDescriptor.getKeyCacheSizeInMiB() > 
0)" and it would be same.







[jira] [Comment Edited] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-02-23 Thread Dipietro Salvatore (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820219#comment-17820219
 ] 

Dipietro Salvatore edited comment on CASSANDRA-19429 at 2/23/24 11:06 PM:
--

My initial patch and test results are based on this change 
(https://github.com/salvatoredipietro/cassandra/commit/45f770dd8e06e490a4cd0f222e1d4a3060a45a38):
{code:java}
From 45f770dd8e06e490a4cd0f222e1d4a3060a45a38 Mon Sep 17 00:00:00 2001
From: Salvatore Dipietro 
Date: Wed, 21 Feb 2024 16:29:15 -0800
Subject: [PATCH] Remove lock contention generated by getCapacity function in
 SSTableReader

---
 CHANGES.txt   | 1 +
 .../org/apache/cassandra/io/sstable/format/SSTableReader.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 21084b8c721f..4ab8ccf21b3e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1.5
+ * Remove lock contention generated by getCapacity function in SSTableReader
 Merged from 4.0:
  * Remove bashisms for mx4j tool in cassandra-env.sh (CASSANDRA-19416)
 Merged from 3.11:
diff --git a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java 
b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
index f26cf65c93e0..3b703911984d 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
@@ -520,7 +520,7 @@ public static SSTableReader open(Descriptor descriptor,
         sstable.validate();
 
         if (sstable.getKeyCache() != null)
-            logger.trace("key cache contains {}/{} keys", sstable.getKeyCache().size(), sstable.getKeyCache().getCapacity());
+            logger.trace("key cache contains {} keys", sstable.getKeyCache().size());
 
         return sstable;
     }
@@ -717,7 +717,7 @@ public void setupOnline()
         // e.g. by BulkLoader, which does not initialize the cache.  As a kludge, we set up the cache
         // here when we know we're being wired into the rest of the server infrastructure.
         InstrumentingCache maybeKeyCache = CacheService.instance.keyCache;
-        if (maybeKeyCache.getCapacity() > 0)
+        if (maybeKeyCache.size() > 0)
             keyCache = maybeKeyCache;
 
         final ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(metadata().id);
 {code}
However, it seems that the `keyCache` variable then remains set to `null`.

Based on my experiments, `maybeKeyCache.getCapacity()` always returns the same 
fixed long value for every call and it does not change over time.
Can someone please advise under which circumstances it is necessary to check 
that `maybeKeyCache.getCapacity() > 0`, or whether it is always safe to simply 
set `keyCache = CacheService.instance.keyCache;`?

[~jlewandowski] I saw that you have recently worked on this part of the code 
for Cassandra 5 
(https://github.com/apache/cassandra/commit/cefa5b38142dd41a6eb8a90ac2ec78dc1e16d9b8#diff-c57d70c11a1f1373c8e0b486446973fc275ca8bf050218fa6c6af8baecdd3b6f).
Can you maybe provide some advice on this? Thanks


was (Author: JIRAUSER304377):
My initial patch and test are based on this patch 
(https://github.com/salvatoredipietro/cassandra/commit/45f770dd8e06e490a4cd0f222e1d4a3060a45a38):


{code:java}
>From 45f770dd8e06e490a4cd0f222e1d4a3060a45a38 Mon Sep 17 00:00:00 2001
From: Salvatore Dipietro 
Date: Wed, 21 Feb 2024 16:29:15 -0800
Subject: [PATCH] Remove lock contention generated by getCapacity function in
 SSTableReader

---
 CHANGES.txt   | 1 +
 .../org/apache/cassandra/io/sstable/format/SSTableReader.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 21084b8c721f..4ab8ccf21b3e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1.5
+ * Remove lock contention generated by getCapacity function in SSTableReader
 Merged from 4.0:
  * Remove bashisms for mx4j tool in cassandra-env.sh (CASSANDRA-19416)
 Merged from 3.11:
diff --git a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java 
b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
index f26cf65c93e0..3b703911984d 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java
@@ -520,7 +520,7 @@ public static SSTableReader open(Descriptor descriptor,
 sstable.validate();
 
 if (sstable.getKeyCache() != null)
-logger.trace("key cache contains {}/{} keys", 
sstable.getKeyCache().size(), sstable.getKeyCache().getCapacity());
+logger.trace("key cache contains {} keys", 
sstable.getKeyCache().size());
 
 return sstable;
 }