[ 
https://issues.apache.org/jira/browse/CASSANDRA-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269790#comment-17269790
 ] 

Yifan Cai commented on CASSANDRA-16339:
---------------------------------------

I ran several experiments that skip the garbage skipping step when a compaction 
task has more than N shadow source candidates. With a small threshold value, the 
flame graph shows that GarbageSkipper takes a lower percentage of CPU time. 
(The graph below was obtained with threshold == 10.)

!flamegraph_garbageskipper_with_threshold.png|width=684,height=350!
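
For illustration only, the threshold check used in these experiments can be 
sketched roughly as below. This is a minimal sketch, not the actual patch: the 
class name GarbageSkipGate, the field maxShadowSources and the method shouldRun 
are hypothetical names and do not exist in Cassandra. It only assumes that the 
step is bypassed when the number of shadow source candidates exceeds N.

```java
/**
 * Hypothetical helper (not Cassandra code): decides whether the garbage
 * skipping step should run for a given compaction task.
 */
final class GarbageSkipGate {
    // Assumed threshold N: the maximum number of shadow source candidates
    // for which the garbage skipping step is still allowed to run.
    private final int maxShadowSources;

    GarbageSkipGate(int maxShadowSources) {
        if (maxShadowSources < 0)
            throw new IllegalArgumentException("threshold must be >= 0");
        this.maxShadowSources = maxShadowSources;
    }

    /**
     * Returns true when the candidate count does not exceed the threshold,
     * false when there are over N candidates and the step should be skipped.
     */
    boolean shouldRun(int shadowSourceCandidates) {
        return shadowSourceCandidates <= maxShadowSources;
    }
}
```

With a threshold of 10, new GarbageSkipGate(10).shouldRun(11) returns false, so 
a task with 11 candidates would bypass the step. With a threshold of 0, the step 
is bypassed whenever there is at least one shadow source candidate.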

The full report can be found at 
[https://github.com/yifan-c/CASSANDRA-15581-COMPACTION-TEST/blob/main/CASSANDRA-16339/7019-Test:%20Perf%20Comparison%20%5BLCS%20-%20provide_overlapping_tombstones%20%2B%20experiments%5D.pdf]

The smaller the threshold, the more stable the read latency becomes. However, 
it is still not as stable as in the steady state. In the steady state, we can 
consider the threshold to be 0. 

We can observe some latency reduction in different time spans. I now think it 
comes from the slower compaction. According to the charts in the report, 
when the compaction throughput is low, the read latency is low too. The 
garbage skipping step is CPU heavy. When the step is active, it greatly lowers 
the I/O ops from the compaction, so in the meantime the system can serve read 
queries faster.

The gain we expect from the step is a reduction in file size and disk usage. 
However, in all the runs, there is no noticeable difference after enabling it 
for the workload (read : write : delete = 5 : 4 : 1).

Since the reward is too small compared to the price paid, I think it is 
probably better to advise against using this feature, unless there is a 
suitable use case that has a lot of deletes and few reads.

 

> LCS steady state load of table with vs. w/o GC performance test
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-16339
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16339
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Test/benchmark
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>         Attachments: flamegraph_garbageskipper_with_threshold.png, 
> flamegraph_grabageskipper.png
>
>
> The testing cluster should be pre-populated with ~200GB of data on each node. 
> The baseline cluster has the table created with 
> {{provide_overlapping_tombstones}} disabled. The other cluster has the table 
> with {{provide_overlapping_tombstones == row}}. Compare the read, write and 
> compaction performance between those 2 clusters. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

