[ 
https://issues.apache.org/jira/browse/CASSANDRA-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935262#comment-17935262
 ] 

Nitsan Wakart edited comment on CASSANDRA-20428 at 3/13/25 4:06 PM:
--------------------------------------------------------------------

IIUC the DS trunk contains work that is not on the Cassandra trunk, and this is 
one such patch.

> One of the things we realized during development is that there is still a 
> limit on the impact that this kind of rewrite can have

Yes. I think the limit of this approach (or indeed any compaction optimisation 
within the current format) is the inability to copy, verbatim, partitions/rows 
that are known to be unchanged. The wins before we hit that wall are still 
considerable, though, and format changes are possible as a follow-up.

There are also setup/teardown costs to compaction that are not addressed in 
this work. You can observe those by running a compaction benchmark with 0 rows. 
I don't have good numbers for those at the moment, but in a non-representative 
setup they show up as ~20-30% of the time spent. These can be tackled 
separately.
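
To make the 0-row measurement concrete, here is a minimal sketch of the 
arithmetic, assuming the time of a 0-row run is a fair estimate of the fixed 
setup/teardown cost. The class, method names, and timings are hypothetical and 
are not part of Cassandra or of this patch. The second method applies Amdahl's 
law to show how the fixed share caps the overall gain from faster per-row work.

```java
// Hypothetical helper, not part of Cassandra: estimates what share of a
// compaction benchmark run is fixed setup/teardown cost.
final class FixedCostEstimate {
    /** Fraction of the full run spent on fixed cost, using the 0-row run as the estimate. */
    static double fixedFraction(long zeroRowNanos, long fullRunNanos) {
        if (zeroRowNanos < 0 || fullRunNanos <= 0 || zeroRowNanos > fullRunNanos) {
            throw new IllegalArgumentException(
                "expected 0 <= zeroRowNanos <= fullRunNanos and fullRunNanos > 0");
        }
        return (double) zeroRowNanos / fullRunNanos;
    }

    /** Upper bound on overall speedup if only the per-row work gets faster (Amdahl's law). */
    static double maxSpeedup(double fixedFraction, double perRowSpeedup) {
        return 1.0 / (fixedFraction + (1.0 - fixedFraction) / perRowSpeedup);
    }

    public static void main(String[] args) {
        // Made-up timings for illustration only: 250 ms with 0 rows, 1000 ms for the full run.
        double f = fixedFraction(250_000_000L, 1_000_000_000L);
        System.out.printf("fixed fraction: %.2f%n", f);
        System.out.printf("max speedup with 2x faster row processing: %.2f%n", maxSpeedup(f, 2.0));
    }
}
```

With these made-up numbers a quarter of the run is fixed cost, so doubling the 
speed of row processing yields at most a 1.6x improvement overall.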

I hope that having a dedicated, focused compaction implementation can help 
enable further optimizations in this important area. One of the challenges for 
this work was/is trying to deduce which parts are related to the compaction 
process. Hopefully this will end up simplifying the problem as the 
implementation specialises.


> Eliminate byte array allocation in ByteArrayAccessor.read
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-20428
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20428
>             Project: Apache Cassandra
>          Issue Type: Improvement
>            Reporter: Jon Haddad
>            Priority: Normal
>         Attachments: allocation-reverse.html, 
> image-2025-03-11-11-05-55-378.png
>
>
> During compaction we allocate a new byte[] in ByteArrayAccessor.read.  This 
> is one of the hottest paths in the codebase, hit during writes, compaction, 
> creating tables, and possibly others.  In my performance tests using default 
> compaction settings of 64MB I see this responsible for 40% of allocations.  
> This is largely what drives GC pause frequency and duration.  If we are able 
> to eliminate the O(N) allocations performed here, this might be one of the 
> best optimizations we could do for the number of things it touches.
> Allocation profile attached.
> !image-2025-03-11-11-05-55-378.png|width=514,height=269!
>  
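
For readers unfamiliar with the pattern described in the issue, the sketch 
below contrasts a read that allocates a new byte[] per call with one that 
fills a caller-owned scratch buffer. It is an illustration only: the class and 
method names are hypothetical, it is not Cassandra's ByteArrayAccessor, and it 
is not the change proposed in this ticket.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration only; these are not Cassandra's classes.
final class ReadSketch {
    /** Allocating style: every call creates a fresh byte[] of the requested length. */
    static byte[] readAllocating(DataInput in, int length) throws IOException {
        byte[] value = new byte[length];
        in.readFully(value);
        return value;
    }

    /** Reusing style: the caller owns the scratch buffer; only the first `length` bytes are valid. */
    static void readInto(DataInput in, byte[] scratch, int length) throws IOException {
        if (length > scratch.length) {
            throw new IllegalArgumentException("scratch buffer too small");
        }
        in.readFully(scratch, 0, length);
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5, 6};
        DataInput in = new DataInputStream(new ByteArrayInputStream(source));
        byte[] scratch = new byte[16]; // allocated once, reused for every value
        for (int i = 0; i < 3; i++) {
            readInto(in, scratch, 2);
            System.out.println(scratch[0] + "," + scratch[1]);
        }
    }
}
```

The reusing style is only safe when the caller finishes with each value before 
the next read, since the buffer contents are overwritten. Whether that holds 
on the compaction path is exactly the kind of question the ticket has to 
answer.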


