[ 
https://issues.apache.org/jira/browse/CASSANDRA-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935226#comment-17935226
 ] 

Branimir Lambov commented on CASSANDRA-20428:
---------------------------------------------

Hi Nitsan, great to see you here again. The cursor compaction code in 
DataStax's branch is indeed based on some of your ideas and is [already 
public|https://github.com/datastax/cassandra/commit/e8f609913ee17cc950196a58077166a34d661e3d]
 (the whole datastax/cassandra repository is). See [the included Markdown 
documentation|https://github.com/datastax/cassandra/blob/e8f609913ee17cc950196a58077166a34d661e3d/src/java/org/apache/cassandra/io/sstable/compaction/cursors.md]
 for details on what it does and how.

In the context of this ticket, this patch is not enough on its own, as it still 
constructs rows in memory for writing, but it can be used directly or as a 
blueprint to solve the reading and merging part. One of the things we realized 
during development is that there is a limit to the impact this kind of rewrite 
can have. To improve performance further, we would have to look at adding more 
metadata so that sections of the file can be copied without modification. There 
are currently multiple barriers to that, most immediately the fact that we 
write some fields as offsets from an sstable-level minimum, which changes 
during compactions.
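
To illustrate that last barrier, here is a minimal sketch of why a field stored 
as an offset from a file-level minimum cannot be copied byte-for-byte into a 
file with a different minimum. The class and method names are hypothetical, and 
the fixed-width long encoding is a simplification chosen for clarity, not the 
actual sstable format.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not Cassandra's serialization code.
final class DeltaEncodingSketch {
    // Store a value as its offset from the file-level minimum.
    static byte[] encode(long value, long fileMin) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value - fileMin).array();
    }

    // Recover the value by adding the file-level minimum back.
    static long decode(byte[] bytes, long fileMin) {
        return ByteBuffer.wrap(bytes).getLong() + fileMin;
    }

    public static void main(String[] args) {
        long timestamp = 1_700_000_000_500L;
        long inputMin  = 1_700_000_000_000L; // minimum of the input file
        long outputMin = 1_699_999_999_000L; // lower minimum after merging with older data

        byte[] inInput  = encode(timestamp, inputMin);
        byte[] inOutput = encode(timestamp, outputMin);

        // Same logical value, different bytes on disk.
        System.out.println(Arrays.equals(inInput, inOutput)); // false

        // Copying the input bytes verbatim and reading them against the
        // output file's minimum yields the wrong value.
        System.out.println(decode(inInput, outputMin) == timestamp); // false
    }
}
```

Under these assumptions, any verbatim copy would need either a stable base that 
survives compaction or extra per-section metadata recording the base the bytes 
were written against.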

> Eliminate byte array allocation in ByteArrayAccessor.read
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-20428
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20428
>             Project: Apache Cassandra
>          Issue Type: Improvement
>            Reporter: Jon Haddad
>            Priority: Normal
>         Attachments: allocation-reverse.html, 
> image-2025-03-11-11-05-55-378.png
>
>
> During compaction we allocate a new byte[] in ByteArrayAccessor.read.  This 
> is one of the hottest paths in the codebase, hit during writes, compaction, 
> creating tables, and possibly others.  In my performance tests using default 
> compaction settings of 64MB I see this responsible for 40% of allocations.  
> This is largely what drives GC pause frequency and duration.  If we are able 
> to eliminate the O(N) allocations performed here, this might be one of the 
> best optimizations we could do for the number of things it touches.
> Allocation profile attached.
> !image-2025-03-11-11-05-55-378.png|width=514,height=269!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
