[
https://issues.apache.org/jira/browse/CASSANDRA-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935262#comment-17935262
]
Nitsan Wakart edited comment on CASSANDRA-20428 at 3/13/25 4:06 PM:
--------------------------------------------------------------------
IIUC the DS trunk contains work that is not on the Cassandra trunk, and this is
one such patch.
> One of the things we realized during development is that there is still a
> limit on the impact that this kind of rewrite can have
Yes. I think the limit of this approach (or indeed any compaction optimisation
within the current format) is the inability to copy verbatim those
partitions/rows which are known to be unchanged. The wins are still
considerable before we hit that wall, though, and format changes are possible
as a follow-up.
There are also setup/teardown costs to compaction that are not addressed in
this work. You can observe those by running a compaction benchmark with 0 rows.
I don't have good numbers for those at the moment, but in a non-representative
setup they show up as ~20-30% of the time spent. These can be tackled
separately.
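As a rough illustration of the 0-row idea, here is a minimal sketch of how the
fixed share could be estimated. Everything in it is hypothetical:
FixedOverheadSketch, fakeCompaction, timeNanos and fixedShare are made-up names,
and fakeCompaction is only a stand-in for a real compaction run, not Cassandra
code. A real measurement would want a proper harness with warmup (e.g. JMH).

```java
import java.util.function.IntConsumer;

/** Illustrative only: estimates the fixed (setup/teardown) share of a run. */
final class FixedOverheadSketch {

    /** Times a single invocation of the job for the given row count. */
    static long timeNanos(IntConsumer job, int rows) {
        long start = System.nanoTime();
        job.accept(rows);
        return System.nanoTime() - start;
    }

    /**
     * Fraction of the full run attributable to fixed costs, using the
     * 0-row run as the estimate of those costs. Capped at 1.0 because
     * noisy timings can make the 0-row run look slower than the full one.
     */
    static double fixedShare(long zeroRowNanos, long fullRunNanos) {
        if (zeroRowNanos < 0 || fullRunNanos <= 0)
            throw new IllegalArgumentException("timings must be >= 0 and the full run > 0");
        return Math.min(1.0, (double) zeroRowNanos / fullRunNanos);
    }

    /** Hypothetical stand-in for a compaction: fixed setup plus per-row work. */
    static void fakeCompaction(int rows) {
        byte[] setup = new byte[1 << 20]; // stands in for opening readers/writers
        long sink = 0;
        for (int i = 0; i < rows; i++)
            sink += setup[i % setup.length] + i;
        if (sink == 42) System.out.println(sink); // keep the loop from being optimised away
    }

    public static void main(String[] args) {
        long zero = timeNanos(FixedOverheadSketch::fakeCompaction, 0);
        long full = Math.max(1, timeNanos(FixedOverheadSketch::fakeCompaction, 5_000_000));
        System.out.printf("fixed share ~ %.0f%%%n", 100 * fixedShare(zero, full));
    }
}
```

The number printed by this toy says nothing about Cassandra itself; it only
shows the arithmetic behind "run with 0 rows and compare".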
I hope that having a dedicated focused compaction implementation can help
enable further optimizations in this important area. One of the challenges for
this work was/is trying to deduce which parts are related to the compaction
process. Hopefully this will end up simplifying the problem as the
implementation specialises.
> Eliminate byte array allocation in ByteArrayAccessor.read
> ---------------------------------------------------------
>
> Key: CASSANDRA-20428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20428
> Project: Apache Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad
> Priority: Normal
> Attachments: allocation-reverse.html,
> image-2025-03-11-11-05-55-378.png
>
>
> During compaction we allocate a new byte[] in ByteArrayAccessor.read. This
> is one of the hottest paths in the codebase, hit during writes, compaction,
> creating tables, and possibly others. In my performance tests using default
> compaction settings of 64MB I see this responsible for 40% of allocations.
> This is largely what drives GC pause frequency and duration. If we are able
> to eliminate the O(N) allocations performed here, this might be one of the
> best optimizations we could do for the number of things it touches.
> Allocation profile attached.
>
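The allocation pattern described in the issue above can be illustrated with a
small sketch. This is not Cassandra's ByteArrayAccessor; ReadAllocationSketch,
encode, scanAllocating and scanReusing are hypothetical names, and the
[int length][bytes] framing is an assumption made for the example. It contrasts
allocating a fresh byte[] per value with reusing a scratch buffer, which is only
safe when the consumer does not hold on to the array after the read.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: per-value allocation versus a reused scratch buffer. */
final class ReadAllocationSketch {
    private static final byte[] EMPTY = new byte[0];

    /** Encodes each value as an [int length][bytes] frame (assumed format). */
    static ByteBuffer encode(String... values) {
        byte[][] encoded = new byte[values.length][];
        int size = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += Integer.BYTES + encoded[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] e : encoded)
            buf.putInt(e.length).put(e);
        buf.flip();
        return buf;
    }

    /** Returns {checksum, allocations}: one new byte[] per value read. */
    static long[] scanAllocating(ByteBuffer in) {
        long checksum = 0, allocations = 0;
        while (in.remaining() >= Integer.BYTES) {
            int length = in.getInt();
            byte[] value = new byte[length]; // O(N) allocations overall
            allocations++;
            in.get(value);
            for (int i = 0; i < length; i++) checksum += value[i];
        }
        return new long[] { checksum, allocations };
    }

    /** Returns {checksum, allocations}: scratch grows only when too small. */
    static long[] scanReusing(ByteBuffer in) {
        long checksum = 0, allocations = 0;
        byte[] scratch = EMPTY;
        while (in.remaining() >= Integer.BYTES) {
            int length = in.getInt();
            if (scratch.length < length) {
                scratch = new byte[Math.max(length, scratch.length * 2)];
                allocations++;
            }
            in.get(scratch, 0, length); // only the first `length` bytes are valid
            for (int i = 0; i < length; i++) checksum += scratch[i];
        }
        return new long[] { checksum, allocations };
    }
}
```

Both scans see the same bytes; the difference is only in how many arrays are
created along the way. Whether reuse is applicable on the real read path
depends on who retains the returned array, which this sketch does not model.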