[
https://issues.apache.org/jira/browse/CASSANDRA-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935321#comment-17935321
]
Jon Haddad edited comment on CASSANDRA-20428 at 3/13/25 6:47 PM:
-----------------------------------------------------------------
It's awesome to see enthusiasm to get this done and I'm doubly excited to learn
about the different efforts already underway to address it. Thank you all for
chiming in here!
[~nitsanw] do you have a branch you can share? Did you do your work on top of
CASSANDRA-20092? I don't know the code path well enough to guess whether
being able to support only BIG and not BTI is related to reading from the
index.
Given that so much work has already been done, proven out, and presumably works
with JDK 17, it makes the most sense to be more surgical and address the
allocation issue using the above proposals. I believe arenas are likely to be
the best long-term fix, but they should probably be part of a larger
architectural discussion where we can flesh out the optimal way to approach
the intersection of allocation, data file format, SIMD, and better management
of object lifetimes.
I'm going to spend some time looking more closely at what arenas can do for us.
At the very least, they let us completely avoid the memory issues associated
with a leaked buffer, they keep allocations completely in user space, and they
are far more CPU-cache-friendly than individual allocations. I doubt there's a
better long-term approach than allocating all request-scoped objects in a
user-space, off-heap stack allocator that's not subject to GC.
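To make the idea of a request-scoped, off-heap stack allocator concrete, here
is a minimal sketch. It is not Cassandra code and it is not the JDK's own Arena
API; the class name BumpArena, its methods, and the sizes are all hypothetical.
It assumes a single direct ByteBuffer as the backing slab and hands out slices
by bumping an offset, so everything allocated for a request is released with
one reset.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical request-scoped bump ("stack") allocator over one off-heap slab.
 * Illustrative only: names and sizes are assumptions, not Cassandra's code.
 */
final class BumpArena {
    private final ByteBuffer slab; // one direct (off-heap) buffer for the whole request
    private int top = 0;           // offset of the next free byte in the slab

    BumpArena(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Hands out a slice of the slab; allocating is just an offset bump. */
    ByteBuffer allocate(int size) {
        if (size < 0 || size > slab.capacity() - top)
            throw new IllegalStateException("arena exhausted");
        ByteBuffer view = slab.duplicate(); // shares memory, independent position/limit
        view.position(top).limit(top + size);
        top += size;
        return view.slice();
    }

    /** Number of bytes handed out since the last reset. */
    int used() {
        return top;
    }

    /** Releases everything at once; slices handed out earlier must not be used afterwards. */
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpArena arena = new BumpArena(1024);
        ByteBuffer key = arena.allocate(16);
        ByteBuffer value = arena.allocate(64);
        key.putLong(0, 42L);
        value.putInt(0, 7);
        System.out.println(arena.used()); // prints 80
        arena.reset();                    // end of request: all memory reusable
        System.out.println(arena.used()); // prints 0
    }
}
```

Note that in this sketch the slice objects themselves are still small heap
objects; only the payload bytes live off heap. A real design would also need
to decide how to enforce that nothing outlives the reset.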
> Eliminate byte array allocation in ByteArrayAccessor.read
> ---------------------------------------------------------
>
> Key: CASSANDRA-20428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20428
> Project: Apache Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad
> Priority: Normal
> Attachments: allocation-reverse.html,
> image-2025-03-11-11-05-55-378.png
>
>
> During compaction we allocate a new byte[] in ByteArrayAccessor.read. This
> is one of the hottest paths in the codebase, hit during writes, compaction,
> creating tables, and possibly others. In my performance tests using default
> compaction settings of 64MB I see this is responsible for 40% of allocations.
> This is largely what drives GC pause frequency and duration. If we are able
> to eliminate the O(N) allocations performed here, this might be one of the
> best optimizations we could do, given the number of things it touches.
> Allocation profile attached.
>
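For readers unfamiliar with the pattern described in the issue, here is a
rough sketch of one general way to avoid a fresh byte[] per read: reuse a
scratch buffer that grows only when needed. This is not the actual
ByteArrayAccessor code and not necessarily what the proposals in this ticket
do; ScratchReader and readInto are hypothetical names, and the sketch assumes
the caller is finished with the bytes before the next read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

/**
 * Hypothetical reader that reuses one scratch array across reads instead of
 * allocating a new byte[] for every value. Illustrative only.
 */
final class ScratchReader {
    private byte[] scratch = new byte[64]; // grows on demand, never shrinks

    /**
     * Reads `length` bytes into the shared scratch array and returns it.
     * Only the first `length` bytes are meaningful, and they are overwritten
     * by the next call, so callers must not retain the array.
     */
    byte[] readInto(DataInput in, int length) throws IOException {
        if (scratch.length < length)
            scratch = new byte[Math.max(length, scratch.length * 2)];
        in.readFully(scratch, 0, length);
        return scratch;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        DataInput in = new DataInputStream(new ByteArrayInputStream(data));
        ScratchReader reader = new ScratchReader();
        byte[] first = reader.readInto(in, 2);  // scratch now starts with 1, 2
        byte[] second = reader.readInto(in, 3); // scratch now starts with 3, 4, 5
        System.out.println(first == second);    // prints true: same array, no new allocation
    }
}
```

The trade-off is aliasing: any code path that keeps a reference to the
returned bytes would have to copy them, so the allocation is only eliminated
where the value's lifetime is shorter than the next read.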