[
https://issues.apache.org/jira/browse/CASSANDRA-16471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291891#comment-17291891
]
David Capwell commented on CASSANDRA-16471:
-------------------------------------------
{quote}The changes in CommitLog and UnfilteredSerializer, while not incorrect, look like they would induce new duplications and therefore new BB creations. Was that intended?{quote}
Sadly, yes. Before, we took the raw backing array, but in the direct case we no longer have a raw array, so I felt it's better to allocate a ByteBuffer than to copy into a heap array. Both cases boil down to
org.apache.cassandra.utils.FastByteOperations.UnsafeOperations#copy(java.nio.ByteBuffer,
int, java.nio.ByteBuffer, int, int), which special-cases direct buffers (so it's better to stay direct).
Now, since we have been writing to the buffer we need to flip it, so one possible optimization (which didn't look like a big deal) would be to return the underlying BB flipped, as the call paths no longer need them.
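To illustrate the trade-off (a minimal sketch; the class shape and method names are assumptions, not the actual patch):
{code:java}
import java.nio.ByteBuffer;

class ScratchSketch
{
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 16);

    void write(byte[] bytes)
    {
        buffer.put(bytes); // real code would expand to fit first
    }

    // Cheap: duplicate() shares the backing memory, so the payload is not
    // copied; only a small view object is created, and the view stays direct.
    ByteBuffer asByteBuffer()
    {
        ByteBuffer view = buffer.duplicate();
        view.flip(); // we have been writing, so flip before handing it out
        return view;
    }

    // Expensive: copying out of a direct buffer materializes all the bytes
    // on heap, which is the allocation pressure the patch tries to avoid.
    byte[] toByteArray()
    {
        ByteBuffer view = buffer.duplicate();
        view.flip();
        byte[] copy = new byte[view.remaining()];
        view.get(copy);
        return copy;
    }
}
{code}
Since {{FastByteOperations}} special-cases direct-to-direct copies, handing out the direct view keeps the subsequent copy on that fast path.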
{quote} - Do we now need to make a call to {{FileUtils.clean()}} in {{DataOutputBuffer#close()}}, given we might have a direct buffer on hand?{quote}
It's mostly to make sure we free early, since we know the buffer is no longer needed. GC will free the memory, but that can happen whenever, so it felt best to free eagerly given we know it's dead memory. {{clean}} also checks for direct buffers, so it's safe to call in the non-direct case as well.
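A minimal sketch of that close path (the {{buffer}} field and class layout are illustrative, not the actual patch):
{code:java}
import java.nio.ByteBuffer;
import org.apache.cassandra.io.util.FileUtils;

class DirectScratchSketch implements AutoCloseable
{
    private ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 16);

    @Override
    public void close()
    {
        // Free the native memory eagerly instead of waiting for GC to get
        // around to it. FileUtils.clean(ByteBuffer) only acts on direct
        // buffers, so it is a safe no-op for heap buffers as well.
        FileUtils.clean(buffer);
        buffer = null; // drop the reference so the dead memory is never touched
    }
}
{code}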
{quote} - We're now forcing the recycling of arbitrarily large buffer sizes by
default, which I suppose makes it harder to know even per-thread how much
memory (on or off-heap) we might consume. What if we just increased the default
{{dob_max_recycle_bytes}} instead?{quote}
Yeah, this has the issue that memory isn't controlled. One alternative is to use {{BufferPool}} as a means of better control (though that isn't bounded memory either; at least it would be better in theory, though it depends, as I remember there was an issue where that class could hold a lot of wasted memory).
We could raise the default limit, but we would still have the issue of pushing the problem onto GC, and since all input is user-generated, the space needed is unbounded.
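For context, a minimal sketch of the recycle-or-discard pattern that {{dob_max_recycle_bytes}} caps (names are illustrative, not the exact code):
{code:java}
import java.nio.ByteBuffer;

final class RecyclingScratch
{
    // Buffers above this size are discarded after use instead of kept per
    // thread; raising the cap keeps bigger buffers alive (fewer allocations)
    // but makes worst-case per-thread memory harder to reason about.
    private static final int MAX_RECYCLE_BYTES = 1024 * 1024;

    private static final ThreadLocal<ByteBuffer> scratch =
        ThreadLocal.withInitial(() -> ByteBuffer.allocate(128));

    static ByteBuffer get(int size)
    {
        ByteBuffer buffer = scratch.get();
        if (buffer.capacity() < size)
        {
            buffer = ByteBuffer.allocate(size); // the churn the ticket measured
            scratch.set(buffer);
        }
        buffer.clear();
        return buffer;
    }

    static void release(ByteBuffer buffer)
    {
        // Oversized buffers are dropped so GC can reclaim them, pushing the
        // cost back onto the allocator the next time a large value arrives.
        if (buffer.capacity() > MAX_RECYCLE_BYTES)
            scratch.remove();
    }
}
{code}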
For the nits, +1; will change.
> org.apache.cassandra.io.util.DataOutputBuffer#scratchBuffer is around 50% of
> all memory allocations
> ---------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-16471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16471
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Caching
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Attachments: Screen Shot 2021-02-25 at 3.34.28 PM.png, Screen Shot
> 2021-02-25 at 4.14.19 PM.png
>
>
> While running workflows to compare 3.0 with trunk, we found that allocations
> and GC are significantly higher for a write-mostly workload (22% read, 3%
> delete, 75% write); below is what we saw for a 2h run
> Allocations
> 3.0: 1.64TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0: 13.93k events
> When looking at the allocation output, we saw the following for memory allocations
> !https://issues.apache.org/jira/secure/attachment/13021238/Screen%20Shot%202021-02-25%20at%203.34.28%20PM.png!
> Here we see that org.apache.cassandra.io.util.DataOutputBuffer#expandToFit is
> around 52% of the memory allocations. Looking at this logic, I see that the
> allocations are on-heap and the buffer is constantly thrown away (as a means
> to allow GC to clean up).
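> A minimal sketch of that grow-and-discard logic (illustrative, not the exact method body):
> {code:java}
> import java.nio.ByteBuffer;
>
> class GrowingBuffer
> {
>     private ByteBuffer buffer = ByteBuffer.allocate(128);
>
>     // Every write that does not fit allocates a brand-new, larger heap
>     // array and immediately turns the old one into garbage; under a
>     // write-heavy workload this is constant allocation churn.
>     void expandToFit(long count)
>     {
>         if (buffer.remaining() >= count)
>             return;
>         int newSize = Math.max(buffer.capacity() * 2,
>                                buffer.position() + (int) count);
>         ByteBuffer grown = ByteBuffer.allocate(newSize); // on-heap
>         buffer.flip();
>         grown.put(buffer);  // copy old contents
>         buffer = grown;     // old buffer is thrown away for GC
>     }
> }
> {code}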
> With the patch, allocations/GC are the following
> Allocations
> 3.0: 1.64TB
> 4.0 w/ patch: 1.77TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0 w/ patch: 8k events
> 4.0: 13.93k events
> With the patch, this path accounts for only 0.8% of allocations
> !https://issues.apache.org/jira/secure/attachment/13021239/Screen%20Shot%202021-02-25%20at%204.14.19%20PM.png!