[
https://issues.apache.org/jira/browse/CASSANDRA-16471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291891#comment-17291891
]
David Capwell commented on CASSANDRA-16471:
-------------------------------------------
{quote}The changes in CommitLog and UnfilteredSerializer, while not incorrect, look like they would induce new duplications and therefore new BB creations. Was that intended?{quote}
Sadly, yes. Before, we took the raw backing array, but in the direct case we no longer have a raw array, so I felt it's better to allocate a ByteBuffer than to copy into a heap array. Both cases boil down to
org.apache.cassandra.utils.FastByteOperations.UnsafeOperations#copy(java.nio.ByteBuffer,
int, java.nio.ByteBuffer, int, int), which special-cases direct buffers (so it's better to stay direct).
Now, since we have been writing to the buffer we need to flip it, so one possible optimization (which didn't look like a big deal) would be to return the underlying BB flipped, as the call paths no longer need them.
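To illustrate the trade-off (a minimal sketch; the class shape and method names are assumptions, not the actual patch):
{code:java}
import java.nio.ByteBuffer;

class ScratchSketch
{
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 16);

    void write(byte[] bytes)
    {
        buffer.put(bytes); // real code would expand to fit first
    }

    // Cheap: duplicate() shares the backing memory, so the payload is not
    // copied; only a small view object is created, and the view stays direct.
    ByteBuffer asByteBuffer()
    {
        ByteBuffer view = buffer.duplicate();
        view.flip(); // we have been writing, so flip before handing it out
        return view;
    }

    // Expensive: copying out of a direct buffer materializes all the bytes
    // on heap, which is the allocation pressure the patch tries to avoid.
    byte[] toByteArray()
    {
        ByteBuffer view = buffer.duplicate();
        view.flip();
        byte[] copy = new byte[view.remaining()];
        view.get(copy);
        return copy;
    }
}
{code}
Since {{FastByteOperations}} special-cases direct-to-direct copies, handing out the direct view keeps the subsequent copy on that fast path.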
{quote} - Do we now need to make a call to {{FileUtils.clean()}} in {{DataOutputBuffer#close()}}, given we might have a direct buffer on hand?{quote}
It's mostly to make sure we free early, since we know the buffer is no longer needed. GC will free the memory, but that can happen whenever, so it felt best to free eagerly given we know it's dead memory. {{clean}} also checks for direct buffers, so it's safe to call in the non-direct case as well.
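A minimal sketch of that close path (the {{buffer}} field and class layout are illustrative, not the actual patch):
{code:java}
import java.nio.ByteBuffer;
import org.apache.cassandra.io.util.FileUtils;

class DirectScratchSketch implements AutoCloseable
{
    private ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 16);

    @Override
    public void close()
    {
        // Free the native memory eagerly instead of waiting for GC to get
        // around to it. FileUtils.clean(ByteBuffer) only acts on direct
        // buffers, so it is a safe no-op for heap buffers as well.
        FileUtils.clean(buffer);
        buffer = null; // drop the reference so the dead memory is never touched
    }
}
{code}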
{quote} - We're now forcing the recycling of arbitrarily large buffer sizes by
default, which I suppose makes it harder to know even per-thread how much
memory (on or off-heap) we might consume. What if we just increased the default
{{dob_max_recycle_bytes}} instead?{quote}
Yeah, this has the issue that memory isn't controlled. One alternative is to use {{BufferPool}} as a means of better control (though that isn't bounded memory either; at least it would be better in theory, though it depends, as I remember there was an issue where that class could hold a lot of wasted memory).
We could raise the default limit, but we would still have the issue of pushing the problem onto GC, and since all input is user-generated, the space needed is unbounded.
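For context, a minimal sketch of the recycle-or-discard pattern that {{dob_max_recycle_bytes}} caps (names are illustrative, not the exact code):
{code:java}
import java.nio.ByteBuffer;

final class RecyclingScratch
{
    // Buffers above this size are discarded after use instead of kept per
    // thread; raising the cap keeps bigger buffers alive (fewer allocations)
    // but makes worst-case per-thread memory harder to reason about.
    private static final int MAX_RECYCLE_BYTES = 1024 * 1024;

    private static final ThreadLocal<ByteBuffer> scratch =
        ThreadLocal.withInitial(() -> ByteBuffer.allocate(128));

    static ByteBuffer get(int size)
    {
        ByteBuffer buffer = scratch.get();
        if (buffer.capacity() < size)
        {
            buffer = ByteBuffer.allocate(size); // the churn the ticket measured
            scratch.set(buffer);
        }
        buffer.clear();
        return buffer;
    }

    static void release(ByteBuffer buffer)
    {
        // Oversized buffers are dropped so GC can reclaim them, pushing the
        // cost back onto the allocator the next time a large value arrives.
        if (buffer.capacity() > MAX_RECYCLE_BYTES)
            scratch.remove();
    }
}
{code}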
For the nits, +1; will change.
> org.apache.cassandra.io.util.DataOutputBuffer#scratchBuffer is around 50% of
> all memory allocations
> ---------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-16471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16471
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Caching
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Attachments: Screen Shot 2021-02-25 at 3.34.28 PM.png, Screen Shot
> 2021-02-25 at 4.14.19 PM.png
>
>
> While running workflows to compare 3.0 with trunk, we found that allocations
> and GC are significantly higher for a write-mostly workload (22% read, 3%
> delete, 75% write); below is what we saw for a 2h run
> Allocations
> 3.0: 1.64TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0: 13.93k events
> When looking at the allocation output, we saw the following for memory allocations
> !https://issues.apache.org/jira/secure/attachment/13021238/Screen%20Shot%202021-02-25%20at%203.34.28%20PM.png!
> Here we see that org.apache.cassandra.io.util.DataOutputBuffer#expandToFit is
> around 52% of the memory allocations. Looking at this logic, I see that the
> allocations are on-heap and the buffer is constantly thrown away (as a means
> to allow GC to clean up).
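> A minimal sketch of that grow-and-discard logic (illustrative, not the exact method body):
> {code:java}
> import java.nio.ByteBuffer;
>
> class GrowingBuffer
> {
>     private ByteBuffer buffer = ByteBuffer.allocate(128);
>
>     // Every write that does not fit allocates a brand-new, larger heap
>     // array and immediately turns the old one into garbage; under a
>     // write-heavy workload this is constant allocation churn.
>     void expandToFit(long count)
>     {
>         if (buffer.remaining() >= count)
>             return;
>         int newSize = Math.max(buffer.capacity() * 2,
>                                buffer.position() + (int) count);
>         ByteBuffer grown = ByteBuffer.allocate(newSize); // on-heap
>         buffer.flip();
>         grown.put(buffer);  // copy old contents
>         buffer = grown;     // old buffer is thrown away for GC
>     }
> }
> {code}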
> With the patch, allocations/GC are the following
> Allocations
> 3.0: 1.64TB
> 4.0 w/ patch: 1.77TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0 w/ patch: 8k events
> 4.0: 13.93k events
> With the patch, this path accounts for only 0.8% of allocations
> !https://issues.apache.org/jira/secure/attachment/13021239/Screen%20Shot%202021-02-25%20at%204.14.19%20PM.png!