[
https://issues.apache.org/jira/browse/CASSANDRA-16471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291885#comment-17291885
]
Caleb Rackliffe commented on CASSANDRA-16471:
---------------------------------------------
Finished a first pass of review. The performance results are very nice,
although I'm not exactly sure _how much of the improvement was due to allowing
the reuse of arbitrarily large buffers and how much was due to moving the
scratch buffer off-heap_.
*possibly important things:*
- The changes in {{CommitLog}} and {{UnfilteredSerializer}}, while not
incorrect, look like they would induce new duplications and therefore new BB
creations. Was that intended?
- Do we now need to make a call to {{FileUtils.clean()}} in
{{DataOutputBuffer#close()}}, given we might have a direct buffer on hand?
- We're now forcing the recycling of arbitrarily large buffer sizes by default,
which I suppose makes it harder to know even per-thread how much memory (on- or
off-heap) we might consume. What if we just increased the default
{{dob_max_recycle_bytes}} instead?
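On the {{FileUtils.clean()}} question above, here's a minimal sketch of the guard I have in mind for {{close()}}. The class and field names are made up for illustration; in Cassandra the actual cleaning call would be {{FileUtils.clean(buffer)}}, which releases a direct buffer's native memory eagerly instead of waiting on GC:

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: "ScratchHolder" and "cleaned" are invented names,
// not Cassandra's code. The point is the isDirect() guard in close(): heap
// buffers need no explicit cleaning, direct buffers do.
class ScratchHolder implements AutoCloseable
{
    private ByteBuffer buffer;
    boolean cleaned = false; // visible for the sketch; real code has no flag

    ScratchHolder(boolean direct)
    {
        buffer = direct ? ByteBuffer.allocateDirect(64) : ByteBuffer.allocate(64);
    }

    @Override
    public void close()
    {
        if (buffer != null && buffer.isDirect())
        {
            // In Cassandra: FileUtils.clean(buffer) would go here.
            cleaned = true;
        }
        buffer = null;
    }
}
```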
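To make the last point concrete, a hedged sketch of the size-capped recycling I'm suggesting. Only {{dob_max_recycle_bytes}} comes from the existing code; {{RecyclePolicy}} and {{shouldRecycle}} are illustrative names:

```java
// Illustrative sketch of a size-capped recycle policy: buffers larger than
// the cap are discarded rather than kept per-thread, which bounds worst-case
// per-thread memory even when arbitrarily large buffers are requested.
final class RecyclePolicy
{
    private final long maxRecycleBytes; // analogous to dob_max_recycle_bytes

    RecyclePolicy(long maxRecycleBytes)
    {
        this.maxRecycleBytes = maxRecycleBytes;
    }

    // Keep the buffer for reuse only if it is no larger than the cap.
    boolean shouldRecycle(int bufferCapacity)
    {
        return bufferCapacity <= maxRecycleBytes;
    }
}
```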
*nits:*
- unused import (although not because of this patch) {{import
org.apache.cassandra.utils.memory.MemoryUtil;}} on line 32 of
{{BufferedDataOutputStreamPlus}}
- unused import {{import
org.apache.cassandra.config.CassandraRelevantProperties;}} on line 29 of
{{DataOutputBuffer}}
- just personal taste, but an alignment like this might be nice for the body of
allocate in {{DataOutputBuffer}}:
{noformat}
protected ByteBuffer allocate(int size)
{
    return ALLOCATION_TYPE == AllocationType.DIRECT
           ? ByteBuffer.allocateDirect(size)
           : ByteBuffer.allocate(size);
}
{noformat}
> org.apache.cassandra.io.util.DataOutputBuffer#scratchBuffer is around 50% of
> all memory allocations
> ---------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-16471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16471
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Caching
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Attachments: Screen Shot 2021-02-25 at 3.34.28 PM.png, Screen Shot
> 2021-02-25 at 4.14.19 PM.png
>
>
> While running workflows to compare 3.0 with trunk, we found that allocations
> and GC are significantly higher for a write-mostly workload (22% read, 3%
> delete, 75% write); below is what we saw for a 2h run
> Allocations
> 3.0: 1.64TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0: 13.93k events
> When looking at the allocation output we saw the following for memory allocations
> !https://issues.apache.org/jira/secure/attachment/13021238/Screen%20Shot%202021-02-25%20at%203.34.28%20PM.png!
> Here we see that org.apache.cassandra.io.util.DataOutputBuffer#expandToFit
> accounts for around 52% of the memory allocations. Looking at this logic, the
> allocations are on-heap and the buffer is constantly thrown away (as a means
> to allow GC to clean up).
> With the patch, allocations/GC are the following
> Allocations
> 3.0: 1.64TB
> 4.0 w/ patch: 1.77TB
> 4.0: 2.99TB
> GC Events
> 3.0: 7.39k events
> 4.0 w/ patch: 8k events
> 4.0: 13.93k events
> With the patch, expandToFit accounts for only 0.8% of allocations
> !https://issues.apache.org/jira/secure/attachment/13021239/Screen%20Shot%202021-02-25%20at%204.14.19%20PM.png!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)