[ 
https://issues.apache.org/jira/browse/CASSANDRA-16471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291885#comment-17291885
 ] 

Caleb Rackliffe commented on CASSANDRA-16471:
---------------------------------------------

Finished a first pass of review. The performance results are very nice, 
although I'm not exactly sure _how much of the improvement was due to allowing 
the reuse of arbitrarily large buffers and how much was moving the scratch 
buffer off-heap_.

*possibly important things:*

- The changes in {{CommitLog}} and {{UnfilteredSerializer}}, while not 
incorrect, look like they would induce new duplications and therefore new BB 
creations. Was that intended?
- Do we now need to make a call to {{FileUtils.clean()}} in 
{{DataOutputBuffer#close()}}, given we might have a direct buffer on hand?
- We're now forcing the recycling of arbitrarily large buffer sizes by default, 
which I suppose makes it harder to know even per-thread how much memory (on or 
off-heap) we might consume. What if we just increased the default 
{{dob_max_recycle_bytes}} instead?
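
To make the {{close()}} question above concrete, here is a minimal sketch of the pattern being suggested. The class and method names are illustrative only (the real class is {{org.apache.cassandra.io.util.DataOutputBuffer}} and the real cleaner is {{FileUtils.clean(ByteBuffer)}}); the point is that a direct buffer holds native memory that should be released explicitly on close rather than left to GC:

```java
import java.nio.ByteBuffer;

// Illustrative sketch, not the patch itself: a holder that may own either
// an on-heap or a direct scratch buffer, and releases native memory on close.
public class ScratchBufferSketch implements AutoCloseable
{
    private ByteBuffer buffer;

    public ScratchBufferSketch(int size, boolean direct)
    {
        buffer = direct ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public boolean isOpen()
    {
        return buffer != null;
    }

    public boolean isDirect()
    {
        return buffer != null && buffer.isDirect();
    }

    @Override
    public void close()
    {
        // On-heap buffers can simply be dropped for the GC to collect; a
        // direct buffer pins native memory, so the real implementation would
        // call FileUtils.clean(buffer) here before discarding the reference.
        if (buffer != null && buffer.isDirect())
            clean(buffer); // stand-in for FileUtils.clean(buffer)
        buffer = null;
    }

    private static void clean(ByteBuffer bb)
    {
        // Placeholder: Cassandra's FileUtils.clean invokes the buffer's
        // Cleaner via internal JDK APIs; omitted here to keep the sketch
        // portable across JDK versions.
    }
}
```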

*nits:*

- unused import (although not because of this patch) {{import 
org.apache.cassandra.utils.memory.MemoryUtil;}} on line 32 of 
{{BufferedDataOutputStreamPlus}}
- unused import {{import 
org.apache.cassandra.config.CassandraRelevantProperties;}} on line 29 of 
{{DataOutputBuffer}}
- just personal taste, but an alignment like this might be nice for the body of 
allocate in {{DataOutputBuffer}}:

{noformat}
protected ByteBuffer allocate(int size)
{
    return ALLOCATION_TYPE == AllocationType.DIRECT ? ByteBuffer.allocateDirect(size)
                                                    : ByteBuffer.allocate(size);
}
{noformat}

> org.apache.cassandra.io.util.DataOutputBuffer#scratchBuffer is around 50% of 
> all memory allocations
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16471
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16471
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Caching
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>         Attachments: Screen Shot 2021-02-25 at 3.34.28 PM.png, Screen Shot 
> 2021-02-25 at 4.14.19 PM.png
>
>
> While running workflows to compare 3.0 with trunk we found that allocations 
> and GC are significantly higher for a write mostly workload (22% read, 3% 
> delete, 75% write); below is what we saw for a 2h run
> Allocations
> 30: 1.64TB
> 40: 2.99TB
> GC Events
> 30: 7.39k events
> 40: 13.93k events
> When looking at the allocation output, we saw the following for memory allocations:
> !https://issues.apache.org/jira/secure/attachment/13021238/Screen%20Shot%202021-02-25%20at%203.34.28%20PM.png!
> Here we see that org.apache.cassandra.io.util.DataOutputBuffer#expandToFit is 
> around 52% of the memory allocations.  When looking at this logic I see that 
> allocations are on-heap and constantly throw away the buffer (as a means to 
> allow GC to clean up).
> With the patch, allocations/gc are the following
> Allocations
> 30: 1.64TB
> 40 w/ patch: 1.77TB
> 40: 2.99TB
> GC Events
> 30: 7.39k events
> 40 w/ patch: 8k events
> 40: 13.93k events
> With the patch, this is only 0.8% of allocations
> !https://issues.apache.org/jira/secure/attachment/13021239/Screen%20Shot%202021-02-25%20at%204.14.19%20PM.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
