[jira] [Commented] (CASSANDRA-15389) Minimize BTree iterator allocations

Benedict Elliott Smith (Jira) Tue, 03 Dec 2019 17:06:51 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987402#comment-16987402
 ]


Benedict Elliott Smith commented on CASSANDRA-15389:
----------------------------------------------------

Thanks.  I'll try to find some time in the near future to undertake a full 
review.

bq. BTreeRow I had a bug in my hasComplexDeletion re-implementation that made 
reverse iteration / stop condition seem unnecessary. We actually do need both 
for hasComplexDeletion to work properly. Otherwise we’d only detect complex 
deletion if it’s on the final complex column.

So, I confess to not having looked closely enough to notice your bug, but (I 
think) I was not misled by it either.  I may have been unclear in my 
suggestion, since it was very terse.  The {{firstComplexIdx}} calculation is 
used in other places to avoid having to perform reverse iteration, since it 
gives the _lowest_ index in the btree at which any complex column data occurs.  
Since complex data sorts after simple, this gives the whole range of indices on 
which complex data occurs.  We would need to implement a 
{{Row}}/{{ColumnData}} variant of the feature we have previously only used in 
{{Columns}}, but it should map directly, and might permit us to avoid 
implementing this extra functionality.
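
To make the suggestion concrete, here is a minimal sketch of the idea. It is 
not Cassandra code: {{Cell}}, and the {{firstComplexIdx}} and 
{{hasComplexDeletion}} methods below, are hypothetical stand-ins that operate 
on a plain sorted list rather than a btree. The only property it relies on is 
the one described above, that complex data sorts after simple data.

```java
import java.util.List;

final class FirstComplexIdxSketch {
    // Hypothetical stand-in for ColumnData: whether the column is complex,
    // and whether it carries a complex deletion.
    static final class Cell {
        final String name;
        final boolean complex;
        final boolean hasDeletion;

        Cell(String name, boolean complex, boolean hasDeletion) {
            this.name = name;
            this.complex = complex;
            this.hasDeletion = hasDeletion;
        }
    }

    // Binary search for the lowest index holding complex data. Valid only
    // because all simple entries sort before all complex entries.
    // Returns sorted.size() when there is no complex data at all.
    static int firstComplexIdx(List<Cell> sorted) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).complex) hi = mid;
            else lo = mid + 1;
        }
        return lo;
    }

    // Forward scan over [firstComplexIdx, size): every complex entry is
    // visited, so no reverse iteration is needed to find a deletion.
    static boolean hasComplexDeletion(List<Cell> sorted) {
        for (int i = firstComplexIdx(sorted); i < sorted.size(); i++) {
            if (sorted.get(i).hasDeletion) return true;
        }
        return false;
    }
}
```

Because the scan covers the whole complex range, a deletion on any complex 
column is found, not only one on the final complex column.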

Does that make sense, or am I still missing something?



> Minimize BTree iterator allocations
> -----------------------------------
>
>                 Key: CASSANDRA-15389
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a large amount of garbage to the 
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods 
> (Row#digest, Row#validateData)
>  • adding a BTree accumulate method. Like the apply method, this method walks 
> the btree with a function that takes and returns a long argument. This 
> eliminates iterator allocations without adding helper object allocations 
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, 
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, 
> UnfilteredSerializer#serializedRowBodySize), and also eliminates the 
> allocation of helper objects in places where apply was used previously^[1]^.
>  • creating a map of columns in SerializationHeader, which lets us avoid 
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%
>  
> [1] the memory test does measure memory allocated by lambdas capturing objects
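
The accumulate approach described above can be sketched roughly as follows. 
This is an illustration only, not the patch: {{LongAccumulator}} and 
{{accumulate}} are hypothetical names, and the nested {{Object[]}} layout 
(any element that is itself an {{Object[]}} is treated as a child node) is a 
simplification, not Cassandra's actual btree layout. The point is that the 
running total is threaded through as a primitive long, so a non-capturing 
function needs neither an iterator nor a helper object to hold the result.

```java
final class AccumulateSketch {
    // Hypothetical functional interface: takes a value and the running
    // total, returns the new running total.
    interface LongAccumulator<V> {
        long apply(V value, long accumulated);
    }

    // In-order walk over a simplified tree. Array for-each loops compile
    // to index loops, so no Iterator is allocated during the walk.
    @SuppressWarnings("unchecked")
    static <V> long accumulate(Object[] node, LongAccumulator<V> fn, long initial) {
        long acc = initial;
        for (Object o : node) {
            if (o instanceof Object[]) {
                acc = accumulate((Object[]) o, fn, acc); // descend into child
            } else {
                acc = fn.apply((V) o, acc);              // visit a value
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        Object[] tree = { new Object[] { "a", "bb" }, "ccc", new Object[] { "dddd" } };
        // Analogous in spirit to a dataSize computation: sum the sizes.
        long total = AccumulateSketch.<String>accumulate(tree, (s, acc) -> acc + s.length(), 0L);
        System.out.println(total);
    }
}
```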



