[jira] [Commented] (CASSANDRA-15389) Minimize BTree iterator allocations

Blake Eggleston (Jira) Wed, 04 Dec 2019 10:02:33 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988054#comment-16988054
 ]


Blake Eggleston commented on CASSANDRA-15389:
---------------------------------------------

I can see how that applies to hasComplex, no problem. What I’m having trouble 
seeing is how this would apply to hasComplexDeletion. 

Given a set of ComplexColumnData instances, I don’t think there’s any ordering 
criteria we can use to search for a complex deletion, is there? That would 
prevent a direct adaptation of `firstComplexIdx`, since it's just looking for 
the first index of something, and I don’t think we can avoid iteration. That 
would also preclude removing the stop condition, since there’s no point 
continuing once we’ve found a complex deletion.
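
To make the distinction concrete, here is a rough sketch of the two cases. It 
is not the patch code: `Col` is a hypothetical stand-in for column data, a 
sorted List stands in for the btree, and it assumes simple columns sort before 
complex ones. Under that assumption the first complex index can be found by 
binary search, while a complex deletion can sit at any position in the complex 
range, so finding one needs a scan with an early exit.

```java
import java.util.Arrays;
import java.util.List;

final class ComplexDeletionSketch {
    // Hypothetical stand-in for a row's column data; not Cassandra's ColumnData.
    static final class Col {
        final String name;
        final boolean complex;
        final boolean hasDeletion; // only meaningful when complex is true

        Col(String name, boolean complex, boolean hasDeletion) {
            this.name = name;
            this.complex = complex;
            this.hasDeletion = hasDeletion;
        }
    }

    // Binary search works here only because of the assumed ordering
    // (all simple columns sort before all complex columns).
    static int firstComplexIdx(List<Col> cols) {
        int lo = 0, hi = cols.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cols.get(mid).complex) hi = mid;
            else lo = mid + 1;
        }
        return lo; // equals cols.size() when there is no complex column
    }

    static boolean hasComplex(List<Col> cols) {
        return firstComplexIdx(cols) < cols.size();
    }

    // Deletions are not ordered within the complex range, so we have to
    // walk it. Walking from the end lets us stop at the first simple column.
    static boolean hasComplexDeletion(List<Col> cols) {
        for (int i = cols.size() - 1; i >= 0; i--) {
            Col c = cols.get(i);
            if (!c.complex) return false;  // reached the simple columns
            if (c.hasDeletion) return true; // stop condition: nothing more to learn
        }
        return false;
    }

    public static void main(String[] args) {
        List<Col> row = Arrays.asList(
            new Col("a", false, false),
            new Col("b", false, false),
            new Col("m", true, true),
            new Col("n", true, false));
        System.out.println(firstComplexIdx(row));    // 2
        System.out.println(hasComplexDeletion(row)); // true
    }
}
```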

Knowing the first complex index doesn’t help us without either 1) implementing 
slice functionality in accumulate, or 2) iterating through the complex cells 
with BTree#findByIndex, which would be much slower than just walking the tree.

Implementing slice (or findAndAccumulate) would avoid reverse iteration, but I 
don’t think that would make the api or implementation easier to understand.
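
For reference, a slice-style accumulate could look roughly like the sketch 
below. Everything in it is hypothetical: the `accumulate` signature, the 
`LongAccumulator` interface, the `STOP` sentinel and the `Col` class are made 
up for illustration, and a plain array stands in for the btree. It starts the 
walk at a given index and keeps the stop condition by treating one long value 
as a sentinel.

```java
final class SliceAccumulateSketch {
    // Hypothetical stand-in for column data; not a Cassandra class.
    static final class Col {
        final boolean complex;
        final boolean hasDeletion;

        Col(boolean complex, boolean hasDeletion) {
            this.complex = complex;
            this.hasDeletion = hasDeletion;
        }
    }

    // Takes the current long value and returns the next one.
    interface LongAccumulator<V> {
        long apply(V value, long current);
    }

    // Assumed sentinel: returning this value ends the walk early.
    static final long STOP = Long.MAX_VALUE;

    // Forward walk over items[fromIndex..], threading a long through fn.
    static <V> long accumulate(V[] items, int fromIndex, LongAccumulator<V> fn, long initial) {
        long value = initial;
        for (int i = fromIndex; i < items.length; i++) {
            value = fn.apply(items[i], value);
            if (value == STOP) break; // stop condition
        }
        return value;
    }

    // Forward scan of the complex range only; no reverse iteration needed.
    static boolean hasComplexDeletion(Col[] cols, int firstComplexIdx) {
        return accumulate(cols, firstComplexIdx, (c, v) -> c.hasDeletion ? STOP : v, 0L) == STOP;
    }
}
```

The lambda here captures nothing, so it should not need a per-call 
allocation, but the caller now has to compute the start index first and pass 
it in, which is the extra api surface I'm not sure is worth it.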

> Minimize BTree iterator allocations
> -----------------------------------
>
>                 Key: CASSANDRA-15389
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a large amount of garbage to the 
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods 
> (Row#digest, Row#validateData)
>  • adding a BTree accumulate method. Like the apply method, this method walks 
> the btree with a function, here one that takes and returns a long argument. 
> This eliminates iterator allocations without adding helper object allocations 
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, 
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, 
> UnfilteredSerializer#serializedRowBodySize), and it also eliminates the 
> allocation of helper objects in places where apply was used previously [1].
>  • creating a map of columns in SerializationHeader, which lets us avoid 
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%.
>  
> [1] the memory test does measure memory allocated by lambdas capturing objects



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

