[
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010900#comment-17010900
]
Benedict Elliott Smith commented on CASSANDRA-15389:
----------------------------------------------------
Thanks, the patch is looking really good. Just some minor suggestions to
consider.
AbstractRow
* If we introduce a {{BiFunction}} variant of {{apply}}, in which we can
supply the second argument, then we can go garbage-free for e.g. {{digest}} by
calling {{apply(ColumnData::digest, hasher)}}
BTreeRow
* {{hasComplexDeletion}} does not need the {{if (isSimple)}} branch anymore, I
think?
* {{hasComplexDeletion}} seems to be testing the wrong sentinel return value
from {{accumulate}}?
* Does {{HAS_INVALID_DELETIONS}} need to test {{cd != null}}?
Row
* {{collectStats}} could delegate to a new {{ComplexColumnData.accumulate}},
and pass itself as the parameter (though we would need to test {{!(cd
instanceof ComplexColumnData)}} instead of {{cd.column.isSimple()}}
* This might mean we can remove the {{hasCells()}} test, as we will be
garbage-free and can let {{accumulate}} make the right decision for us
BTree
* Having been playing with a lot of BTree methods recently, I personally find
it aids clarity to have totally separate bits of code implementing behavior for
leaves and branches, with a guard at-or-near the start of the method switching
behaviour. It could invoke a {{methodLeaf}} variant, or do it inline, but I
think it could make it somewhat easier to follow the logic in {{accumulate}}.
It’s only a small amount of code to duplicate, but the reader doesn’t have to
try and play out the switches while following the flow.
* I also find that when iterating branches, it is most clear to simply do one
key/child pair per loop, and a final (or initial) unrolled copy of a child
visit. Again, a small amount of duplication but I think easier for a reader to
digest (at least, I found this helped when reading my own methods). This
_should_ also lead to fewer executed instructions, and perhaps better branch
predictor behaviour (hard to say, many of them might be well predicted anyway).
nits:
* {{AbstractRow.validateData}} should probably use method reference for
clarity, i.e. {{apply(ColumnData::validate)}}
* {{BTreeRow.HAS_INVALID_DELETIONS}} can be declared using lambda syntax
* {{BTreeRow}} unused imports
* {{BTreeRow:345}} unused {{ALWAYS_TRUE}}
* {{Row}} unused imports
* {{Rows}} unused {{addColumn}}
* {{UnfilteredSerializer}} unused import
* {{SerializationHeader}} unused imports
* {{BTree}} unused import
> Minimize BTree iterator allocations
> -----------------------------------
>
> Key: CASSANDRA-15389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
> Project: Cassandra
> Issue Type: Sub-task
> Components: Local/Compaction
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Normal
> Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a lot amount of garbage to the
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
> • using Row#apply where appropriate on frequently called methods
> (Row#digest, Row#validateData
> • adding BTree accumulate method. Like the apply method, this method walks
> the btree with a function that takes and returns a long argument, this
> eliminates iterator allocations without adding helper object allocations
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize,
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats,
> UnfilteredSerializer#serializedRowBodySize) as well as eliminating the
> allocation of helper objects in places where apply was used previously^[1]^.
> • Create map of columns in SerializationHeader, this lets us avoid
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%
>
> [1] the memory test does measure memory allocated by lambdas capturing objects
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]