[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

Benedict (JIRA) Fri, 27 Dec 2013 14:58:39 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857819#comment-13857819
 ]


Benedict commented on CASSANDRA-6271:
-------------------------------------

bq. Suspicious that Stack.compareTo ignores forward unless stacks are identical 
up to min depth.
This is because if they're identical to a given depth, but a different depth, 
then one of them is in a child and another in a parent branch. This is due to a 
possibly poor decision (see the TODO in Stack.find()) wherein the iterator 
position in a branch is overloaded to cover both 'being in a child', and 
'visiting the key that bounds the child', and if you're iterating 
forward/backwards this overloading is slightly different (see the 
Stack.compareTo comment). 
As a result the meaning of being in the same position at the min depth is 
different depending on iteration order. Ideally this wouldn't be the case, but 
it didn't seem worth rewriting it to resolve.

bq. Confused about ML.copyFrom – we reset to EMPTY trees, then add elements 
from other sources. Is there a better set of names we can use?
We only reset to EMPTY if we're inserting a new node either above or below 
those that already exist in the btree we're modifying. Otherwise we're 
'copying' any data (or unaffected child trees) from the tree we're modifying. I 
agree the nomenclature isn't superb, but I couldn't think of anything that was 
definitely clearer. Possibly 'originalNode'.

bq. Unclear when ML.update returns null.
ML.update() returns null when it's successfully applied the update. i.e. a 
non-null return is always the ML necessary to apply update() to to make 
progress on the update(), and by repeatedly calling update on the result you 
will eventually succeed. The maybe confusing condition at the bottom of 
ML.update() that returns null is to handle the case where we 
update(POSITIVE_INFINITY) to complete a batch of updates, which by definition 
can never find a place to insert in the tree so effectively results in copying 
any remaining data from the tree we're modifying and ascending to the root of 
the new tree.

bq. What is more efficient about ML.build vs calling ML.update(EMPTY_LEAF)?
Just a lot fewer unnecessary method calls copying empty ranges. Algorithmically 
they're equivalent. I haven't benchmarked them, actually, so it's possible it's 
not dramatically different. My original design of update() required build, as 
it couldn't deal with inserting a collection larger than the tree, but that's 
no longer a limitation.

bq. Is there a minimum number of keys in a branch or leaf? Are branches/leaves 
ever combined? Put another way: are all leaf nodes at the same tree depth?
Normal btree rules apply, so minimum keys for both branches and leaves (except 
root) is FAN_FACTOR / 2, and all leaf nodes are at the same depth (see the 
isWellFormed child type check ensuring never a mix of leaf/branch children)


> Replace SnapTree in AtomicSortedColumns
> ---------------------------------------
>
>                 Key: CASSANDRA-6271
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>         Attachments: oprate.svg
>
>
> On the write path a huge percentage of time is spent in GC (>50% in my tests, 
> if accounting for slow down due to parallel marking). SnapTrees are both GC 
> unfriendly due to their structure and also very expensive to keep around - 
> each column name in AtomicSortedColumns uses > 100 bytes on average 
> (excluding the actual ByteBuffer).
> I suggest using a sorted array; changes are supplied at-once, as opposed to 
> one at a time, and if < 10% of the keys in the array change (and data equal 
> to < 10% of the size of the key array) we simply overlay a new array of 
> changes only over the top. Otherwise we rewrite the array. This method should 
> ensure much less GC overhead, and also save approximately 80% of the current 
> memory overhead.
> TreeMap is similarly difficult object for the GC, and a related task might be 
> to remove it where not strictly necessary, even though we don't keep them 
> hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
> be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

Reply via email to