[ https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857819#comment-13857819 ]
Benedict commented on CASSANDRA-6271: ------------------------------------- bq. Suspicious that Stack.compareTo ignores forward unless stacks are identical up to min depth. This is because if they're identical to a given depth, but a different depth, then one of them is in a child and another in a parent branch. This is due to a possibly poor decision (see the TODO in Stack.find()) wherein the iterator position in a branch is overloaded to cover both 'being in a child', and 'visiting the key that bounds the child', and if you're iterating forward/backwards this overloading is slightly different (see the Stack.compareTo comment). As a result the meaning of being in the same position at the min depth is different depending on iteration order. Ideally this wouldn't be the case, but it didn't seem worth rewriting it to resolve. bq. Confused about ML.copyFrom – we reset to EMPTY trees, then add elements from other sources. Is there a better set of names we can use? We only reset to EMPTY if we're inserting a new node either above or below those that already exist in the btree we're modifying. Otherwise we're 'copying' any data (or unaffected child trees) from the tree we're modifying. I agree the nomenclature isn't superb, but I couldn't think of anything that was definitely clearer. Possibly 'originalNode'. bq. Unclear when ML.update returns null. ML.update() returns null when it's successfully applied the update. i.e. a non-null return is always the ML necessary to apply update() to to make progress on the update(), and by repeatedly calling update on the result you will eventually succeed. The maybe confusing condition at the bottom of ML.update() that returns null is to handle the case where we update(POSITIVE_INFINITY) to complete a batch of updates, which by definition can never find a place to insert in the tree so effectively results in copying any remaining data from the tree we're modifying and ascending to the root of the new tree. bq. What is more efficient about ML.build vs calling ML.update(EMPTY_LEAF)? Just a lot fewer unnecessary method calls copying empty ranges. Algorithmically they're equivalent. I haven't benchmarked them, actually, so it's possible it's not dramatically different. My original design of update() required build, as it couldn't deal with inserting a collection larger than the tree, but that's no longer a limitation. bq. Is there a minimum number of keys in a branch or leaf? Are branches/leaves ever combined? Put another way: are all leaf nodes at the same tree depth? Normal btree rules apply, so minimum keys for both branches and leaves (except root) is FAN_FACTOR / 2, and all leaf nodes are at the same depth (see the isWellFormed child type check ensuring never a mix of leaf/branch children) > Replace SnapTree in AtomicSortedColumns > --------------------------------------- > > Key: CASSANDRA-6271 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6271 > Project: Cassandra > Issue Type: Improvement > Reporter: Benedict > Assignee: Benedict > Labels: performance > Attachments: oprate.svg > > > On the write path a huge percentage of time is spent in GC (>50% in my tests, > if accounting for slow down due to parallel marking). SnapTrees are both GC > unfriendly due to their structure and also very expensive to keep around - > each column name in AtomicSortedColumns uses > 100 bytes on average > (excluding the actual ByteBuffer). > I suggest using a sorted array; changes are supplied at-once, as opposed to > one at a time, and if < 10% of the keys in the array change (and data equal > to < 10% of the size of the key array) we simply overlay a new array of > changes only over the top. Otherwise we rewrite the array. This method should > ensure much less GC overhead, and also save approximately 80% of the current > memory overhead. > TreeMap is similarly difficult object for the GC, and a related task might be > to remove it where not strictly necessary, even though we don't keep them > hanging around for long. TreeMapBackedSortedColumns, for instance, seems to > be used in a lot of places where we could simply sort the columns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)