[ 
https://issues.apache.org/jira/browse/CASSANDRA-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024528#comment-15024528
 ] 

Branimir Lambov commented on CASSANDRA-9991:
--------------------------------------------

Thank you for the patch. You are right to question the worth of saving some 
work at the expense of code. I'd actually prefer to go the other way and 
simplify the code more.

It's not obvious why you want to have {{remove}} separated from 
{{removeFromLeft}} -- I can see that it is necessary to avoid copying / 
reorganizing when the key is not in the tree, but that could be stated in a 
comment to save some reader the time to figure it out. The same goes for the 
need for {{swap}} to walk the whole tree again, a comment would help a lot.

I'd extract [the predecessor loop in 
{{remove}}|https://github.com/apache/cassandra/compare/trunk...blambov:haaawk/9991#diff-fd77b44ecefe258afe8334dd2b4b63b7R42]
 to a method in {{BTree}}. Finding last element (using {{findByIndex(tree, 
size-1)}}) is also done in a few other places, you can point them to that 
method as well.

The ordering in 
[{{removeFromLeaf}}|https://github.com/apache/cassandra/compare/trunk...blambov:haaawk/9991#diff-fd77b44ecefe258afe8334dd2b4b63b7R82]
 is odd: We first look for the easy case, then switch to the most difficult 
one, than go back to an easier one (also repeating a check). Why not check if 
left is large enough first, if not check if right is large enough, if not do 
the join?

{{removeLeft / removeRight}} are very unreadable. Perhaps extract 
{{copyRemoving/copyInserting}} methods and call them?

[The newNextNode size 
calculation|https://github.com/apache/cassandra/compare/trunk...blambov:haaawk/9991#diff-fd77b44ecefe258afe8334dd2b4b63b7R211]
 in {{removeLeft/Right}} looks suspect; it only works because {{FAN_FACTOR}} is 
a power of 2. I think it would be much clearer if you enforced the invariant by 
using something like {{(nextKeyEnd + 1) | 1}}.

[{{copyBranchWithKeyAndChildRemoved}}|https://github.com/apache/cassandra/compare/trunk...blambov:haaawk/9991#diff-fd77b44ecefe258afe8334dd2b4b63b7R279]
 is only used with (and seems to only make sense with) {{keyIndex == 
childIndex}}, you can remove the other parameter.

{{System.arraycopy}} can deal with length 0, remove the unnecessary checks [in 
{{copyKeys/Children}}|https://github.com/apache/cassandra/compare/trunk...blambov:haaawk/9991#diff-fd77b44ecefe258afe8334dd2b4b63b7R330].

The test coverage is great, but I'd like to have a more exhaustive random test 
as well -- something like removing everything in random order from a large 
{{BTree}}. I also noticed {{isWellFormed}} does not check branch node size, 
could you add that too?

Going further, the {{BTree}} infrastructure includes methods that deal in 
element indices. You can use {{findIndex}} instead of the code in {{remove}} 
and then do the removal walk using indices, which will replace a few of the 
binary searches in {{removeFromLeaf}} with integer ones. Since you are then 
certain that you need to remove an element, it would be easy to integrate the 
predecessor swap into the removal pass.

I'm curious if you tried the recursive deletion with bottom-up reorganization 
alternative? That looks like it would be simpler if we are okay with a few 
extra allocations and copying, which we may be as it shouldn't happen too 
often. What do you think?

> Implement efficient btree removal
> ---------------------------------
>
>                 Key: CASSANDRA-9991
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9991
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Benedict
>              Labels: patch
>             Fix For: 3.x
>
>         Attachments: trunk-9991.txt
>
>
> Currently removal is implemented as a reconstruction by filtering and 
> iterator over the original btree. This could be much more efficient, editing 
> just the necessary nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to