[ 
https://issues.apache.org/jira/browse/CASSANDRA-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369940#comment-14369940
 ] 

Benedict commented on CASSANDRA-8988:
-------------------------------------

I've pushed an update with your first two comments addressed.

As to the potential regression, let's discuss it a little first. I must admit 
I'm finding it a little difficult to model mentally - interval trees aren't one 
of my fortes. I'm not wed to the change I've made, but it seems to me it is an 
optimisation. AFAICT there should be up to 2.lg(N) invocations of these 
unidirectional paths, so by removing the preliminary test we remove O(lg(N)) 
comparisons. Now only two of these comparisons can fail, the others are wasted. 
They each permit saving one final scan. This scan was previously O(M) in 
#intersections, but is now O(lg(M)) down to your suggestion, so we're trading 
O(lg(N)) and getting O(lg(M)).

This result also gives that these unidirectional paths are now O(lg(N).lg(M)) 
in cost, when previously they were O(lg(N).M), so the lg(N) and lg(M) costs we 
are trading are algorithmically not very important. But any downside risk 
occurs only when M is large, and we've reduced the algorithmically complexity 
here to save us more than we lose. _Typically_ N should be greater than M, 
though, so we're saving here too - except in situations where the entire 
contents are essentially overlapped, in which case most of the M will occur in 
a node that spans a large enough range that it answers all queries without 
hitting one of the unidirectional branches (at least, the main body of M will 
not be present in one of the branches). 

So the only situation we are risking is a very weird border case where there is 
a huge M node high in the tree that is reached by directly crossing the search 
interval boundary. Every other situation we're improving, I think?


> Optimise IntervalTree
> ---------------------
>
>                 Key: CASSANDRA-8988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8988
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Trivial
>             Fix For: 2.1.4
>
>         Attachments: 8988.txt
>
>
> We perform a lot of unnecessary comparisons in 
> IntervalTree.IntervalNode.searchInternal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to