[ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700568#comment-13700568
 ] 

Sylvain Lebresne commented on CASSANDRA-5677:
---------------------------------------------

I agree, I also don't like the idea of adding a switch too much. We just have 
to decide if the performance problem is worth the risk. I'm personally slightly 
leaning towards yes (that it's probably worth the risk) because:
# I've seen at least 3 or 4 recent reports of someone having problem with this 
on the mailing list (sometimes in the context of collections, sometimes not). 
CASSANDRA-5466 is also almost surely a duplicate of this. I'd rather avoid 
having the "avoid collections, they are too inefficient" idea stick too much.
# The chances to break something for people that don't have range tombstones is 
fairly small (it's never 0 of course, but the main change is really the backing 
implementation for range tombstones). And if you do use range tombstones, the 
current performance is so bad as soon as you have a bit too much of them that 
the change is high it'll become a blocker for you.

But I do admit that this kind of change is a bit bigger than what I'd like to 
do in a 1.2.7 release in a perfect world.

In any case, I'll attached a rebased version for 1.2 soonish. Even if we decide 
against committing it, it can't hurt to have it around.
                
> Performance improvements of RangeTombstones/IntervalTree
> --------------------------------------------------------
>
>                 Key: CASSANDRA-5677
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Fabien Rousseau
>            Priority: Minor
>         Attachments: 5677-new-IntervalTree-implementation.patch
>
>
> Using massively range tombstones leads to bad response time (ie 100-500 
> ranges tombstones per row).
> After investigation, it seems that the culprit is how the DeletionInfo are 
> merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
> IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
> 100 instances of IntervalTree are created, the first one having one interval, 
> the second one 2 intervals,... the 100th one : 100 intervals...)
> It seems that once the IntervalTree is built, it is not possible to add a new 
> Interval. Idea is to change the implementation of the IntervalTree by another 
> one which support "insert interval".
> Attached is a proposed patch which :
>  - renames the IntervalTree implementation to IntervalTreeCentered (the 
> renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
>  - adds a new implementation IntervalTreeAvl (which is described here : 
> http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
> http://en.wikipedia.org/wiki/AVL_tree )
>  - adds a new interface IIntervalTree to abstract the implementation
>  - adds a new configuration option (interval_tree_provider) which allows to 
> choose between the two implementations (defaults to previous 
> IntervalTreeCentered)
>  - updates IntervalTreeTest unit tests to test both implementations
>  - creates a mini benchmark between the two implementations (tree creation, 
> point lookup, interval lookup)
>  - creates a mini benchmark between the two implementations when merging 
> DeletionInfo (which shows a big performance improvement when using 500 
> tombstones for a row)
> This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to