[ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700093#comment-13700093
 ] 

Fabien Rousseau commented on CASSANDRA-5677:
--------------------------------------------

I had a quick look at the patch (I'll take more time to review it next week), 
and being faster is really great. Keeping only the latest range tombstone (In 
fact, in our use case, we often overwrite range tombstones) was something I 
also add in mind but kept it for later optimization : I wrongly assumed that 
they were kept for a real reason (like repair for example).

I definitely think your approach is better and performance numbers confirms it.
I really think that this patch should be available for 1.2 (either as a patch 
to apply, or directly in 1.2.X).
If in 1.2.X, an option could be added in cassandra.yaml file to switch of 
implementations (just rapidly checked and seen that ondisk format seems 
compatible...)
In 1.2.X : make the current implementation the default to avoid introducing too 
many changes (but users having performance trouble could still switch after 
doing some tests. Also note that it should be possible to log something if more 
than X range tombstones are read and that it is advised to switch of 
implementation...)
In 2.0 : make the RangeTombstoneList the default
(It is just raw ideas...)

I can try to rebase your patch for 1.2 next week if you're interested...
                
> Performance improvements of RangeTombstones/IntervalTree
> --------------------------------------------------------
>
>                 Key: CASSANDRA-5677
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Fabien Rousseau
>            Priority: Minor
>         Attachments: 5677-new-IntervalTree-implementation.patch
>
>
> Using massively range tombstones leads to bad response time (ie 100-500 
> ranges tombstones per row).
> After investigation, it seems that the culprit is how the DeletionInfo are 
> merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
> IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
> 100 instances of IntervalTree are created, the first one having one interval, 
> the second one 2 intervals,... the 100th one : 100 intervals...)
> It seems that once the IntervalTree is built, it is not possible to add a new 
> Interval. Idea is to change the implementation of the IntervalTree by another 
> one which support "insert interval".
> Attached is a proposed patch which :
>  - renames the IntervalTree implementation to IntervalTreeCentered (the 
> renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
>  - adds a new implementation IntervalTreeAvl (which is described here : 
> http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
> http://en.wikipedia.org/wiki/AVL_tree )
>  - adds a new interface IIntervalTree to abstract the implementation
>  - adds a new configuration option (interval_tree_provider) which allows to 
> choose between the two implementations (defaults to previous 
> IntervalTreeCentered)
>  - updates IntervalTreeTest unit tests to test both implementations
>  - creates a mini benchmark between the two implementations (tree creation, 
> point lookup, interval lookup)
>  - creates a mini benchmark between the two implementations when merging 
> DeletionInfo (which shows a big performance improvement when using 500 
> tombstones for a row)
> This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to