[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974574#comment-16974574
 ] 

Chandrasekhar Thumuluru commented on CASSANDRA-15397:
-----------------------------------------------------

[~benedict] —
 * Attached the patch file with my changes 
[^replace_intervaltree_with_intervallist.patch]. I used trunk for the patch.
 * Attached the unit test run results [^TESTS-TestSuites.xml.lz4] .
 * 
[Link|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
 to google doc with some details about the proposed algorithm.

Please review the changes when you get a chance and let me know your feedback.

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15397
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: Chandrasekhar Thumuluru
>            Assignee: Chandrasekhar Thumuluru
>            Priority: Low
>         Attachments: 95p_10000_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_20000_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_30000_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_10000_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_20000_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_30000_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_10000_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_20000_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_30000_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of test, I simplified the IntervalTree by removing the data 
> portion of the interval.  Modified the template version (Java generics) to a 
> specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to