[
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chandrasekhar Thumuluru updated CASSANDRA-15397:
------------------------------------------------
Attachment: 90p_1million_sstables_with_1000_searches.png
Mean_1million_sstables_with_1000_searches.png
95p_1million_sstables_with_1000_searches.png
99p_1million_sstables_with_1000_searches.png
90p_750k_sstables_with_1000_searches.png
95p_750k_sstables_with_1000_searches.png
99p_750k_sstables_with_1000_searches.png
Mean_750k_sstables_with_1000_searches.png
90p_500k_sstables_with_1000_searches.png
95p_500k_sstables_with_1000_searches.png
99p_500k_sstables_with_1000_searches.png
Mean_500k_sstables_with_1000_searches.png
90p_250k_sstables_with_1000_searches.png
95p_250k_sstables_with_1000_searches.png
99p_250k_sstables_with_1000_searches.png
Mean_250k_sstables_with_1000_searches.png
90p_100k_sstables_with_1000_searches.png
95p_100k_sstables_with_1000_searches.png
99p_100k_sstables_with_1000_searches.png
Mean_100k_sstables_with_1000_searches.png
> IntervalTree performance comparison with Linear Walk and Binary Search based
> Elimination.
> ------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Chandrasekhar Thumuluru
> Assignee: Chandrasekhar Thumuluru
> Priority: Low
> Labels: pull-request-available
> Attachments: 90p_100k_sstables_with_1000_searches.png,
> 90p_1million_sstables_with_1000_searches.png,
> 90p_250k_sstables_with_1000_searches.png,
> 90p_500k_sstables_with_1000_searches.png,
> 90p_750k_sstables_with_1000_searches.png,
> 95p_10000_SSTable_with_5000_Searches.png,
> 95p_100k_sstables_with_1000_searches.png,
> 95p_15000_SSTable_with_5000_Searches.png,
> 95p_1million_sstables_with_1000_searches.png,
> 95p_20000_SSTable_with_5000_Searches.png,
> 95p_25000_SSTable_with_5000_Searches.png,
> 95p_250k_sstables_with_1000_searches.png,
> 95p_30000_SSTable_with_5000_Searches.png,
> 95p_5000_SSTable_with_5000_Searches.png,
> 95p_500k_sstables_with_1000_searches.png,
> 95p_750k_sstables_with_1000_searches.png,
> 99p_10000_SSTable_with_5000_Searches.png,
> 99p_100k_sstables_with_1000_searches.png,
> 99p_15000_SSTable_with_5000_Searches.png,
> 99p_1million_sstables_with_1000_searches.png,
> 99p_20000_SSTable_with_5000_Searches.png,
> 99p_25000_SSTable_with_5000_Searches.png,
> 99p_250k_sstables_with_1000_searches.png,
> 99p_30000_SSTable_with_5000_Searches.png,
> 99p_5000_SSTable_with_5000_Searches.png,
> 99p_500k_sstables_with_1000_searches.png,
> 99p_750k_sstables_with_1000_searches.png, IntervalList.java,
> IntervalListWithElimination.java, IntervalTreeSimplified.java,
> Mean_10000_SSTable_with_5000_Searches.png,
> Mean_100k_sstables_with_1000_searches.png,
> Mean_15000_SSTable_with_5000_Searches.png,
> Mean_1million_sstables_with_1000_searches.png,
> Mean_20000_SSTable_with_5000_Searches.png,
> Mean_25000_SSTable_with_5000_Searches.png,
> Mean_250k_sstables_with_1000_searches.png,
> Mean_30000_SSTable_with_5000_Searches.png,
> Mean_5000_SSTable_with_5000_Searches.png,
> Mean_500k_sstables_with_1000_searches.png,
> Mean_750k_sstables_with_1000_searches.png, TESTS-TestSuites.xml.lz4,
> replace_intervaltree_with_intervallist.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with
> search interval. In Cassandra, IntervalTrees are not mutated. They are
> recreated each time a mutation is required. This can be an issue during
> repairs. In fact we noticed such issues during repair.
> Since lists are cache friendly compared to linked lists and trees, I decided
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start
> and end points of search interval).
> Based on the tests I ran, I noticed Binary Search based elimination almost
> always performs similar to IntervalTree or out performs IntervalTree based
> search. The cost of IntervalTree construction is also substantial and
> produces lot of garbage during repairs.
> I ran the tests using random intervals to build the tree/lists and another
> randomly generated search interval with 5000 iterations. I'm attaching all
> the relevant graphs. The x-axis in the graphs is the search interval
> coverage. 10p means the search interval covered 10% of the intervals. The
> y-axis is the time the search took in nanos.
> PS:
> # For the purpose of test, I simplified the IntervalTree by removing the data
> portion of the interval. Modified the template version (Java generics) to a
> specialized version.
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]