[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2020-03-09 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055108#comment-17055108
 ] 

Chandrasekhar Thumuluru edited comment on CASSANDRA-15397 at 3/9/20, 3:50 PM:
--

{quote}
I'm not sure if assuming long will be a good idea.
{quote}
I meant in the context of generics and not about the performance.  I'll make 
necessary changes, compare it again and post the results. 


was (Author: cthumuluru):
{quote}
I'm not sure if assuming long will be a good idea.
{quote}
I meant in the context of generics and about the performance.  

I'll make necessary changes, compare it again and post the results. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
>  Labels: pull-request-available
> Attachments: 90p_100k_sstables_with_1000_searches.png, 
> 90p_1million_sstables_with_1000_searches.png, 
> 90p_250k_sstables_with_1000_searches.png, 
> 90p_500k_sstables_with_1000_searches.png, 
> 90p_750k_sstables_with_1000_searches.png, 
> 95p_1_SSTable_with_5000_Searches.png, 
> 95p_100k_sstables_with_1000_searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_1million_sstables_with_1000_searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_250k_sstables_with_1000_searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 95p_500k_sstables_with_1000_searches.png, 
> 95p_750k_sstables_with_1000_searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_100k_sstables_with_1000_searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_1million_sstables_with_1000_searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_250k_sstables_with_1000_searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, 
> 99p_500k_sstables_with_1000_searches.png, 
> 99p_750k_sstables_with_1000_searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_100k_sstables_with_1000_searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_1million_sstables_with_1000_searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_250k_sstables_with_1000_searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, 
> Mean_500k_sstables_with_1000_searches.png, 
> Mean_750k_sstables_with_1000_searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of test, I simplified the IntervalTree by removing the data 
> portion of the interval.  Modified the template version (Java generics) to a 
> specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 

[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2020-03-09 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17054782#comment-17054782
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15397 at 3/9/20, 9:18 AM:
-

bq. I couldn't do so since the code uses a generic that's comparable

The {{IntervalTree}} is used in precisely one place in the codebase, so it 
would be possible to hardcode to this use case for improved performance.

bq.  I'm not sure if assuming long will be a good idea

I would be very surprised if it is not significantly faster.  Particularly in 
tests that correctly account for memory latency (i.e. ensure the data is not 
entirely held in CPU cache before the test begins).


was (Author: benedict):
bq. I couldn't do so since the code uses a generic that's comparable

The {{IntervalTree}} is used in precisely one place in the codebase, so it 
would be possible to hardcode to this use case for improved performance.

bq.  I'm not sure if assuming long will be a good idea

I would be very surprised if it is not significantly faster.

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
>  Labels: pull-request-available
> Attachments: 90p_100k_sstables_with_1000_searches.png, 
> 90p_1million_sstables_with_1000_searches.png, 
> 90p_250k_sstables_with_1000_searches.png, 
> 90p_500k_sstables_with_1000_searches.png, 
> 90p_750k_sstables_with_1000_searches.png, 
> 95p_1_SSTable_with_5000_Searches.png, 
> 95p_100k_sstables_with_1000_searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_1million_sstables_with_1000_searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_250k_sstables_with_1000_searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 95p_500k_sstables_with_1000_searches.png, 
> 95p_750k_sstables_with_1000_searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_100k_sstables_with_1000_searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_1million_sstables_with_1000_searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_250k_sstables_with_1000_searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, 
> 99p_500k_sstables_with_1000_searches.png, 
> 99p_750k_sstables_with_1000_searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_100k_sstables_with_1000_searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_1million_sstables_with_1000_searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_250k_sstables_with_1000_searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, 
> Mean_500k_sstables_with_1000_searches.png, 
> Mean_750k_sstables_with_1000_searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of test, I simplified the IntervalTree by removing the data 
> portion of the 

[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-20 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001037#comment-17001037
 ] 

Chandrasekhar Thumuluru edited comment on CASSANDRA-15397 at 12/20/19 5:50 PM:
---

[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 


was (Author: cthumuluru):
[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code reference in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
>  Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree 

[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-20 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001037#comment-17001037
 ] 

Chandrasekhar Thumuluru edited comment on CASSANDRA-15397 at 12/20/19 5:50 PM:
---

[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even if the 
assumption turns out to be wrong. 


was (Author: cthumuluru):
[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
>  Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree 

[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-19 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17000505#comment-17000505
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15397 at 12/20/19 12:21 AM:
---

The code looks pretty simple and clean, though I will need to look more 
carefully before we consider merging. We would want to rename the class, since 
it's no longer a tree, and we would probably want to avoid the extra work of 
going via streams (which looks like it would allocate O(\n) extra data).

As it stands, I would probably want to see more performance comparisons, 
particularly out to more outlandish numbers of sstables (at least one million), 
and also for excessively skewed distributions. Since the benefits shown in your 
graphs are modest in absolute terms, and the potential algorithmic harms of a 
linear scan could have significant downside risk. So we need to be sure we have 
properly established what that might be.

There are some potential simple improvements to this approach that would make 
it more desirable: instead of maintaining two lists of interval objects, we 
could maintain four {{long[]}}; two matches pairs of {{long[]}} each 
representing the two current sorted lists. One would be used for binary search, 
the other for the linear scan. This should dramatically improve the constant 
factors, only needing to support post-filtering for e.g. {{RandomPartitioner}}, 
as we would only be able to filter on a prefix for those tokens > 8 bytes.

Although this approach would require slightly more involved modifications, and 
a bit more verification, the win should be much more pronounced.  Is this 
something you'd be willing to try?


was (Author: benedict):
The code looks pretty simple and clean, though I will need to look more 
carefully before we consider merging. We would want to rename the class, since 
it's no longer a tree, and we would probably want to avoid the extra work of 
going via streams (which looks like it would allocate O(n) extra data).

As it stands, I would probably want to see more performance comparisons, 
particularly out to more outlandish numbers of sstables (at least one million), 
and also for excessively skewed distributions. Since the benefits shown in your 
graphs are modest in absolute terms, and the potential algorithmic harms of a 
linear scan could have significant downside risk. So we need to be sure we have 
properly established what that might be.

There are some potential simple improvements to this approach that would make 
it more desirable: instead of maintaining two lists of interval objects, we 
could maintain four {{long[]}}; two matches pairs of {{long[]}} each 
representing the two current sorted lists. One would be used for binary search, 
the other for the linear scan. This should dramatically improve the constant 
factors, only needing to support post-filtering for e.g. {{RandomPartitioner}}, 
as we would only be able to filter on a prefix for those tokens > 8 bytes.

Although this approach would require slightly more involved modifications, and 
a bit more verification, the win should be much more pronounced.  Is this 
something you'd be willing to try?

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
>  Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap