Dominic Letz created CASSANDRA-8546:
---------------------------------------

             Summary: RangeTombstoneList becoming bottleneck on tombstone heavy 
tasks
                 Key: CASSANDRA-8546
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8546
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
         Environment: 2.0.11 / 2.1
            Reporter: Dominic Letz
            Priority: Critical
             Fix For: 2.0.12
         Attachments: tombstone_test.tgz

I would like to propose a change of the data structure used in the 
RangeTombstoneList to store and insert tombstone ranges to something with at 
least O(log N) insert in the middle and at near O(1) and start AND end. Here is 
why:

When having tombstone heavy work-loads the current implementation of 
RangeTombstoneList becomes a bottleneck with slice queries.

Scanning the number of tombstones up to the default maximum (100k) can take up 
to 3 minutes of how addInternal() scales on insertion of middle and start 
elements.

The attached test shows that with 50k deletes from both sides of a range.

INSERT 1...110000
flush()
DELETE 1...50000
DELETE 110000...60000

While one direction performs ok (~400ms on my notebook):
{code}
SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
{code}

The other direction underperforms (~7seconds on my notebook)
{code}
SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to