GitHub user nziebart opened a pull request:

    https://github.com/apache/cassandra/pull/170

    SliceQueryFilter can fetch more data than needed

    Slice queries currently fetch more data than necessary when there is only 
one column that is not part of the primary key. Specifically, SliceQueryFilter 
does not stop reading until it has seen `limit + 1` live cells, even though in 
this case we can stop after seeing `limit` live cells.
    
    We have a use case where we use wide rows to implement versioning, by 
including a timestamp as part of the primary key. Every once in a while, we 
"garbage collect" old versions by deleting them. This results in a single 
column containing the latest version, followed by many tombstones.
    
    We use a `LIMIT 1` query to select the latest version (which is the first 
column in the row). However, because SliceQueryFilter does not stop until it 
has seen `limit + 1` live cells, we have to read all the tombstones following 
the single live cell. Furthermore, if these tombstones are covering data in 
other sstables, we have to read all the corresponding data when merging the 
sstable iterators. This can be a massive performance hit, and is unexpectedly 
caused by deleting data.
    
    This patch allows the `ColumnCounter` implementation to decide when it has 
seen enough cells. For counters that don't require grouping, we can stop 
as soon as `limit` live cells have been seen, without looking at the next cell.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nziebart/cassandra nziebart/limit-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cassandra/pull/170.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #170
    
----
commit 52d8a5c0e846b357f90e7e02b8903094447b9945
Author: Nathan Ziebart <[email protected]>
Date:   2017-10-06T23:48:03Z

    dont want to see next cell when not grouping

----

