Re: clarification of token() in CQL3

Keith Freeman Tue, 06 Aug 2013 08:57:04 -0700

Ok, I get that, I'll have to find another way to sort out new rows.

Your description makes me think that if new rows are added during the paging (i.e. between one select with token()'s and another), they might show up in the query results, right? (because the hash of the new row keys might fall sequentially after token(last_processed_row))


On 08/06/2013 08:18 AM, Richard Low wrote:

On 6 August 2013 15:12, Keith Freeman wrote:
    I've seen in several places the advice to use queries like this to
    page through lots of rows:
    select id from mytable where token(id) > token(last_id)
    But it's hard to find detailed information about how this works
    (at least that I can understand -- the description in the
    Cassandra manual is pretty brief).

    One thing I'd like to know is if new rows are always guaranteed to
    have token(new_id) > token(ids-of-all-previous-rows)?  E.g. if I
    have one process that adds rows to a table, and another that
    processes rows from the table, can the "processor" save the id of
    the last row processed and when he wakes up use:

        select * from mytable where token(id) > token(last_processed_id)


    to process only new rows?  Will this always work to get only new rows?
No, unfortunately not. The tokens are generated by the partitioner - they are the hash of the row key. New tokens could be anywhere in the range of tokens so you can't use token ordering to find new rows.
The query you suggest works to page through all the data in your column family. Rows will be returned regardless of when they were added (as long as they were added before the query started). Finding rows that have been added since a certain time is hard in Cassandra since they are stored in token order. In general you have to read through all the data and work out from e.g. a date field if they should be treated as new.
Richard.
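
For anyone who wants to see the point above in action, here is a rough Python sketch that simulates it without a cluster. It rests on assumptions: token() here is just the first 8 bytes of an MD5 digest standing in for the partitioner's hash (the real partitioner may use a different hash), and select_page() and page_all() are made-up helpers that mimic "select id from mytable where token(id) > token(last_id)" with a limit. It is only meant to show that paging by token visits every row in token order, and that rows inserted later can land on either side of a saved position.

```python
import hashlib


def token(row_key: str) -> int:
    """Stand-in for the partitioner: a signed 64-bit hash of the row key.

    Assumption: the first 8 bytes of MD5 are used purely for illustration.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def select_page(table, last_key=None, limit=3):
    """Mimic: select id from mytable where token(id) > token(last_key) limit N."""
    ordered = sorted(table, key=token)  # rows are stored in token order
    if last_key is not None:
        ordered = [k for k in ordered if token(k) > token(last_key)]
    return ordered[:limit]


def page_all(table, limit=3):
    """Page through the whole table, using the last key of each page as cursor."""
    seen, last = [], None
    while True:
        page = select_page(table, last, limit)
        if not page:
            return seen
        seen.extend(page)
        last = page[-1]


# Ten existing rows, keyed row-0 .. row-9.
table = {f"row-{i}" for i in range(10)}
all_rows = page_all(table)
print("paged order:", all_rows)  # token order, not insertion order

# Save a cursor partway through, then insert new rows.
cursor = all_rows[4]
new_keys = [f"new-{i}" for i in range(20)]
before = [k for k in new_keys if token(k) < token(cursor)]
after = [k for k in new_keys if token(k) > token(cursor)]
print("new rows with token below the cursor (missed):", len(before))
print("new rows with token above the cursor (returned):", len(after))

# Resuming from the cursor returns only the new rows hashed after it,
# mixed in with the old rows that had not been read yet.
resumed = select_page(table | set(new_keys), cursor, limit=100)
```

Which new rows fall before or after the cursor depends only on their hashes, so a saved token position cannot separate old rows from new ones. A timestamp column checked while reading, as suggested above, is still needed for that.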
