Read http://www.datastax.com/dev/blog/cql3-table-support-in-hadoop-pig-and-hive
And look at http://fossies.org/dox/apache-cassandra-1.2.8-src/CqlPagingRecordReader_8java_source.html - Jon On Fri, Aug 16, 2013 at 12:08 PM, Keith Freeman <[email protected]> wrote: > I've run into the same problem, surprised nobody's responded to you. Any > time someone asks "how do I page through all the rows of a table in CQL3?", > the standard answer is token() and limit. But as you point out, this > method will often miss some data from wide rows. > > Maybe a Cassandra expert will chime in if we're wrong. > > Your suggestion is possible if you know how to find the previous value of > 'name' field (and are willing to filter out repeated rows), but wouldn't > that be difficult/impossible with some keys? So then, is there a way to do > paging queries that get ALL of the rows, even in wide rows? > > > > On 08/13/2013 02:46 PM, Jan Algermissen wrote: > >> HI, >> >> ok, so I found token() [1], and that it is an option for paging through >> randomly partitioned data. >> >> I take it that combining token() and LIMIT is the CQL3 idiom for paging >> (set aside the fact that one shouldn't raelly want to page and use C*) >> >> Now, when I page through a CF with wide rows, limitting each 'page' to, >> for example, 100 I end up in situations where not all 'sub'rows that have >> the same result for token() are returned because LIMIT chops off the result >> after 100 'sub'rows, not neccessarily at the boundary to the next wide row. >> >> Obvious ... but inconvenient. >> >> The solution would be to throw away the last token returned (because it's >> wide row could have been chopped off) and do the next query with the token >> before. >> >> So instead of doing >> >> SELECT * FROM users WHERE token(name) > >> token(last-name-of-prev-**result) >> LIMIT 100; >> >> I'd be doing >> >> SELECT * FROM users WHERE token(name) > >> token(one-befoe-the-last-name-**of-prev-result) LIMIT 100; >> >> >> Question: Is that what I have to do or is there a way to make token() and >> limit work together to return complete wide rows? >> >> >> Jan >> >> >> >> [1] token() and how it relates to paging is actually quite hard to grasp >> from the docs. >> > >
