Re: Offsets and Range Queries

2012-11-16 Thread aaron morton
> I assume it's because of iterators in read-time, which go over results do
> merging/reducing/collating results one-by-one that is not so well suited for
> jumping to arbitrary offsets, given the practically huge number of columns
> involved, right?
Not really. You can have a slice that starts in the middle of a row of 10
million columns just by using a start column.

Having a slice operation that is constrained in size improves the overall
throughput of the server and reduces JVM GC churn in the server.
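
A minimal sketch of the idea, using an in-memory sorted list as a stand-in for a wide row. This is not Cassandra code: `WideRow` and `get_slice` are hypothetical names, and the point is only that a sorted structure can seek straight to a start column and return a bounded number of columns without scanning from the beginning.

```python
from bisect import bisect_left


class WideRow:
    """Hypothetical in-memory model of one wide row: columns sorted by name."""

    def __init__(self, columns):
        items = sorted(columns.items())
        self.names = [name for name, _ in items]
        self.values = [value for _, value in items]

    def get_slice(self, start, count):
        """Return at most `count` (name, value) pairs with name >= start."""
        # Binary search finds the start position in O(log n);
        # nothing before `start` is touched.
        i = bisect_left(self.names, start)
        return list(zip(self.names[i:i + count], self.values[i:i + count]))


# A row with many columns (100,000 here to keep the example quick).
row = WideRow({f"col{i:08d}": i for i in range(100_000)})

# Start in the middle of the row and read a small, fixed-size slice.
page = row.get_slice("col00050000", 3)
```

Because `count` caps the result, each call does a bounded amount of work and allocates a bounded amount of memory, which is the property the size-constrained slice is after.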

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com




Re: Offsets and Range Queries

2012-11-15 Thread Edward Capriolo
There are several reasons. First, there is no absolute offset. The
rows are sorted by the data. If someone inserts new data between your
previous query and this query, the rows have changed.

Unless you are doing select queries inside a transaction with repeatable
read, and your database supports this, the query you mention does not
really have absolute offsets either. The results of the query can
change between reads.

In Cassandra we do not execute large queries (that might result in
temp tables or whatever) and allow you to page them. Slices have a
fixed size; this ensures that the query does not execute for
arbitrary lengths of time.
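
To illustrate the first point, here is a small sketch on a plain sorted Python list (an assumption for illustration, not a Cassandra client). The helper names `page_by_offset` and `page_after_key` are hypothetical. It shows how an insert between two page reads shifts an offset-based page, while a page that starts after the last key seen is unaffected.

```python
from bisect import bisect_right, insort

PAGE = 3
rows = ["apple", "cherry", "fig", "kiwi", "mango", "peach"]  # sorted by key


def page_by_offset(rows, offset, count):
    """SQL-style LIMIT offset,count over the current contents."""
    return rows[offset:offset + count]


def page_after_key(rows, last_key, count):
    """Start-key style: return the next `count` rows after `last_key`."""
    start = 0 if last_key is None else bisect_right(rows, last_key)
    return rows[start:start + count]


first = page_by_offset(rows, 0, PAGE)           # apple, cherry, fig

# Someone inserts a row that sorts before the end of page one.
insort(rows, "banana")

by_offset = page_by_offset(rows, PAGE, PAGE)    # "fig" shows up a second time
by_key = page_after_key(rows, first[-1], PAGE)  # continues cleanly after "fig"
```

The offset version repeats `fig` because every row after the insert point moved down by one; the start-key version does not depend on positions at all.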


On Thu, Nov 15, 2012 at 6:39 AM, Ravikumar Govindarajan
ravikumar.govindara...@gmail.com wrote:
> Usually we do a SELECT * FROM  ORDER BY  LIMIT 26,25 for pagination
> purposes, but specifying an offset is not available for range queries in
> Cassandra.
>
> I always have to specify a start-key to achieve this. Are there reasons for
> choosing such an approach rather than providing an absolute offset?
>
> --
> Ravi


Re: Offsets and Range Queries

2012-11-15 Thread Ravikumar Govindarajan
Thanks, Ed, for the clarifications.

Yes, you are correct that the apps have to handle repeatable reads, and not
the databases themselves, when using absolute offsets, but SQL databases do
provide such an option at the app's peril!

> Slices have a fixed size, this ensures that the query does not
> execute for arbitrary lengths of time.

I assume it's because of the iterators at read time, which go over the
results merging/reducing/collating them one by one and so are not well
suited for jumping to arbitrary offsets, given the practically huge number
of columns involved, right? Did I understand it correctly?

We are now faced with persisting each page with both its first and last key
for prev/next navigation. The problem quickly gets complex when we have to
support multiple pages per user. I just wanted to know if there are any
known workarounds for this.
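
For reference, the prev/next scheme described above can be sketched like this on a sorted in-memory key list (an assumption for illustration; `next_page` and `prev_page` are hypothetical helpers, not a Cassandra API). Only the first and last key of the page currently shown are needed, so the client could carry them in the request instead of the server persisting them per user.

```python
from bisect import bisect_left, bisect_right


def next_page(keys, after, count):
    """Keys strictly after `after` (or from the beginning if `after` is None)."""
    start = 0 if after is None else bisect_right(keys, after)
    return keys[start:start + count]


def prev_page(keys, before, count):
    """Up to `count` keys strictly before `before`, in ascending order."""
    end = bisect_left(keys, before)
    return keys[max(0, end - count):end]


keys = [f"k{i:02d}" for i in range(10)]  # k00 .. k09, already sorted

p1 = next_page(keys, None, 3)     # first page
p2 = next_page(keys, p1[-1], 3)   # "next": start after last key of p1
back = prev_page(keys, p2[0], 3)  # "prev": end before first key of p2
```

Whether this removes the need for server-side page state depends on the application; it is one possible approach, not a recommendation from this thread.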

--
Ravi
