Re: Offsets and Range Queries
On 16/11/2012, at 7:02 PM, Ravikumar Govindarajan ravikumar.govindara...@gmail.com wrote:

> I assume it's because of iterators at read time, which go over results, merging/reducing/collating them one by one, and which are not so well suited for jumping to arbitrary offsets, given the practically huge number of columns involved, right?

Not really. You can have a slice that starts in the middle of a row of 10 million columns just by using a start column.

Having a slice operation that is constrained in size improves the overall throughput of the server and reduces the (JVM) GC churn in the server.

Cheers

Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
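A minimal sketch of the start-column idea, assuming a wide row can be modelled as a sorted in-memory list of column names. The helper `slice_from` and the column naming scheme are hypothetical and are not part of any Cassandra client API; the point is only that a sorted structure lets a bounded slice begin at an arbitrary column without walking everything before it.

```python
from bisect import bisect_left

def slice_from(column_names, start_column, count):
    """Return up to `count` column names >= start_column.

    `column_names` must already be sorted, which is the assumption
    this sketch makes about how columns are ordered within a row.
    """
    begin = bisect_left(column_names, start_column)  # binary search, no scan
    return column_names[begin:begin + count]

# Hypothetical wide row; 1 million columns keeps the example quick to run.
row = ["col%08d" % i for i in range(1_000_000)]

# Start in the middle of the row and read a fixed-size slice.
page = slice_from(row, "col00500000", 3)
print(page)  # ['col00500000', 'col00500001', 'col00500002']
```

The slice size is fixed by `count`, so the amount of work per call stays bounded regardless of how wide the row is.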
Re: Offsets and Range Queries
There are several reasons.

First, there is no absolute offset. The rows are sorted by the data. If someone inserts new data between your previous query and this one, the rows have changed. Unless you are doing select queries inside a transaction with repeatable read, and your database supports this, the query you mention does not really have absolute offsets either. The results of the query can change between reads.

In Cassandra we do not execute large queries (which might result in temp tables or whatever) and then allow you to page through them. Slices have a fixed size; this ensures that the query does not execute for arbitrary lengths of time.

On Thu, Nov 15, 2012 at 6:39 AM, Ravikumar Govindarajan ravikumar.govindara...@gmail.com wrote:

> Usually we do a SELECT * FROM ORDER BY LIMIT 26,25 for pagination purposes, but specifying an offset is not available for range queries in Cassandra. I always have to specify a start key to achieve this.
>
> Are there reasons for choosing such an approach rather than providing an absolute offset?
>
> -- Ravi
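The instability of offsets described above can be illustrated with a small sketch. It assumes the sorted row keys are modelled as a plain Python list; `page_by_offset` and `page_after_key` are hypothetical helpers written for this example, not database or driver calls.

```python
from bisect import bisect_right, insort

def page_by_offset(keys, offset, limit):
    """Offset paging: skip `offset` entries, then take `limit`."""
    return keys[offset:offset + limit]

def page_after_key(keys, last_key, limit):
    """Start-key paging: take `limit` entries strictly after `last_key`."""
    begin = 0 if last_key is None else bisect_right(keys, last_key)
    return keys[begin:begin + limit]

keys = ["a", "c", "e", "g", "i", "k"]       # sorted by the data
first_page = page_by_offset(keys, 0, 3)     # ['a', 'c', 'e']

# Someone inserts new data between the first read and the second.
insort(keys, "b")                           # keys is now a b c e g i k

by_offset = page_by_offset(keys, 3, 3)
by_key = page_after_key(keys, first_page[-1], 3)

print(by_offset)  # ['e', 'g', 'i']  ('e' is shown a second time)
print(by_key)     # ['g', 'i', 'k']  (continues after the last key seen)
```

With the offset, the insert shifts every later entry by one position, so the second page repeats an entry. The start key is tied to the data rather than to a position, so the second page picks up where the first one ended.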
Re: Offsets and Range Queries
Thanks, Ed, for the clarifications.

Yes, you are correct that when using absolute offsets it is the apps, not the databases themselves, that have to handle repeatable reads, but SQL databases do provide such an option at the app's peril!

On Thu, Nov 15, 2012 at 9:03 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

> Slices have a fixed size, this ensures that the query does not execute for arbitrary lengths of time.

I assume it's because of iterators at read time, which go over results, merging/reducing/collating them one by one, and which are not so well suited for jumping to arbitrary offsets, given the practically huge number of columns involved, right? Did I understand it correctly?

We are now faced with persisting, for each page, both its first and last key for prev/next navigation. The problem quickly gets complex when we have to support multiple pages per user. I just wanted to know if there are any known workarounds for this.

-- Ravi
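One possible shape for the prev/next bookkeeping, offered as a sketch rather than a known or recommended workaround: each displayed page is reduced to its first and last key, and both directions are served from those two values. It assumes the keys are modelled as a sorted Python list; `next_page`, `prev_page`, and the `cursors` dictionary are hypothetical names introduced for this example.

```python
from bisect import bisect_left, bisect_right

def next_page(keys, last_key, limit):
    """Forward slice: up to `limit` keys strictly after `last_key`."""
    begin = 0 if last_key is None else bisect_right(keys, last_key)
    return keys[begin:begin + limit]

def prev_page(keys, first_key, limit):
    """Backward slice: up to `limit` keys strictly before `first_key`,
    returned in ascending order."""
    end = bisect_left(keys, first_key)
    return keys[max(0, end - limit):end]

keys = ["k%02d" % i for i in range(10)]    # k00 .. k09, sorted

page1 = next_page(keys, None, 3)           # first page
page2 = next_page(keys, page1[-1], 3)      # "next" uses the last key
back = prev_page(keys, page2[0], 3)        # "prev" uses the first key

# Per-user, per-view state is just a (first_key, last_key) pair.
cursors = {("user1", "inbox"): (page2[0], page2[-1])}

print(page1)  # ['k00', 'k01', 'k02']
print(page2)  # ['k03', 'k04', 'k05']
print(back)   # ['k00', 'k01', 'k02']
```

Supporting several pages per user then means keeping one such pair per open view, which is small enough to hold in a session or send back to the client with each page.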