[jira] [Commented] (CASSANDRA-6722) cross-partition ordering should have warning or be disallowed when paging

Sylvain Lebresne (JIRA) Wed, 19 Feb 2014 07:37:35 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905561#comment-13905561
 ]


Sylvain Lebresne commented on CASSANDRA-6722:
---------------------------------------------

Good catch. We can't indeed properly do post-query reordering in general if we 
page. Note that currently, there is 2 cases where we do post-query reordering:
# with a IN on the partition key and an ORDER BY (the example above).
# if there is an IN on the last clustering column of a compact table.

The 2nd case is actually a bit weird. The reason we order post-query is so the 
resultset follows the order of the IN in the query, i.e. if you have the table 
above but COMPACT and do
{noformat}
SELECT value FROM paging_test WHERE id=1 AND value IN ('b', 'a', 'c')
{noformat}
you will get [ 'b', 'a', 'c' ] in that order as a result set. So far, why not, 
but there is 2 problems in practice:
* we only do that for compact tables. For non-compact ones, we just return 
results in clustering order (we don't do the post-query ordering). That 
inconsistency is an historical accident.
* this actually take precedence over ORDER BY, which is completely broken. I.e. 
even if you add 'ORDER BY value' in the query above, the results will not be 
properly ordered.
* this is even more broken than that: CASSANDRA-6701.
This case is a mess. So because it's a problem in the context of this ticket 
and because there is no reason to guarantee any special ordering of the result 
set if there is no ORDER BY, I suggest we just remove the behavior for compact 
storage (it was an implementation detail so far) to make it in line with the 
non-compact case, thus avoiding us to have to deal with it here.

Back to the more interesting case of the example in the description: a IN on 
the partition key with ORDER BY. In that case, we could almost support paging 
properly: the reason it's broken is that the pager queries partitions of the IN 
one after the other.  But the pager could, in theory, page over each partition 
simultaneously, querying them all little by little and doing a merge sort.

It is however not all that easy in practice. If you query a full page of each 
partition and there is many partitions in the IN, you'll load tons of data in 
memory, defeating in large parts the goal of paging. If you instead query less 
than the page size of each partition, you now may need to re-query some of the 
partitions depending on what the merge sort yield on those first pages. Not 
only would that require serious refactor of the current code to properly 
handle, but it's rather unclear how efficient this would really be in general. 
I'm not sure it's really worth it in the end. 

But in any case, such solution is way out of scope for 2.0 (and probably even 
for 2.1 at this point). So we need a quick solution for now and it probably 
mean that we should throw an IRE if the query requires post-query paging and 
paging is on.

Taking a step back, I wonder if allowing post-query reordering by default was a 
good idea, and if the same way we have ALLOW FILTERING, we shouldn't have 
required an ALLOW IN-MEMORY REORDERING for such queries. Of course, it's harder 
to change that now, but maybe we could still do it by adding such flag, 
deprecate queries that don't use it by logging a warning like we've done in 
CASSANDRA-6649 for 1 or 2 versions, and forbid it completely afterwards?


> cross-partition ordering should have warning or be disallowed when paging
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6722
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6722
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Russ Hatch
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 2.0.6
>
>         Attachments: 6722.txt
>
>
> consider this schema/data/query:
> {noformat}
> CREATE TABLE paging_test (
>     id int,
>     value text,
>     PRIMARY KEY (id, value)
> ) WITH CLUSTERING ORDER BY (value ASC)
>             |id|value|
>             |1 |a    |
>             |2 |b    |
>             |1 |c    |
>             |2 |d    | 
>             |1 |e    | 
>             |2 |f    | 
>             |1 |g    | 
>             |2 |h    |
>             |1 |i    |
>             |2 |j    |
> select * from paging_test where id in (1,2) order by value asc;
> {noformat}
> When paging the above query I get the sorted results from id=1 first, then 
> the sorted results from id=2 after that. I was testing this because I was 
> curious if the paging system could somehow globally sort the results but it 
> makes sense that we can't do that, since that would require all results to be 
> collated up front.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (CASSANDRA-6722) cross-partition ordering should have warning or be disallowed when paging

Reply via email to