[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167259#comment-15167259
 ] 

Benjamin Lerer commented on CASSANDRA-10707:
--------------------------------------------

{quote}In the parser, we have the {{GROUP BY}} before the {{ORDER BY}}. On the 
one side it's more consistent with {{SQL}} but on the other side, we do happen 
to order before we group. And we kind of know that post-query ordering is not 
really something we can do due to paging (we do happen to do it in a few case 
when paging is not involved out of backward compatibility but given the 
limitation, I don't think we'll want to ever extend that) so it does feel the 
{{ORDER BY}} should come before {{GROUP BY}}.{quote}

I do not think that the order in which we perform the operations and the CQL 
syntax should be linked. If I am not mistaken, we filter the data once they 
have been sorted, in most of the cases, but the restrictions appear before the 
{{ORDER BY}} clause. In my opinion we should stick to the {{SQL}} syntax for 
that. As most of the C* users have a {{SQL}} background it will prevent some 
confusions.

[~rustyrazorblade], [~pmcfadin] any opinion on the subject?

{quote}The {{forPagingByQueryPager}} and {{forGroupByInternalPaging}} methods 
are not very intuitive: it's unclear when just looking at {{DataLimits}} why 
they exist and why you should use them or not. I'm also not entirely sure we 
need the difference between the two. If I understand correctly, the problem is 
that when a replica ends a page in the middle of a group, the coordinator don't 
know if said replica has stopped because it reaches the end of a group which 
was the last one to return (in which case we're done), or if the replica 
stopped because of the page limit (but the group itself is not finished). But 
the coordinator should be able to figure that out by checking it's row counts: 
if after having consumed the page from the replica we've reached the group 
limit _or_ there was less row in the page than the page limit, then we're done, 
otherwise we should ask for more.  Granted, that does mean in the rare case 
where we've exhausted all data exactly on the page limit, we'll do an 
additional query (that will be empty), but pretty sure you have that problem in 
one form or another whatever you do.  Am I missing something here?{quote}

Let's take an example. If we have a table with one partition key {{a}}, 2 
clustering columns {{b}} and {{c}} and the following data:
{code}
 a | b | c
---+---+---
 1 | 0 | 0 
 1 | 0 | 1
 1 | 1 | 0
 1 | 2 | 1
 1 | 2 | 2
{code}

If we group on the column {{a}} and {{b}} with a page size of 2.
The 2 first rows will be returned in by the first internal page (rowCount = 2 
and groupCount = 0 as we have not reached the end of the group).
The second internal page will read 2 rows: (1, 1, 0) and (1, 2, 1). When (1, 2, 
1) will be reached, we will know that we have enough data to return to the 
user. As (1, 2, 1) is not part of a group that should be returned to the user 
it needs to be filtered out. If the row is filtered out on the replica the 
{{GroupByLimits}} of the coordinator will only have a rowCount = 1 and a 
groupCount = 1 and will assume that it has reached the end of the data (pager 
exhausted).

If we consider that the pager is not exhausted if groupCount = groupLimit - 1 
and rowCount < rowLimit, the user might end up requesting the next page for 
nothing. I also believe that it will break paging if the page size is equals to 
1 (that part I might have to check).  

{quote} I feel keeping/passing the {{lastPartitionKey}} and {{lastClustering}} 
in {{DataLimits.CQLGroupByLimits}} and {{GroupMaker}} is not as future 
proof/generic as the patch is otherwise trying to be. Typically, if we allow 
functions, the {{lastClustering}} won't really be a clustering. Granted, it'll 
still fit into a {{ByteBuffer[]}} but it'll be a bit of an abuse. So I would 
suggest adding {{GroupMaker.State}} concept whose implementation would be 
specific to the {{GroupMaker/GroupBySpecification}} but would be transparent to 
other code. Imo that would make some code cleaner (for instance, 
{{GroupMaker.GROUP_EVERYTHING}} wouldn't need to store anything, which makes 
sense).
.{quote}

We probably do not have the same design in mind. Even if we allow functions the 
{{lastClustering}} will always be a the last clustering. The function will only 
be applied within the {{GroupMaker}}.

{quote} I'm also confused as to why {{GroupMaker.NO_GROUPING}} even exists, we 
shouldn't ever create a {{GroupMaker}} in the first place if we're not 
aggregating at all.{quote}

When we build the {{ResultSet}} we need to handle 3 scenarios: normal query, 
aggregate query (no group by) and group by query. Having a {{GroupMaker}} 
implementation for normal queries simplify the code as the same algorythm can 
be used for the 3 scenarios. As the {{GroupMaker.NO_GROUPING}} is a singleton 
there is no real cost associated to it.

I am looking into the other problems. 


> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra support aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP 
> BY partitionKey, clustering0, clustering1; 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to