[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163608#comment-15163608
 ] 

Sylvain Lebresne commented on CASSANDRA-10707:
----------------------------------------------

The first point that needs discussing is that this patch changes the inter-node 
protocol and this without bumping the protocol version, which is something 
we've never done.

On the one side bumping the protocol version is something we've kept for major 
releases and so far planned not to do before 4.0. But on the other side, not 
being able to add anything to the inter-node messages limits a lot new features 
(like this one) and doesn't make sense in a tick-tock world.

We could accept this kind of change without bumping the version on the argument 
that it's only "additive" to the serialization format and that it breaks 
nothing if the new feature is not used, and we'd document that "you shall not 
use GROUP BY until all nodes in the cluster are updated to a version that 
supports it". But it's not something to treat lightly because if a user don't 
respect that rule, older node will fail deserialization of the message, which 
will drop the connection, which can impact other on-ongoing queries, which is 
not good.

Alternatively, we could simply bump the protocol version, forgetting the "not 
before 4.0" rule. The advantage is that if we do so we can have the sending 
node (instead of the receiving one) throw if a GROUP BY is used but not all 
nodes are up to date, which is cleaner and avoid the connection dropped 
problem.  

So something we need to decide first, and maybe we need a separate ticket to 
have that discussion.

Anyway, outside of this question, some general remarks on the patch:
* In the parser, we have the {{GROUP BY}} before the {{ORDER BY}}. On the one 
side it's more consistent with {{SQL}} but on the other side, we do happen to 
order before we group. And we kind of know that post-query ordering is not 
really something we can do due to paging (we do happen to do it in a few case 
when paging is not involved out of backward compatibility but given the 
limitation, I don't think we'll want to ever extend that) so it does feel the 
{{ORDER BY}} should come before {{GROUP BY}}.
* The {{forPagingByQueryPager}} and {{forGroupByInternalPaging}} methods are 
not very intuitive: it's unclear when just looking at {{DataLimits}} why they 
exist and why you should use them or not. I'm also not entirely sure we need 
the difference between the two. If I understand correctly, the problem is that 
when a replica ends a page in the middle of a group, the coordinator don't know 
if said replica has stopped because it reaches the end of a group which was the 
last one to return (in which case we're done), or if the replica stopped 
because of the page limit (but the group itself is not finished). But the 
coordinator should be able to figure that out by checking it's row counts: if 
after having consumed the page from the replica we've reached the group limit 
_or_ there was less row in the page than the page limit, then we're done, 
otherwise we should ask for more.  Granted, that does mean in the rare case 
where we've exhausted all data exactly on the page limit, we'll do an 
additional query (that will be empty), but pretty sure you have that problem in 
one form or another whatever you do.  Am I missing something here?
* Not sure how I feel about {{GroupSelector}}. It's a bit confusing as of this 
patch and while I get that its for allowing functions later, it might not be as 
generic as we may need. For instance, if we ever want to support function on 
multiple columns (which would be possible, assuming those are consecutive PK 
columns), this won't work, at least not as implemented. I think I'd slightly 
prefer removing {{GroupSelector}} for now and just rely on extending 
{{GroupBySpecification}} for future extension (typically we'll add a 
{{withFunctionGroupBySpec}} which would have all liberty as to how it is 
implemented).  Doing so also mean we can simplify the probably common and only 
currently supported case as we just need to keep the length of the prefix we 
group by rather than a list of (dumb) selector.
* I feel keeping/passing the {{lastPartitionKey}} and {{lastClustering}} in 
{{DataLimits.CQLGroupByLimits}} and {{GroupMaker}} is not as future 
proof/generic as the patch is otherwise trying to be. Typically, if we allow 
functions, the {{lastClustering}} won't really be a clustering. Granted, it'll 
still fit into a {{ByteBuffer[]}} but it'll be a bit of an abuse. So I would 
suggest adding {{GroupMaker.State}} concept whose implementation would be 
specific to the {{GroupMaker/GroupBySpecification}} but would be transparent to 
other code. Imo that would make some code cleaner (for instance, 
{{GroupMaker.GROUP_EVERYTHING}} wouldn't need to store anything, which makes 
sense).
* {{GroupBySpeccification.NO_GROUPING}} feels confusing because it really means 
the query is not an aggregate, while the naming makes it sound like it has no 
{{GROUP BY}}.  Similarly, the {{isGrouping}} method really mean 
{{hasAggregate}}. In fact, I think it would be more clear to just have no 
{{GroupBySpecification}} in that case, keeping it {{null}} in SelectStatement 
(and in {{Selection}}). Having {{groupBySpec != null}} to mean no aggregate is 
imo more clear than {{!groupBySpec.isGrouping()}}. I'm also confused as to why 
{{GroupMaker.NO_GROUPING}} even exists, we shouldn't ever create a 
{{GroupMaker}} in the first place if we're not aggregating at all.
* In {{DataLimits}}, the naming of {{noGroupByLimits}} (and 
{{distinctNoGroupByLimits}}) is confusing (I mean, the both create a 
{{CQLGroupByLimits}}!). I think we could remove then and just pass 
{{cqlRowLimit}} directly as first argument in {{SelectStatement.getDataLimits}}.
* Related to the previous point, some phrasing is a tad confusing, like a 
comment saying {{GroupByPartitionIterator for queries without Group By}} in 
{{GroupByQueryPager}}. I think that's because {{GroupSpecification}}, 
{{GroupByLimits}} and {{GroupByQueryPager}} classes are really about handling 
aggregates in general, not query having an explicit {{GROUP BY}}.  So maybe 
renaming to {{AggregationSpecification}}, {{AggregationLimits}} and 
{{AggregationQueryPager}} would be more clear?
* The logic in {{SelectStatement.RawStatement.getGroupBySpecification}} breaks 
(doesn't throw a "proper" exception) if someone write {{SELECT * FROM t GROUP 
BY a, a}} where {{a}} is the sole PK column (the query is stupid, but it should 
throw a proper exception). More generally, I feel that the logic don't make it 
too intuitive that we're indeed checking  proper ordering of the column. As a 
nit, instead of copying both partition and clustering columns in 
{{CFMetaData.primaryKeyColumns}}, I'd use {{Iterables.concat}} instead (and use 
an iterator in getGroupBySpecification, which would make it even more apparent 
than not checking {{hasNext}} is a bad idea).
* Having {{CQLGroupByLimits}} inherit {{estimateTotalResults}} from 
{{CQLLimits}} is theoretically wrong as we should return an estimate of groups. 
Now, properly doing that is hard and so for now I'm fine reusing the one of 
{{CQLLimits}} for simplicity but we should call this out clearly by overriding 
the method (having it just call {{super.estimateTotalResults()}}) with a 
comment explaining we know it's off and should ideally be fixed someday. In 
fact, I wonder if having {{CQLGroupByLimits}} inherit {{CQLLimits}} is a good 
idea: they work sufficiently differently that it feels risky to have method 
inherited silently. The only other method not overriden is 
{{hasEnoughLiveData}} and we could easily use a {{private static}} helper for 
code reuse in that case. Anyway, I don't mind the inheritence terribly if you 
prefer but figured I'd mention it.
* Not sure about {{CQLGroupByLimits.forShortReadRetry()}}. I believe putting no 
limit on the number of rows (and only on the group) might lead to OOM.  In 
fact, I need to think more carefully about this but I'm not 100% sure that the 
short read logic isn't throw off by the fact that {{counted()}} returns a 
number of groups not rows.
* In {{GroupByQueryPager}}, both iterator implementation have a 
{{correctPageSize()}} method that is unused. And in the case of 
{{AggregationPartitionIterator}}, it seems fishy that it's not used.
* Shouldn't {{GroupByQueryPager.GoupByPartitionIterator.close()}} set 
{{closed}} if it isn't set?
* I think we lack some validation/handling of composite partition keys. If I 
have {{PRIMARY KEY ((a, b), c)}} and I do {{SELECT count( *) FROM t GROUP BY 
a}}, I believe this silently do the grouping by both {{a}} and {{b}}. We should 
reject this since we have no way to handle it.


Also, a bunch of minor nits:
* {{GroupByQueryPager.GroupByPartitionIterator.GroupByRowIterator}} could call 
{{rowIterator.partitionKey()}} rather than stores the partition key separately. 
Regarding that class, we also should remove all the {{@Override}} according to 
the [code style|https://wiki.apache.org/cassandra/CodeStyle].
* Why do we have {{rowPageSize}} in {{CQLGroupByLimits}}. The ctor seems to 
always make it equal to {{rowLimit}} and both fields are {{final}} (so will 
stay equal).
* In {{Selection.resultSetBuilder}}, the {{isJons}} parameter is a typo for 
{{isJson}}. I know this is not new to this patch but lets fix it rather than 
duplicate the typo.
* Your IDE is not respecting the import order of the code-style. It moved 
{{com.google}} imports after the {{org.apache}} ones in {{Selection.java}} for 
instance (see the [code style|https://wiki.apache.org/cassandra/CodeStyle]). I 
also don't love that it expands {{*}} as this adds noise to patches but that 
might be more personal.
* We can now remove the {{// Note that if there are some nodes in the cluster 
with a version less than 2.0, we can't use paging (CASSANDRA-6707)}} line (in 
{{SelectStatement.getDataLimits}}).
* Could add a comment in 
{{SelectStatement.RawStatement.getGroupBySpecification()}} as to why we don't 
do anything with partition key columns.
* Was there a reason to change {{s/cqlRowLimit/userLimit}} in the last 2 lines 
of {{SelectStatement.getDataLimits()}}? Pretty sure it doesn't matter but since 
I don't find the change particularly better I'm left wondering if I'm missing 
the reason for the change.
* Could be wrong, but it doesn't seem we need to add {{withUpdatedLimit()}} to 
{{ReadQuery}} (that is, we only use it on a {{ReadCommand}}). If true, I'd 
rather remove it from {{ReadQuery}} which allow to remove the unused 
implementation in {{SinglePartitionReadCommand.Group}}.
* Indentation for the parameters of {{GroupMaker.newInstance}} is off.
* It would make more sense for {{QueryPager.EMPTY.withUpdatedLimit()}} to throw 
{{UnsupportedOperationException}}.
* In {{DataLimits.CQLLimits.hasEnoughLiveData}}, a comment has been butchered.
* Some indentation is off in {{DataLimits.Serializer.deserialize}}.
* In {{GroupByQueryPager.handlePagingOff}}, the comment is incomplete.
* Unecessary import of {{Optional}} in {{Restrictions.java}}.


> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra support aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP 
> BY partitionKey, clustering0, clustering1; 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to