[
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108289#comment-15108289
]
Benjamin Lerer commented on CASSANDRA-10707:
--------------------------------------------
Given that we already allow queries like {{SELECT
max((*)), min((*)), count((*)) FROM myTable;}}, I do not think that we
can support {{GROUP BY}} queries without also allowing grouping by
partition key.
Some users might be interested in queries like: {{SELECT max((*)), min((*)),
count((*)) FROM myTable GROUP BY partitionKey;}} or {{SELECT max((*)), min((*)),
count((*)) FROM myTable WHERE partitionKey IN (1, 2, 3) GROUP BY partitionKey;}}
Clearly, those queries are not recommended, and the timeouts will probably
need to be adjusted. As with the current aggregate queries, a warning will be
logged if the partition key is not restricted by an equality.
The problem is not really the work needed to compute the aggregates. It is just
the fact that the data has to be retrieved from other nodes.
In the future, we might manage to push the aggregate computation to the
replicas but we are not there yet.
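To illustrate why the computation itself is cheap, here is a minimal Python
sketch of grouping by partition key once the rows have reached the coordinator.
It is not Cassandra code: the function name {{group_by_partition}} and the
(partition key, value) row layout are assumptions made for the example. The
aggregation is a single pass over the rows; the expensive part, fetching those
rows from the other nodes, is not modelled here.

```python
def group_by_partition(rows):
    """Compute max, min and count per partition key in one pass.

    `rows` is an iterable of (partition_key, value) pairs, standing in
    for rows that have already been fetched from the replicas.
    """
    groups = {}
    for partition_key, value in rows:
        agg = groups.get(partition_key)
        if agg is None:
            # First row seen for this partition: initialise the aggregates.
            groups[partition_key] = {"max": value, "min": value, "count": 1}
        else:
            agg["max"] = max(agg["max"], value)
            agg["min"] = min(agg["min"], value)
            agg["count"] += 1
    return groups


# Hypothetical rows for three partitions.
rows = [(1, 10), (1, 3), (2, 7), (3, 5), (3, 9), (3, 1)]
result = group_by_partition(rows)
```

Each row is touched once and only one small record is kept per group, so the
cost is dominated by how many rows have to be read and shipped, which is the
concern raised above.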
> Add support for Group By to Select statement
> --------------------------------------------
>
> Key: CASSANDRA-10707
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
>
> Now that Cassandra supports aggregate functions, it makes sense to support
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP
> BY partitionKey, clustering0, clustering1;
> {code}