[
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076378#comment-15076378
]
Benjamin Lerer edited comment on CASSANDRA-10707 at 1/1/16 9:36 PM:
--------------------------------------------------------------------
Both will be supported.
What will not be supported is a {{GROUP BY}} clause that specifies only part of
the partition key. For example, if a table has a primary key like
{{PRIMARY KEY((partitionKey1, partitionKey2), clustering1, clustering2)}}, the
following query will not be supported:
{{SELECT partitionKey1, MAX(value) FROM myTable GROUP BY partitionKey1}}
As with the aggregates, the grouping will be performed on the coordinator node.
Consequently, if the driver uses the token-aware policy, a query containing a
partition key predicate will be more efficient, as the aggregates will be built
on the node where the data is located.
From a syntax point of view, the queries:
{{SELECT partitionKey, clusteringColumn1, MAX(value) FROM myTable WHERE
partitionKey=5 GROUP BY partitionKey, clusteringColumn1;}}
and {{SELECT partitionKey, clusteringColumn1, MAX(value) FROM myTable WHERE
partitionKey=5 GROUP BY clusteringColumn1;}} will both be supported because
the {{partitionKey}} column is restricted by an {{=}} operator.
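The rule described in this comment can be sketched as a small check. This is only an illustration of the rule as stated here, not Cassandra's actual validation code: the function name {{group_by_supported}} and its parameters are hypothetical, and the requirement that clustering columns be grouped in declaration order is an assumption that the comment does not state.

```python
from typing import Iterable, Sequence


def group_by_supported(
    partition_key: Sequence[str],
    clustering: Sequence[str],
    group_by: Sequence[str],
    eq_restricted: Iterable[str] = (),
) -> bool:
    """Return True if the GROUP BY column list is accepted under the sketched rule.

    Hypothetical helper: models only the rule from the comment above.
    """
    restricted = set(eq_restricted)
    primary_key = list(partition_key) + list(clustering)
    pos = 0
    for col in group_by:
        # Walk forward through the primary key until we reach the grouped column.
        # A column may be skipped only if the WHERE clause pins it with '='.
        while pos < len(primary_key) and primary_key[pos] != col:
            if primary_key[pos] not in restricted:
                return False
            pos += 1
        if pos == len(primary_key):
            # The grouped column is not a primary key column (or is out of order).
            return False
        pos += 1
    # The partition key must be fully covered: each remaining partition key
    # column has to be restricted by '=', otherwise only part of it was given.
    return all(c in restricted for c in partition_key[pos:])


# Partial partition key in GROUP BY: not supported.
print(group_by_supported(
    ["partitionKey1", "partitionKey2"], ["clustering1", "clustering2"],
    ["partitionKey1"]))

# partitionKey restricted by '=': it may be listed or omitted.
print(group_by_supported(
    ["partitionKey"], ["clusteringColumn1"],
    ["clusteringColumn1"], eq_restricted=["partitionKey"]))
```

Under these assumptions the first call reports the partial-partition-key query as unsupported, and the second accepts the query that omits {{partitionKey}} from the {{GROUP BY}} because it is restricted by {{=}}.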
> Add support for Group By to Select statement
> --------------------------------------------
>
> Key: CASSANDRA-10707
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
>
> Now that Cassandra supports aggregate functions, it makes sense to support
> {{GROUP BY}} in {{SELECT}} statements.
> It should be possible to group either at the partition level or at the
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable
>   GROUP BY partitionKey, clustering0, clustering1;
> {code}