[jira] [Comment Edited] (CASSANDRA-9767) Allow the selection of columns together with aggregates

Ajay (JIRA) Thu, 09 Jul 2015 07:40:39 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620583#comment-14620583
 ]


Ajay edited comment on CASSANDRA-9767 at 7/9/15 2:33 PM:
---------------------------------------------------------

Yes, the perfect example would have been GROUP BY, but I was trying to give an 
example within a single partition. The idea is the same. Now, with aggregate 
functions in 2.2, CQL supports selecting columns along with other 
aggregate functions (as I see from the samples), so it should work for 
COUNT(*) as well.


was (Author: ajaygarga):
Yes the perfect example could have been GROUP BY. But I was trying to give an 
example within a single partition. But the idea is the same. Now with aggregate 
functions in 2.2, the CQL is supporting pulling columns along with other 
aggregate functions (from the sample I noticed). So it should work for COUNT 
(*) as well.

> Allow the selection of columns together with aggregates
> -------------------------------------------------------
>
>                 Key: CASSANDRA-9767
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9767
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>         Environment: Cassandra 2.0.16
> Ubuntu 15.04
>            Reporter: Ajay
>            Assignee: Benjamin Lerer
>            Priority: Minor
>
> Let's assume we have a column family as below:
> create table sample ( track_id int, user_id int, country varchar, primary key 
> ((track_id), user_id));
> where track_id is the partition key.
> Now, to count the rows for a single track_id, we can query using 
> CQL as below:
> select count(*) from sample where track_id = 1 and user_id = 1;
> But that returns only the count. If we need other columns along with 
> the count, we cannot query as below, as it throws an error:
>  select count(*), country  from sample where track_id = 1 and user_id = 1;
> Bad Request: line 1:15 mismatched input ',' expecting K_FROM.
> In this case, all rows for a given track_id and user_id will have the same 
> value for country, so we should be able to query as above. Also, in SQL it 
> is possible to select columns along with aggregate functions.
> Though I know that Cassandra is not an analytics engine (unlike Hadoop and 
> Spark), we need some basic aggregate functions like min, max, avg, etc. 
> Though it might not be efficient performance-wise, it is better done on the 
> Cassandra side (as it uses the native protocol) than fetching all rows in 
> the client and doing the basic aggregation there. Cassandra cannot be used 
> just as a data store (garbage in, garbage out). In that context, CQL is 
> currently pretty limited. Just to get data out of Cassandra, we would have 
> to use Spark, even though we would not be doing much analytics on it.
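
For comparison, the client-side fallback the description mentions (fetching all rows and aggregating in the application) might look roughly like the sketch below. It is only an illustration: the function name count_with_columns and the hard-coded rows are hypothetical stand-ins for whatever a driver query would return, and keeping the first value seen for the extra column is an assumption made for this example.

```python
def count_with_columns(rows, key_columns, carried_column):
    """Count rows per key and keep the first value seen for one extra column.

    rows is an iterable of dicts (hypothetical stand-in for driver results).
    """
    result = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in result:
            # First row for this key: remember its value for the carried column.
            result[key] = {"count": 0, carried_column: row[carried_column]}
        result[key]["count"] += 1
    return result


# Hypothetical rows shaped like the "sample" table from the description.
rows = [
    {"track_id": 1, "user_id": 1, "country": "IN"},
    {"track_id": 1, "user_id": 2, "country": "US"},
    {"track_id": 2, "user_id": 1, "country": "IN"},
]

summary = count_with_columns(rows, ("track_id",), "country")
for key, value in summary.items():
    print(key, value)
```

Every row has to cross the network before it is counted, which is the overhead the description argues a server-side aggregate would avoid.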



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
