[jira] [Commented] (PHOENIX-1772) Add CUBE/ROLLUP/GROUPING SET operators for advanced aggregations

James Taylor (JIRA) Sat, 26 Mar 2016 13:46:47 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213194#comment-15213194
 ]


James Taylor commented on PHOENIX-1772:
---------------------------------------

Hi Prasad,
I think an interesting place to start would be to see how we can integrate with 
Kylin. Phoenix is aggressively moving to run on top of Calcite, which also 
underpins Kylin (see our calcite branch, where almost all querying is already 
working). Assuming we do the work in the calcite branch, what would an 
integration look like? I think figuring out our options would be your main 
effort, with technical guidance from all three projects: Calcite, Kylin, and 
Phoenix. You'd need to drive it, though. Are you up for it, [~prasad]?
Regards,
James

> Add CUBE/ROLLUP/GROUPING SET operators for advanced aggregations
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-1772
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1772
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Jayapriya Surendran
>              Labels: gsoc2016, java, sql
>         Attachments: GSoCProposal.pdf
>
>
> I noticed from the Phoenix language documentation ( 
> http://phoenix.apache.org/language/index.html ) that Phoenix is missing the 
> CUBE/ROLLUP and GROUPING SETS operators, which are already supported by 
> similar projects such as Apache Pig and Apache Hive. Here is a brief overview 
> of my proposal (the proposed syntax is the same as PostgreSQL's: 
> https://wiki.postgresql.org/wiki/Grouping_Sets).
> *Proposed syntax for CUBE:*
> SELECT name, place, SUM(count) FROM cars GROUP BY CUBE(name, place);
> For every row that we process we need to emit 2^n rows, where n is the 
> number of grouping columns. For the above example query, for every row we 
> need to emit 4 rows, one for each level of aggregation: 
> {(name, place), (name, *), (*, place), (*, *)}.
> *Proposed syntax for ROLLUP:*
> SELECT name, place, SUM(count) FROM cars GROUP BY ROLLUP(name, place);
> For every row that we process we need to emit n+1 rows, where n is the 
> number of grouping columns. For the above example query, for every row we 
> need to emit 3 rows, one for each hierarchical level of aggregation: 
> {(name, place), (name, *), (*, *)}.
> *Proposed syntax for GROUPING SETS:*
> SELECT name, place, SUM(count) FROM cars GROUP BY GROUPING SETS(name, ());
> For every row that we process we need to emit n rows, where n is the 
> number of grouping sets listed. For the above example query, for every row 
> we need to emit 2 rows, one for each specified level of aggregation: 
> {(name, *), (*, *)}.
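
The row-expansion idea in the issue description can be sketched as follows. This is only an illustration in Python, not Phoenix or Calcite code: the function names (cube_sets, rollup_sets, aggregate) and the sample rows are hypothetical. It marks rolled-up columns with "*" as the description does, whereas SQL engines generally return NULL in those positions.

```python
from collections import defaultdict
from itertools import combinations


def cube_sets(cols):
    """All 2^n subsets of the grouping columns, largest first (CUBE)."""
    return [frozenset(c)
            for r in range(len(cols), -1, -1)
            for c in combinations(cols, r)]


def rollup_sets(cols):
    """The n+1 prefixes of the grouping columns, longest first (ROLLUP)."""
    return [frozenset(cols[:i]) for i in range(len(cols), -1, -1)]


def aggregate(rows, cols, grouping_sets, measure):
    """Emit one key per (input row, grouping set) and sum the measure per key.

    Assumes the grouping sets are distinct and that no data value is "*".
    """
    totals = defaultdict(int)
    for row in rows:
        for gs in grouping_sets:
            # Columns outside the current grouping set are rolled up to "*".
            key = tuple(row[c] if c in gs else "*" for c in cols)
            totals[key] += row[measure]
    return dict(totals)


# Hypothetical sample data for the "cars" table used in the queries above.
rows = [
    {"name": "audi", "place": "berlin", "count": 2},
    {"name": "audi", "place": "paris", "count": 3},
    {"name": "bmw", "place": "berlin", "count": 5},
]
cols = ["name", "place"]

cube = aggregate(rows, cols, cube_sets(cols), "count")
rollup = aggregate(rows, cols, rollup_sets(cols), "count")
# GROUPING SETS(name, ()) lists its sets explicitly.
explicit = aggregate(rows, cols, [frozenset({"name"}), frozenset()], "count")

print(cube[("audi", "*")])    # 5
print(rollup[("*", "*")])     # 10
print(sorted(explicit))
```

Expanding every input row once per grouping set keeps the sketch close to the description, at the cost of multiplying intermediate rows by the number of grouping sets.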



