[ 
https://issues.apache.org/jira/browse/TAJO-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386219#comment-14386219
 ] 

Atri Sharma commented on TAJO-256:
----------------------------------

I understand that CUBE is exponential in the number of grouping columns (2^n 
grouping sets for n columns), but I believe its memory use can be bounded if we 
handle the processing of each grouping set carefully.

What I plan to do is this:

Consider a ROLLUP operation, ROLLUP (a,b,c). The grouping sets (GS) generated 
for this are (a,b,c), (a,b), (a), (). All of these GS can be processed in a 
single scan of the data set, since the three smaller GS are subsets (in fact, 
prefixes) of the largest one. Hence, for CUBE we can generate the needed GS, 
group them into the minimum possible number of ROLLUP-style GS chains, process 
each chain independently, and then combine the results.
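
To make the grouping step concrete, here is a rough Python sketch (not Tajo 
code; the function names rollup_sets, cube_sets and group_into_chains are made 
up for illustration). It generates the GS for ROLLUP and CUBE and then greedily 
partitions a list of GS into chains of nested sets, where each chain could be 
evaluated like one ROLLUP. The greedy pass is an assumption on my side and is 
not guaranteed to produce the minimum number of chains in general.

```python
from itertools import combinations


def rollup_sets(cols):
    """Grouping sets of ROLLUP(cols): every prefix, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """Grouping sets of CUBE(cols): every subset, largest first."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]


def group_into_chains(grouping_sets):
    """Greedily partition grouping sets into chains of nested sets.

    Each chain is ordered largest to smallest, and every set in it is a
    strict subset of the one before it, so the chain can be treated like
    a ROLLUP over a suitable column order. This is a heuristic sketch:
    it does not guarantee the minimum possible number of chains.
    """
    chains = []
    for gs in sorted(grouping_sets, key=len, reverse=True):
        for chain in chains:
            if set(gs) < set(chain[-1]):  # strict subset of the chain's tail
                chain.append(gs)
                break
        else:
            # No existing chain can absorb this set, so start a new one.
            chains.append([gs])
    return chains


if __name__ == "__main__":
    print(rollup_sets(["a", "b", "c"]))
    for chain in group_into_chains(cube_sets(["a", "b", "c"])):
        print(chain)
```

For CUBE (a,b,c) this groups the 8 GS into 3 chains, so 3 ROLLUP-style passes 
instead of 8 separate aggregations.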

Makes sense?

> Support data cube (Umbrella)
> ----------------------------
>
>                 Key: TAJO-256
>                 URL: https://issues.apache.org/jira/browse/TAJO-256
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, distributed query plan, parser
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>
> This issue includes the following sub-issues:
> * SQL support of group by extensions (GROUPING SETS, CUBE, ROLLUP)
> * Query execution of group by extensions
> * GROUPING() function
> * Data cube materialization process
> * Cube schema maintenance
> * Sample-based cost estimation



