[ https://issues.apache.org/jira/browse/TAJO-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386219#comment-14386219 ]
Atri Sharma commented on TAJO-256:
----------------------------------
I understand that the CUBE algorithm is exponential in the number of grouping
columns, but I believe its memory use can be bounded if we handle the
processing of each grouping set carefully.
What I plan to do is this:
Consider a ROLLUP operation, ROLLUP (a,b,c). The grouping sets (GS) generated
for it are (a,b,c), (a,b), (a), (). All of these can be processed in a single
scan of the data set, since the three smaller GS are subsets of the largest
one. Hence, what we can do is generate the needed GS, group them into the
minimum possible number of ROLLUP-style GS sets, process each of those sets
independently, and then combine the results.
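
To make the plan concrete, here is a rough Python sketch. It is not Tajo code:
every function name is made up for illustration, SUM stands in for an arbitrary
aggregate, and the grouping step is a simple greedy pass that is not guaranteed
to find the true minimum number of sets in general. The per-set scan keeps a
hash table per GS for brevity; a real executor would more likely sort on the
ROLLUP column order and stream.

```python
from collections import defaultdict
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP(a, b, c): every prefix of the column list, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE(a, b, c): all 2**n subsets of the column list, largest first."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]


def group_into_chains(grouping_sets):
    """Greedily pack grouping sets into ROLLUP-like chains.

    Within a chain, each grouping set is a proper subset of the one before
    it, so the whole chain can be answered from one scan of the data.
    """
    chains = []
    for gs in sorted(grouping_sets, key=len, reverse=True):
        for chain in chains:
            if set(gs) < set(chain[-1]):
                chain.append(gs)
                break
        else:
            chains.append([gs])
    return chains


def aggregate_chain(rows, chain, measure):
    """Single pass over rows, keeping a running SUM for every GS in the chain."""
    totals = {gs: defaultdict(int) for gs in chain}
    for row in rows:
        for gs in chain:
            totals[gs][tuple(row[c] for c in gs)] += row[measure]
    return {gs: dict(groups) for gs, groups in totals.items()}


# Hypothetical input rows: three grouping columns and one measure column "v".
rows = [
    {"a": 1, "b": "x", "c": "p", "v": 10},
    {"a": 1, "b": "x", "c": "q", "v": 5},
    {"a": 1, "b": "y", "c": "p", "v": 7},
    {"a": 2, "b": "x", "c": "p", "v": 3},
]

# ROLLUP (a,b,c) is a single chain, so one scan answers all four GS.
result = aggregate_chain(rows, rollup_sets(["a", "b", "c"]), "v")

# CUBE (a,b,c) has 8 GS; the greedy pass packs them into chains, each of
# which would get its own scan before the per-chain results are combined.
cube_chains = group_into_chains(cube_sets(["a", "b", "c"]))
```

For CUBE (a,b,c) the greedy pass yields three chains, so the eight GS need
three scans instead of eight.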
Makes sense?
> Support data cube (Umbrella)
> ----------------------------
>
> Key: TAJO-256
> URL: https://issues.apache.org/jira/browse/TAJO-256
> Project: Tajo
> Issue Type: New Feature
> Components: catalog, distributed query plan, parser
> Reporter: Jihoon Son
> Assignee: Jihoon Son
>
> This issue includes the following sub-issues:
> * SQL support of group by extensions (GROUPING SETS, CUBE, ROLLUP)
> * Query execution of group by extensions
> * GROUPING() function
> * Data cube materialization process
> * Cube schema maintenance
> * Sample-based cost estimation
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)