[jira] [Commented] (TAJO-256) Support data cube (Umbrella)

Jihoon Son (JIRA) Thu, 26 Mar 2015 04:38:43 -0700

    [ 
https://issues.apache.org/jira/browse/TAJO-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381752#comment-14381752
 ]


Jihoon Son commented on TAJO-256:
---------------------------------

Currently, only the grammar part is implemented. You can see it in SQLParser.g4.
The remaining parts are logical planning, global planning, physical planning, 
and query execution. 
For the planning part, there is some code that I wrote a long time ago.
IMO, it would be better to start over from scratch. Here are some reasons.

As you may know, the naive algorithm for the cube operation runs a separate 
group-by for every combination of aggregation keys (2^n group-bys for n keys). 
Since this naive method incurs a huge overhead, we should find a better solution.
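
To make the cost concrete, here is a minimal sketch of the naive method in
Python. It is only an illustration, not Tajo code: the function name
naive_cube, the dict-based rows, and the choice of SUM as the aggregate are
all assumptions made for the example. Note that the input is rescanned once
per key combination.

```python
from collections import defaultdict
from itertools import combinations


def naive_cube(rows, keys, measure):
    """Hypothetical helper: one independent SUM group-by per subset of keys."""
    result = {}
    for size in range(len(keys), -1, -1):
        for subset in combinations(keys, size):
            sums = defaultdict(int)
            # Every key combination triggers a full rescan of the input rows.
            for row in rows:
                group = tuple(row[k] for k in subset)
                sums[group] += row[measure]
            result[subset] = dict(sums)
    return result


# Made-up sample data, for illustration only.
rows = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "east", "product": "b", "sales": 5},
    {"region": "west", "product": "a", "sales": 7},
]
cube = naive_cube(rows, ("region", "product"), "sales")
```

With two keys this already performs four scans of the input; each additional
key doubles the number of scans.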

As commented above, I tried to resolve this problem by sharing common group-by 
results. In addition, to represent data shared between group-by plans, I tried 
to extend Tajo's query plan from a tree form to a DAG form (TAJO-266). This 
work is contained in a separate branch called DAG-execplan. However, that 
branch has not been maintained for a long time. In addition, I'm no longer sure 
about this approach. I think there should be a better and much easier 
solution.
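
For reference, the general idea of sharing group-by results can be sketched
as follows in Python. This is not the code in the DAG-execplan branch; the
function name shared_cube and the data layout are assumptions for the
example. The input is scanned once for the finest grouping, and every coarser
grouping is derived from an already computed grouping that has one more key.
This only works for distributive aggregates such as SUM, COUNT, MIN, and MAX.

```python
from collections import defaultdict
from itertools import combinations


def shared_cube(rows, keys, measure):
    """Hypothetical helper: SUM cube that reuses finer group-by results."""
    keys = tuple(keys)
    # Scan the raw input only once, for the finest grouping (all keys).
    base = defaultdict(int)
    for row in rows:
        base[tuple(row[k] for k in keys)] += row[measure]
    result = {keys: dict(base)}

    # Derive each coarser grouping from a parent that has one extra key.
    for size in range(len(keys) - 1, -1, -1):
        for subset in combinations(keys, size):
            extra = next(k for k in keys if k not in subset)
            parent = tuple(k for k in keys if k in subset or k == extra)
            keep = [i for i, k in enumerate(parent) if k in subset]
            sums = defaultdict(int)
            # Roll up the parent's partial sums instead of rescanning rows.
            for group, total in result[parent].items():
                sums[tuple(group[i] for i in keep)] += total
            result[subset] = dict(sums)
    return result


# Made-up sample data, for illustration only.
sample = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "east", "product": "b", "sales": 5},
    {"region": "west", "product": "a", "sales": 7},
]
shared = shared_cube(sample, ("region", "product"), "sales")
```

The dependencies between groupings here form a lattice rather than a tree,
which is why a tree-shaped query plan cannot express the sharing directly.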

Interestingly, some papers have recently been published on efficient execution 
of the cube operation in distributed systems. I think we should survey those 
materials.

> Support data cube (Umbrella)
> ----------------------------
>
>                 Key: TAJO-256
>                 URL: https://issues.apache.org/jira/browse/TAJO-256
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, distributed query plan, parser
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>
> This issue includes the following sub-issues:
> * SQL support of group by extensions (GROUPING SETS, CUBE, ROLLUP)
> * Query execution of group by extensions
> * GROUPING() function
> * Data cube materialization process
> * Cube schema maintenance
> * Sample-based cost estimation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
