[ 
https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127465#comment-17127465
 ] 

wangrupeng commented on KYLIN-4498:
-----------------------------------

Cube Planner Proposal

Cube Planner evaluates the cost and benefit of each dimension combination 
(cuboid) and selects a cost-effective set of cuboids to improve cube build 
efficiency and query performance. Cube Planner works in two phases.
Cube Planner was designed and contributed by eBay. For the principles behind 
it, see 
https://tech.ebayinc.com/engineering/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/

In my opinion, to let Cube Planner support Kylin on Parquet, we need to make 
some changes to the current Spark engine for cube building. My suggestion is 
as follows:
The front-end interaction remains the same as before. 

Phase 1 (building the cube for the first time):
1. Add a new step that calculates the row count of each cuboid with Spark, 
before the cube building step (Kylin on Parquet currently has two steps for 
cube building).
2. During the cube building step, recommend a list of cuboids with the greedy 
algorithm or the genetic algorithm before building the cube. The existing code 
of these two algorithms can be reused.
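
To make Phase 1 more concrete, here is a rough sketch in plain Python. It is 
not Kylin's code and not the reusable greedy implementation mentioned above: 
the function names, the frozenset representation of a cuboid, and the 
`max_total_rows` budget are all assumptions made for illustration, and the 
row counting that Spark would do is replaced by an in-memory loop. The 
selection rule is a simplified "benefit per row" greedy.

```python
from itertools import combinations


def cuboid_row_counts(rows, num_dims):
    """Return {cuboid: distinct row count}; a cuboid is a frozenset of dimension indexes."""
    counts = {}
    for size in range(num_dims + 1):
        for dims in combinations(range(num_dims), size):
            # Rows of a cuboid = distinct value combinations of its dimensions.
            counts[frozenset(dims)] = len({tuple(row[i] for i in dims) for row in rows})
    return counts


def greedy_select(counts, base, max_total_rows):
    """Hypothetical greedy: keep adding the cuboid with the best benefit per row."""
    selected = {base}
    # Cost of answering a query on a cuboid = rows of its smallest selected superset.
    cost = {cuboid: counts[base] for cuboid in counts}
    used = counts[base]
    while True:
        best, best_ratio = None, 0.0
        for cand, rows in counts.items():
            if cand in selected or used + rows > max_total_rows:
                continue
            # Benefit = rows saved over all cuboids that cand can answer (its subsets).
            benefit = sum(cost[d] - rows for d in counts if d <= cand and cost[d] > rows)
            ratio = benefit / max(rows, 1)
            if ratio > best_ratio:
                best, best_ratio = cand, ratio
        if best is None:
            return selected
        selected.add(best)
        used += counts[best]
        for d in counts:
            if d <= best:
                cost[d] = min(cost[d], counts[best])


# Toy fact table with two dimensions: (country, device).
fact_rows = [(c, d) for c in ("US", "CN", "DE") for d in ("pc", "mobile")]
counts = cuboid_row_counts(fact_rows, num_dims=2)
recommended = greedy_select(counts, base=frozenset({0, 1}), max_total_rows=9)
```

The base cuboid is always kept, so every query can still be answered even if 
the budget leaves no room for any other cuboid.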

Phase 2 (the cube has been in use for a while):
1. Use the System Cube, which can now be used normally, to collect query 
metrics (including the rows and bytes scanned per cuboid).
2. Add a new Spark job that optimizes and rebuilds the cube with the 
information collected by the System Cube.
3. The steps of the new optimize job:
    a. Use the query metrics to recommend cuboids.
    b. Rebuild the old segments by removing unneeded cuboids and adding the 
needed ones. The Kylin on Parquet build engine can add only the cuboids (now 
also called "layouts") we need, without rebuilding all cuboids of the segment.
    c. Update the metadata.
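
The optimize job in step 3 could be sketched roughly as below. This is only 
an illustration under assumptions: the metric record field `cuboid_id`, the 
`min_hits` threshold, and both function names are hypothetical, and the real 
recommendation would also weigh scanned rows and bytes rather than just 
counting hits.

```python
from collections import defaultdict


def recommend_from_metrics(metrics, min_hits):
    """Hypothetical rule: recommend every cuboid queried at least min_hits times."""
    hits = defaultdict(int)
    for record in metrics:
        hits[record["cuboid_id"]] += 1
    return {cuboid_id for cuboid_id, n in hits.items() if n >= min_hits}


def plan_layout_changes(current, recommended, base_cuboid):
    """Return (layouts to add, layouts to remove); the base cuboid is never removed."""
    target = set(recommended) | {base_cuboid}
    to_add = sorted(target - set(current))
    to_remove = sorted(set(current) - target)
    return to_add, to_remove


# Cuboid ids written as dimension bitmasks; 0b111 is the base cuboid.
metrics = [{"cuboid_id": c} for c in (0b101, 0b101, 0b101, 0b110, 0b011, 0b011)]
recommended = recommend_from_metrics(metrics, min_hits=2)
to_add, to_remove = plan_layout_changes({0b111, 0b110, 0b011}, recommended, 0b111)
```

Only the layouts in `to_add` would need to be built for the existing 
segments, which matches the point in step b that the segment does not have 
to be rebuilt in full.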

> CubePlaner for Kylin on Parquet
> -------------------------------
>
>                 Key: KYLIN-4498
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4498
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: wangrupeng
>            Assignee: wangrupeng
>            Priority: Minor
>             Fix For: v4.0.0-beta
>
>
> CubePlanner doesn't support Kylin on Parquet yet. We need this to be 
> more resource efficient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
