[ https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127465#comment-17127465 ]
wangrupeng commented on KYLIN-4498:
-----------------------------------

Cube Planner Proposal

Cube Planner checks the costs and benefits of each dimension combination and selects a cost-effective set of dimension combinations to improve cube build efficiency and query performance. Cube Planner has two phases. It was designed and contributed by eBay. See https://tech.ebayinc.com/engineering/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/ for more on the principles behind Cube Planner.

In my opinion, to let Cube Planner support Kylin on Parquet, we need to make some changes to the current Spark engine for building cubes. My suggestion is as follows. The front-end interaction remains the same as before.

Phase 1 (building the cube for the first time):
1. Add a new step that calculates the row count of each cuboid with Spark, before the cube building step (Kylin on Parquet currently has two steps for cube building).
2. During the cube building step, recommend a list of cuboids with the greedy algorithm or the genetic algorithm before building the cube. The code of these two algorithms can be reused.

Phase 2 (the cube has been in use for a while):
1. Use the System Cube, which now works normally, to collect query metrics (including the rows and bytes scanned per cuboid).
2. Add a new Spark job that optimizes and rebuilds the cube with the information collected by the System Cube.
3. The steps of the new optimize job:
   a. Use the query metrics to recommend cuboids.
   b. Rebuild the old segments by removing unneeded cuboids and adding needed ones. The Kylin on Parquet build engine can add only the cuboids (now also called "layouts") we need, without rebuilding all cuboids of the segment.
   c. Update metadata.

> CubePlaner for Kylin on Parquet
> -------------------------------
>
>                 Key: KYLIN-4498
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4498
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: wangrupeng
>            Assignee: wangrupeng
>            Priority: Minor
>             Fix For: v4.0.0-beta
>
>
> CubePlanner still doesn't support Kylin on Parquet yet. We need this to be
> more resource efficient.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
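
For readers unfamiliar with the recommendation step in Phase 1, the following is a minimal sketch of a greedy, benefit-per-cost cuboid selection. It is not Kylin's implementation. The function names (`recommend_cuboids`, `is_ancestor`), the bitmask encoding of cuboids, and the `max_cuboids` limit are assumptions made for illustration. It also assumes that every cuboid is a subset of the base cuboid and that the row counts come from the estimation step described above.

```python
from typing import Dict, List, Optional


def is_ancestor(parent: int, child: int) -> bool:
    """A cuboid can answer another one if it contains all of its dimensions."""
    return parent & child == child


def recommend_cuboids(rows: Dict[int, int], base: int, max_cuboids: int) -> List[int]:
    """Greedily pick cuboids by benefit-to-cost ratio (hypothetical sketch).

    rows: estimated row count per cuboid id (a bitmask of dimensions).
    base: the base cuboid, which is always selected.
    max_cuboids: upper bound on how many cuboids to keep.
    """
    selected: List[int] = [base]
    # Rows scanned today to answer each cuboid: initially everything
    # is served from the base cuboid.
    serve_cost = {c: rows[base] for c in rows}

    while len(selected) < max_cuboids:
        best: Optional[int] = None
        best_ratio = 0.0
        for cand in rows:
            if cand in selected:
                continue
            # Rows saved across every cuboid that `cand` could answer.
            benefit = sum(
                max(serve_cost[c] - rows[cand], 0)
                for c in rows
                if is_ancestor(cand, c)
            )
            # Cost is the candidate's own size; guard against zero rows.
            ratio = benefit / max(rows[cand], 1)
            if ratio > best_ratio:
                best, best_ratio = cand, ratio
        if best is None:
            break  # no remaining cuboid brings any benefit
        selected.append(best)
        for c in rows:
            if is_ancestor(best, c):
                serve_cost[c] = min(serve_cost[c], rows[best])
    return selected
```

As a usage example, `recommend_cuboids({0b111: 1000, 0b110: 900, 0b011: 200, 0b001: 50, 0b100: 10}, base=0b111, max_cuboids=3)` keeps the base cuboid and adds the two cuboids with the highest saving per row stored. A near-base cuboid such as `0b110` is skipped because it saves little relative to its size.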