[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?
[ https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442154#comment-16442154 ]

liyang commented on KYLIN-3342:
-------------------------------

I see. Then this Jira is invalid. Closing it. Thanks [~yaho]!

> Cubing level calculation inconsistent?
> --------------------------------------
>
>                 Key: KYLIN-3342
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3342
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: liyang
>            Priority: Major
>         Attachments: KYLIN-3342.patch
>
>
> Got the exception below during cube build.
> {{ Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 7}}
> {{ at java.util.ArrayList.rangeCheck(ArrayList.java:635)}}
> {{ at java.util.ArrayList.get(ArrayList.java:411)}}
> {{ at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(SourceFile:280)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.estimateRDDPartitionNum(SourceFile:219)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SourceFile:199)}}
> {{ at org.apache.kylin.common.util.AbstractApplication.execute(SourceFile:37)}}
> {{ ... 6 more}}
>
> Found two ways of calculating the level of cuboids:
> * via CuboidScheduler.getBuildLevel()
> * via CuboidUtil.getLongestDepth(...)
> We should settle on one approach.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?
[ https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442078#comment-16442078 ]

Zhong Yanghong commented on KYLIN-3342:
---------------------------------------

For Spark cubing, the statistics for the segment have already been calculated before {{totalLevels}} is obtained. Therefore, it's OK to get it via {{CuboidScheduler.getBuildLevel()}}.

However, for the first build with layered cubing, we don't know the statistics when the {{CubingJob}} is created. Therefore, it's better to invoke {{CuboidUtil.getLongestDepth(...)}} to estimate a minimum {{totalLevels}} and so reduce the number of layers for layered cubing. After the statistics are calculated, the total depth may also change for {{TreeCuboidScheduler}}. Then, in {{CuboidJob}}, there's a check for whether to skip.
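The two cases described in this comment can be sketched roughly as follows. This is only an illustration of the reasoning, not Kylin's code: the class {{LevelSelectionSketch}}, its methods, and the rule "skip a planned level that is deeper than the level known after statistics" are assumptions made for the example, and the two integer inputs merely stand in for the results of {{CuboidUtil.getLongestDepth(...)}} and {{CuboidScheduler.getBuildLevel()}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are Kylin APIs.
final class LevelSelectionSketch {

    // Pick the level source depending on whether segment statistics exist yet.
    // depthEstimate stands in for a value computed without statistics;
    // buildLevelFromStats stands in for a value computed after statistics.
    static int chooseTotalLevels(boolean statsAvailable, int depthEstimate, int buildLevelFromStats) {
        return statsAvailable ? buildLevelFromStats : depthEstimate;
    }

    // Assumed skip rule: a layer planned up front is skipped when it is deeper
    // than the level known once statistics have been calculated.
    static boolean shouldSkip(int plannedLevel, int levelAfterStats) {
        return plannedLevel > levelAfterStats;
    }

    // Levels 0..plannedTotalLevels (inclusive, level 0 being the base layer)
    // that survive the skip check.
    static List<Integer> levelsToRun(int plannedTotalLevels, int levelAfterStats) {
        List<Integer> run = new ArrayList<>();
        for (int level = 0; level <= plannedTotalLevels; level++) {
            if (!shouldSkip(level, levelAfterStats)) {
                run.add(level);
            }
        }
        return run;
    }

    public static void main(String[] args) {
        // Layered cubing, first build: no statistics yet, so use the estimate.
        int planned = chooseTotalLevels(false, 5, 3);
        // Once statistics exist, deeper planned layers are skipped.
        System.out.println(levelsToRun(planned, 3));
    }
}
```

The point of the sketch is that the two calculations serve different moments in the job lifecycle, so they do not have to agree as long as a later check reconciles the planned layers with the actual depth.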
[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?
[ https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442066#comment-16442066 ]

Zhong Yanghong commented on KYLIN-3342:
---------------------------------------

Hi [~liyang.g...@gmail.com], Spark cubing and layered cubing are different cases when it comes to getting {{totalLevels}}. Previously there was a bug in Spark cubing, which has been fixed by https://github.com/apache/kylin/commit/24042e2209d85b0c8de98a86d9a573aff182d9c9. Let me explain the details.
[jira] [Created] (KYLIN-3342) Cubing level calculation inconsistent?
liyang created KYLIN-3342:
-----------------------------

             Summary: Cubing level calculation inconsistent?
                 Key: KYLIN-3342
                 URL: https://issues.apache.org/jira/browse/KYLIN-3342
             Project: Kylin
          Issue Type: Bug
            Reporter: liyang


Got the exception below during cube build.

{{ Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 7}}
{{ at java.util.ArrayList.rangeCheck(ArrayList.java:635)}}
{{ at java.util.ArrayList.get(ArrayList.java:411)}}
{{ at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(SourceFile:280)}}
{{ at org.apache.kylin.engine.spark.SparkCubingByLayer.estimateRDDPartitionNum(SourceFile:219)}}
{{ at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SourceFile:199)}}
{{ at org.apache.kylin.common.util.AbstractApplication.execute(SourceFile:37)}}
{{ ... 6 more}}

Found two ways of calculating the level of cuboids:
* via CuboidScheduler.getBuildLevel()
* via CuboidUtil.getLongestDepth(...)

We should settle on one approach.
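As an illustration of how two disagreeing level counts could produce an exception of this shape, here is a minimal sketch. It does not reproduce Kylin's code or claim to be the confirmed cause: the class {{LayerIndexMismatchDemo}}, its methods, and the made-up sizes are all hypothetical, and the assumption is simply that one calculation sized a per-layer list while the other supplied the loop bound.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; none of these names are Kylin APIs.
final class LayerIndexMismatchDemo {

    // Stand-in for per-layer size statistics: one entry per layer the
    // statistics know about (values are made up for the example).
    static List<Double> layerSizesMb(int layerCount) {
        List<Double> sizes = new ArrayList<>();
        for (int i = 0; i < layerCount; i++) {
            sizes.add(10.0 * (i + 1));
        }
        return sizes;
    }

    // Sums layers 0..totalLevels inclusive. If totalLevels comes from a
    // different calculation than the one that sized the list, get() can
    // run past the end of the list.
    static double estimateTotal(List<Double> sizes, int totalLevels) {
        double total = 0;
        for (int level = 0; level <= totalLevels; level++) {
            total += sizes.get(level);
        }
        return total;
    }

    // Reports whether the estimate fails with an index error.
    static boolean failsWithIndexError(List<Double> sizes, int totalLevels) {
        try {
            estimateTotal(sizes, totalLevels);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Double> sizes = layerSizesMb(7); // layers 0..6
        System.out.println(failsWithIndexError(sizes, 6)); // bounds agree
        System.out.println(failsWithIndexError(sizes, 7)); // bound is one too deep
    }
}
```

With seven entries in the list, a loop bound of 7 asks for an eighth element, which matches the index and size pair in the trace above.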