[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?

2018-04-18 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442154#comment-16442154
 ] 

liyang commented on KYLIN-3342:
---

I see. Then this Jira is invalid. Closing it.

Thanks [~yaho]!

> Cubing level calculation inconsistent?
> --
>
> Key: KYLIN-3342
> URL: https://issues.apache.org/jira/browse/KYLIN-3342
> Project: Kylin
> Issue Type: Bug
> Reporter: liyang
> Priority: Major
> Attachments: KYLIN-3342.patch
>
>
> Got the exception below during cube build.
> {{ Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 7}}
> {{ at java.util.ArrayList.rangeCheck(ArrayList.java:635)}}
> {{ at java.util.ArrayList.get(ArrayList.java:411)}}
> {{ at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(SourceFile:280)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.estimateRDDPartitionNum(SourceFile:219)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SourceFile:199)}}
> {{ at org.apache.kylin.common.util.AbstractApplication.execute(SourceFile:37)}}
> {{ ... 6 more}}
>
> Found two ways of calculating the level of cuboids:
>  * via CuboidScheduler.getBuildLevel()
>  * via CuboidUtil.getLongestDepth(...)
> We should settle on one approach.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?

2018-04-18 Thread Zhong Yanghong (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442078#comment-16442078
 ] 

Zhong Yanghong commented on KYLIN-3342:
---

For Spark cubing, the statistics for the segment have already been calculated 
before {{totalLevels}} is obtained. Therefore, it's OK to get it via 
{{CuboidScheduler.getBuildLevel()}}. However, for the first build with layered 
cubing, the statistics are not yet known when the {{CubingJob}} is created. 
Therefore, it's better to invoke {{CuboidUtil.getLongestDepth(...)}} to estimate a 
minimum {{totalLevels}} and reduce the number of layers for layered cubing. After 
the statistics are calculated, the total depth may also change for 
{{TreeCuboidScheduler}}. That is why {{CuboidJob}} has a check for whether to 
skip.
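
The two-phase flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Kylin code: the class {{LayerSkipSketch}}, the method {{levelsToRun}}, and both parameters are hypothetical names, and the exact skip condition used in {{CuboidJob}} is not shown in this thread. The sketch assumes that steps are planned up front for levels 0 through an estimated level count, and that each step is skipped once its level exceeds the build level known after statistics are available.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: not Kylin code. All names here are hypothetical.
final class LayerSkipSketch {

    // plannedLevels:    level count estimated when the job is created,
    //                   before any statistics exist.
    // actualBuildLevel: level count known once statistics are calculated.
    // Returns the levels whose steps would actually run.
    static List<Integer> levelsToRun(int plannedLevels, int actualBuildLevel) {
        List<Integer> run = new ArrayList<>();
        for (int level = 0; level <= plannedLevels; level++) {
            if (level > actualBuildLevel) {
                continue; // assumed skip check: nothing to build at this level
            }
            run.add(level);
        }
        return run;
    }

    public static void main(String[] args) {
        // Five levels were planned, but only levels 0..3 turn out to be needed.
        System.out.println(levelsToRun(5, 3));
    }
}
```

The point of the sketch is that the planned level count and the actual one can legitimately differ, as long as the later step tolerates the difference.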



[jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?

2018-04-18 Thread Zhong Yanghong (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442066#comment-16442066
 ] 

Zhong Yanghong commented on KYLIN-3342:
---

Hi [~liyang.g...@gmail.com], Spark cubing and layered cubing are different cases 
when it comes to getting the totalLevels. Previously, there was a bug in Spark 
cubing, which has been fixed by 
https://github.com/apache/kylin/commit/24042e2209d85b0c8de98a86d9a573aff182d9c9.
Let me explain the details.



[jira] [Created] (KYLIN-3342) Cubing level calculation inconsistent?

2018-04-18 Thread liyang (JIRA)
liyang created KYLIN-3342:
-

 Summary: Cubing level calculation inconsistent?
 Key: KYLIN-3342
 URL: https://issues.apache.org/jira/browse/KYLIN-3342
 Project: Kylin
  Issue Type: Bug
Reporter: liyang


Got the exception below during cube build.

{{ Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 7}}
{{ at java.util.ArrayList.rangeCheck(ArrayList.java:635)}}
{{ at java.util.ArrayList.get(ArrayList.java:411)}}
{{ at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(SourceFile:280)}}
{{ at org.apache.kylin.engine.spark.SparkCubingByLayer.estimateRDDPartitionNum(SourceFile:219)}}
{{ at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SourceFile:199)}}
{{ at org.apache.kylin.common.util.AbstractApplication.execute(SourceFile:37)}}
{{ ... 6 more}}

Found two ways of calculating the level of cuboids:
 * via CuboidScheduler.getBuildLevel()
 * via CuboidUtil.getLongestDepth(...)

We should settle on one approach.
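
To make the failure mode concrete, here is a minimal sketch of how a mismatch between two level counts could produce an out-of-bounds access like the one above. It is an assumption about the mechanism, not the actual Kylin code path: {{LevelMismatchSketch}}, {{layerSizes}}, and {{firstFailingLevel}} are hypothetical names, and the size numbers are made up. The sketch assumes one statistics entry per layer and a caller that walks levels 0 through a separately computed level count.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: not Kylin code. All names here are hypothetical.
final class LevelMismatchSketch {

    // Stand-in for per-layer statistics: one size estimate per layer.
    static List<Double> layerSizes(int layersWithStats) {
        List<Double> sizes = new ArrayList<>();
        for (int i = 0; i < layersWithStats; i++) {
            sizes.add(100.0 / (i + 1)); // made-up numbers
        }
        return sizes;
    }

    // Walks levels 0..totalLevels (inclusive) and returns the first level
    // that has no statistics entry, or -1 if every level has one.
    static int firstFailingLevel(int layersWithStats, int totalLevels) {
        List<Double> sizes = layerSizes(layersWithStats);
        for (int level = 0; level <= totalLevels; level++) {
            try {
                sizes.get(level);
            } catch (IndexOutOfBoundsException e) {
                return level;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Seven layers of statistics, but the caller believes there are levels 0..7.
        System.out.println(firstFailingLevel(7, 7));
        // Same statistics with a matching level count.
        System.out.println(firstFailingLevel(7, 6));
    }
}
```

With seven statistics entries and a level count of 7, the lookup fails at index 7 on a list of size 7, which matches the shape of the reported exception.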

 


