[ 
https://issues.apache.org/jira/browse/KYLIN-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724085#comment-16724085
 ] 

ASF GitHub Bot commented on KYLIN-3699:
---------------------------------------

shaofengshi closed pull request #404: KYLIN-3699 Spark cubing failed when build with empty data
URL: https://github.com/apache/kylin/pull/404
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java b/core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java
index 5d09ea7347..4afce58a23 100644
--- a/core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java
+++ b/core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java
@@ -256,8 +256,8 @@ public boolean accept(Path path) {
 
         if (fileStatuses != null && fileStatuses.length > 0) {
             return isSequenceFile(conf, fileStatuses[0].getPath());
+        } else {
+            return true;
         }
-
-        return false;
     }
 }
diff --git a/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/CubeStatsReader.java b/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/CubeStatsReader.java
index be5bdbf44b..58f0e66419 100644
--- a/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/CubeStatsReader.java
+++ b/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/CubeStatsReader.java
@@ -299,7 +299,7 @@ public double estimateLayerSize(int level) {
         Map<Long, Double> cuboidSizeMap = getCuboidSizeMap();
         double ret = 0;
         for (Long cuboidId : layeredCuboids.get(level)) {
-            ret += cuboidSizeMap.get(cuboidId);
+            ret += cuboidSizeMap.get(cuboidId) == null ? 0.0 : cuboidSizeMap.get(cuboidId);
         }
 
         logger.info("Estimating size for layer {}, all cuboids are {}, total size is {}", level,
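
For readers following the CubeStatsReader change above, here is a minimal, self-contained sketch of the failure mode and the guard. It is an illustration only, not Kylin code: the class name LayerSizeSketch, the method signature, and the sample cuboid IDs are hypothetical. It assumes the NullPointerException came from auto-unboxing the null that Map.get returns for a cuboid with no statistics, which is consistent with the stack trace in the issue but is not stated in the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the layer-size estimate; not the Kylin class.
class LayerSizeSketch {

    // Sums the estimated sizes of the given cuboids. A cuboid with no
    // entry in the map (or a null value) contributes 0.0 instead of
    // triggering a NullPointerException on unboxing.
    static double estimateLayerSize(Map<Long, Double> cuboidSizeMap, List<Long> cuboidIds) {
        double ret = 0;
        for (Long cuboidId : cuboidIds) {
            Double size = cuboidSizeMap.get(cuboidId); // single lookup
            ret += (size == null) ? 0.0 : size;        // unbox only when non-null
        }
        return ret;
    }

    public static void main(String[] args) {
        Map<Long, Double> sizes = new HashMap<>();
        sizes.put(7L, 1.5);
        // Cuboid 3 has no statistics, as could happen when the build input is empty.
        List<Long> layer = Arrays.asList(7L, 3L);
        System.out.println(estimateLayerSize(sizes, layer)); // prints 1.5
    }
}
```

The sketch reads the map once per cuboid and checks the local variable, whereas the patch calls get twice; the two behave the same for a map that is not modified concurrently.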


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> SparkCubingByLayer. Root cause: null 
> -------------------------------------
>
>                 Key: KYLIN-3699
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3699
>             Project: Kylin
>          Issue Type: Bug
>          Components: Spark Engine
>    Affects Versions: v2.5.0
>         Environment: hdp.version: 2.5.0.0-1245
> kylin: 2.5.0 (apache-kylin-2.5.0-bin-hbase1x)
>            Reporter: 风语者
>            Assignee: Chao Long
>            Priority: Major
>             Fix For: v2.6.0
>
>         Attachments: Error.png, mapreduce step8.png, 异常信息.png, 异常步骤.png, 构建结果.png
>
>
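> Hello,
>       I have recently been using Kylin. When building a cube with the Spark engine, I frequently hit an exception that I do not know how to resolve. The exception is as follows: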
> Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: null
>       at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>       at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:744)
>       at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
>       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.NullPointerException
>       at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(CubeStatsReader.java:297)
>       at org.apache.kylin.engine.spark.SparkUtil.estimateLayerPartitionNum(SparkUtil.java:108)
>       at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:182)
>       at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
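>         I found that this exception is triggered when joining the fact table with the lookup table yields 0 records. The build fails at step 8, "Step Name: Build Cube with Spark". The Spark engine is the Spark 2.1.2 bundled with Kylin. Cluster environment: hdp.version: 2.5.0.0-1245
>         With the MapReduce engine the cube builds normally: even with no data, the whole flow completes.
>         I do not know how to resolve this problem.
>         I hope to get a reply.
>         Thank you very much.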



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
