Hi Kang-Sen,

It looks like there is a "\N" in your source data, in a column of double type. "\N" is Hive's default textual marker for NULL, and it cannot be parsed as a double.
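For illustration, here is a minimal Java sketch of why the build fails: Double.parseDouble throws NumberFormatException when it meets Hive's null marker, so any parsing of that column has to treat "\N" as NULL first. (The helper name parseOrNull is hypothetical, not a Kylin API.)

```java
public class NullSafeDouble {

    // Hypothetical helper: treat Hive's textual null marker "\N"
    // (and empty strings) as SQL NULL instead of parsing it as a double.
    static Double parseOrNull(String raw) {
        if (raw == null || raw.isEmpty() || "\\N".equals(raw)) {
            return null;
        }
        // Plain Double.parseDouble("\N") would throw
        // java.lang.NumberFormatException, as in the stack trace below.
        return Double.parseDouble(raw);
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("3.14")); // prints 3.14
        System.out.println(parseOrNull("\\N"));  // prints null
    }
}
```

In practice the fix is usually on the data side: clean the "\N" values out of the double column (or store real NULLs) before building the cube.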


--


Best regards,

 

Ni Chunen / George


At 2019-04-08 23:10:05, "Lu, Kang-Sen" <[email protected]> wrote:


I am running kylin 2.5.1.

 

When I am building a cube with spark engine, I got the following error at “#4 
Step Name: Extract Fact Table Distinct Columns”.

 

The log shows the following exception:

 

2019-04-08 12:59:10,375 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 0.0 (TID 0, hadoop9, executor 1): java.lang.NumberFormatException: For input string: "\N"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
        at java.lang.Double.parseDouble(Double.java:538)
        at org.apache.kylin.engine.mr.steps.SelfDefineSortableKey.init(SelfDefineSortableKey.java:57)
        at org.apache.kylin.engine.mr.steps.SelfDefineSortableKey.init(SelfDefineSortableKey.java:66)
        at org.apache.kylin.engine.spark.SparkFactDistinct$FlatOutputFucntion.addFieldValue(SparkFactDistinct.java:444)
        at org.apache.kylin.engine.spark.SparkFactDistinct$FlatOutputFucntion.call(SparkFactDistinct.java:315)
        at org.apache.kylin.engine.spark.SparkFactDistinct$FlatOutputFucntion.call(SparkFactDistinct.java:226)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
        at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:797)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:797)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

 

Has anybody seen this same problem?

 

Thanks.

 

Kang-sen




Notice: This e-mail together with any attachments may contain information of 
Ribbon Communications Inc. that is confidential and/or proprietary for the sole 
use of the intended recipient. Any review, disclosure, reliance or distribution 
by others or forwarding without express permission is strictly prohibited. If 
you are not the intended recipient, please notify the sender immediately and 
then delete all copies, including any attachments.