hit-lacus commented on pull request #1778:
URL: https://github.com/apache/kylin/pull/1778#issuecomment-1000214302


   Found an exception: the build fails with `org.apache.spark.SparkException: Task not serializable`, because a non-serializable `org.apache.kylin.job.common.PatternedLogger` (the `stepLogger` field of `CreateSparkHiveDictStep`) is captured into the Spark closure.
   
   ```text
   2021-12-23 17:17:43,212 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in 
memory on cdh-worker-2:41969 (size: 28.6 KB, free: 912.2 MB)
   2021-12-23 17:17:43,212 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO spark.SparkContext: Created broadcast 7 from
   2021-12-23 17:17:43,371 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO mapred.FileInputFormat: Total input paths to process : 4
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 ERROR hive.CreateSparkHiveDictStep:
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
org.apache.spark.SparkException: Task not serializable
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
   2021-12-23 17:17:43,430 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2100)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.rdd.RDD.collect(RDD.scala:989)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:299)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3389)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2800)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2799)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.sql.Dataset.collectAsList(Dataset.scala:2799)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.kylin.source.hive.CreateSparkHiveDictStep.getPartitionDataCountMap(CreateSparkHiveDictStep.java:262)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.kylin.source.hive.CreateSparkHiveDictStep.createSparkHiveDict(CreateSparkHiveDictStep.java:183)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.kylin.source.hive.CreateSparkHiveDictStep.execute(CreateSparkHiveDictStep.java:110)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
   2021-12-23 17:17:43,431 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at java.lang.reflect.Method.invoke(Method.java:498)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
Caused by: java.io.NotSerializableException: 
org.apache.kylin.job.common.PatternedLogger
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
Serialization stack:
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object not serializable (class: org.apache.kylin.job.common.PatternedLogger, 
value: org.apache.kylin.job.common.PatternedLogger@43905ade)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.kylin.source.hive.CreateSparkHiveDictStep, name: 
stepLogger, type: class org.apache.kylin.job.common.PatternedLogger)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.kylin.source.hive.CreateSparkHiveDictStep, 
org.apache.kylin.source.hive.CreateSparkHiveDictStep@6f67291f)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.kylin.source.hive.CreateSparkHiveDictStep$2, name: 
this$0, type: class org.apache.kylin.source.hive.CreateSparkHiveDictStep)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.kylin.source.hive.CreateSparkHiveDictStep$2, 
org.apache.kylin.source.hive.CreateSparkHiveDictStep$2@4f2ab774)
   2021-12-23 17:17:43,432 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.sql.Dataset$$anonfun$43, name: f$5, type: 
interface org.apache.spark.api.java.function.MapPartitionsFunction)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.sql.Dataset$$anonfun$43, <function1>)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.sql.execution.MapPartitionsExec, name: func, 
type: interface scala.Function1)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.sql.execution.MapPartitionsExec, MapPartitions 
<function1>, obj#50: java.lang.String
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
+- DeserializeToObject createexternalrow(dict_key#40.toString, 
StructField(dict_key,StringType,true)), obj#49: org.apache.spark.sql.Row
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
  +- InMemoryTableScan [dict_key#40]
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
        +- InMemoryRelation [dict_key#40], StorageLevel(disk, memory, 
deserialized, 1 replicas)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
              +- Scan hive 
yaqian.kylin_intermediate_kylin_sales_cube_test_dic_spark_898e2825_6401_9864_424d_190fd4868a6f_distinct_value
 [dict_key#40], HiveTableRelation 
`yaqian`.`kylin_intermediate_kylin_sales_cube_test_dic_spark_898e2825_6401_9864_424d_190fd4868a6f_distinct_value`,
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [dict_key#40], 
[dict_column#41], [isnotnull(dict_column#41), (dict_column#41 = 
KYLIN_SALES_OPS_USER_ID)]
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : )
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5, 
name: $outer, type: class org.apache.spark.sql.execution.MapPartitionsExec)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5, 
<function1>)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1, 
name: f$24, type: interface scala.Function1)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1, 
<function0>)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24, 
name: $outer, type: class 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class 
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24, 
<function3>)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface 
scala.Function3)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[36] at 
collectAsList at CreateSparkHiveDictStep.java:262)
   2021-12-23 17:17:43,433 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class 
org.apache.spark.rdd.RDD)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.OneToOneDependency, 
org.apache.spark.OneToOneDependency@74a680d3)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- writeObject data (class: scala.collection.immutable.List$SerializationProxy)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class scala.collection.immutable.List$SerializationProxy, 
scala.collection.immutable.List$SerializationProxy@2bd24c09)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class scala.collection.immutable.$colon$colon, 
List(org.apache.spark.OneToOneDependency@74a680d3))
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.RDD, name: 
org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[37] at 
collectAsList at CreateSparkHiveDictStep.java:262)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class 
org.apache.spark.rdd.RDD)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.OneToOneDependency, 
org.apache.spark.OneToOneDependency@7ec9d826)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- writeObject data (class: scala.collection.immutable.List$SerializationProxy)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class scala.collection.immutable.List$SerializationProxy, 
scala.collection.immutable.List$SerializationProxy@420849f6)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class scala.collection.immutable.$colon$colon, 
List(org.apache.spark.OneToOneDependency@7ec9d826))
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.RDD, name: 
org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[38] at 
collectAsList at CreateSparkHiveDictStep.java:262)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.RDD$$anonfun$collect$1, name: $outer, 
type: class org.apache.spark.rdd.RDD)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.RDD$$anonfun$collect$1, <function0>)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- field (class: org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$15, name: 
$outer, type: class org.apache.spark.rdd.RDD$$anonfun$collect$1)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
- object (class org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$15, 
<function1>)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
   2021-12-23 17:17:43,434 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
   2021-12-23 17:17:43,435 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
   2021-12-23 17:17:43,435 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
at 
org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
   2021-12-23 17:17:43,435 INFO  [pool-20-thread-1] spark.SparkExecutable:41 :  
... 37 more
   [... the same "Task not serializable" exception and serialization stack are logged a second time at 17:17:43,435; duplicate trace omitted ...]
   2021-12-23 17:17:43,536 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO zookeeper.ZookeeperDistributedLock: 1-12139@cdh-worker-2 
purged all locks under /mr_dict_lock/kylin_sales_cube_test_dic_SPARK
   2021-12-23 17:17:43,536 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO hive.CreateSparkHiveDictStep: zookeeper unlock path 
:/mr_dict_lock/kylin_sales_cube_test_dic_SPARK
   2021-12-23 17:17:43,536 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO hive.MRHiveDictUtil: 
27ee316a-470b-4633-4a75-8de88142c05f unlock full lock path 
:/mr_dict_lock/kylin_sales_cube_test_dic_SPARK success
   2021-12-23 17:17:43,550 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO zookeeper.ZookeeperDistributedLock: 1-12139@cdh-worker-2 
purged all locks under /mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK
   2021-12-23 17:17:43,550 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO hive.CreateSparkHiveDictStep: zookeeper unlock path 
:/mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK
   2021-12-23 17:17:43,550 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO hive.MRHiveDictUtil: 
27ee316a-470b-4633-4a75-8de88142c05f unlock full lock path 
:/mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK success
   2021-12-23 17:17:43,551 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO util.ZKUtil: Going to remove 1 cached curator clients
   2021-12-23 17:17:43,554 INFO  [pool-20-thread-1] spark.SparkExecutable:41 : 
21/12/23 17:17:43 INFO spark.SparkContext: Invoking stop() from shutdown hook
   ```
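The serialization stack above shows the failure chain: the anonymous `MapPartitionsFunction` (`CreateSparkHiveDictStep$2`) holds a `this$0` reference to the enclosing step, whose `stepLogger` field is a `PatternedLogger` that does not implement `Serializable`, so Spark's `ClosureCleaner` rejects the task. The usual fix is to mark such a field `transient`, or to avoid capturing `this` by copying the needed values into local variables before constructing the function. Below is a minimal, Spark-free sketch of the failure mode using plain Java serialization; `StepLogger`, `BrokenStep`, and `FixedStep` are hypothetical stand-ins, not Kylin classes.

```java
import java.io.*;

// Stand-in for PatternedLogger: any class that is NOT Serializable.
class StepLogger { }

// Mirrors the failing pattern: a Serializable task object drags in a
// non-serializable field, so Java serialization fails.
class BrokenStep implements Serializable {
    StepLogger stepLogger = new StepLogger();
}

// Marking the field transient excludes it from serialization, which is
// the common fix for this kind of "Task not serializable" error.
class FixedStep implements Serializable {
    transient StepLogger stepLogger = new StepLogger();
}

public class SerializationSketch {
    // Returns true if the object survives Java serialization, which is
    // effectively the check Spark's ClosureCleaner performs on closures.
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("broken step serializes: " + serializes(new BrokenStep())); // false
        System.out.println("fixed step serializes:  " + serializes(new FixedStep()));  // true
    }
}
```

Applied to `CreateSparkHiveDictStep` itself, declaring `stepLogger` as `transient` (and re-initializing it after deserialization if it were ever needed on executors) would follow the same pattern.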
   

