hit-lacus commented on pull request #1778:
URL: https://github.com/apache/kylin/pull/1778#issuecomment-1000214302
Found the following exception:
```text
2021-12-23 17:17:43,212 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in
memory on cdh-worker-2:41969 (size: 28.6 KB, free: 912.2 MB)
2021-12-23 17:17:43,212 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO spark.SparkContext: Created broadcast 7 from
2021-12-23 17:17:43,371 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO mapred.FileInputFormat: Total input paths to process : 4
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 ERROR hive.CreateSparkHiveDictStep:
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
org.apache.spark.SparkException: Task not serializable
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
2021-12-23 17:17:43,430 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2100)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.rdd.RDD.collect(RDD.scala:989)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:299)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3389)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2800)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2799)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.sql.Dataset.collectAsList(Dataset.scala:2799)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.kylin.source.hive.CreateSparkHiveDictStep.getPartitionDataCountMap(CreateSparkHiveDictStep.java:262)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.kylin.source.hive.CreateSparkHiveDictStep.createSparkHiveDict(CreateSparkHiveDictStep.java:183)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.kylin.source.hive.CreateSparkHiveDictStep.execute(CreateSparkHiveDictStep.java:110)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
2021-12-23 17:17:43,431 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at java.lang.reflect.Method.invoke(Method.java:498)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
Caused by: java.io.NotSerializableException:
org.apache.kylin.job.common.PatternedLogger
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
Serialization stack:
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object not serializable (class: org.apache.kylin.job.common.PatternedLogger,
value: org.apache.kylin.job.common.PatternedLogger@43905ade)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.kylin.source.hive.CreateSparkHiveDictStep, name:
stepLogger, type: class org.apache.kylin.job.common.PatternedLogger)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.kylin.source.hive.CreateSparkHiveDictStep,
org.apache.kylin.source.hive.CreateSparkHiveDictStep@6f67291f)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.kylin.source.hive.CreateSparkHiveDictStep$2, name:
this$0, type: class org.apache.kylin.source.hive.CreateSparkHiveDictStep)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.kylin.source.hive.CreateSparkHiveDictStep$2,
org.apache.kylin.source.hive.CreateSparkHiveDictStep$2@4f2ab774)
2021-12-23 17:17:43,432 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.sql.Dataset$$anonfun$43, name: f$5, type:
interface org.apache.spark.api.java.function.MapPartitionsFunction)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.sql.Dataset$$anonfun$43, <function1>)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.sql.execution.MapPartitionsExec, name: func,
type: interface scala.Function1)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.sql.execution.MapPartitionsExec, MapPartitions
<function1>, obj#50: java.lang.String
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
+- DeserializeToObject createexternalrow(dict_key#40.toString,
StructField(dict_key,StringType,true)), obj#49: org.apache.spark.sql.Row
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
+- InMemoryTableScan [dict_key#40]
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
+- InMemoryRelation [dict_key#40], StorageLevel(disk, memory,
deserialized, 1 replicas)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
+- Scan hive
yaqian.kylin_intermediate_kylin_sales_cube_test_dic_spark_898e2825_6401_9864_424d_190fd4868a6f_distinct_value
[dict_key#40], HiveTableRelation
`yaqian`.`kylin_intermediate_kylin_sales_cube_test_dic_spark_898e2825_6401_9864_424d_190fd4868a6f_distinct_value`,
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [dict_key#40],
[dict_column#41], [isnotnull(dict_column#41), (dict_column#41 =
KYLIN_SALES_OPS_USER_ID)]
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 : )
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5,
name: $outer, type: class org.apache.spark.sql.execution.MapPartitionsExec)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5,
<function1>)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1,
name: f$24, type: interface scala.Function1)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1,
<function0>)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class:
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24,
name: $outer, type: class
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24,
<function3>)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface
scala.Function3)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[36] at
collectAsList at CreateSparkHiveDictStep.java:262)
2021-12-23 17:17:43,433 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class
org.apache.spark.rdd.RDD)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.OneToOneDependency,
org.apache.spark.OneToOneDependency@74a680d3)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- writeObject data (class: scala.collection.immutable.List$SerializationProxy)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class scala.collection.immutable.List$SerializationProxy,
scala.collection.immutable.List$SerializationProxy@2bd24c09)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class scala.collection.immutable.$colon$colon,
List(org.apache.spark.OneToOneDependency@74a680d3))
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.RDD, name:
org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[37] at
collectAsList at CreateSparkHiveDictStep.java:262)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class
org.apache.spark.rdd.RDD)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.OneToOneDependency,
org.apache.spark.OneToOneDependency@7ec9d826)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- writeObject data (class: scala.collection.immutable.List$SerializationProxy)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class scala.collection.immutable.List$SerializationProxy,
scala.collection.immutable.List$SerializationProxy@420849f6)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class scala.collection.immutable.$colon$colon,
List(org.apache.spark.OneToOneDependency@7ec9d826))
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.RDD, name:
org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[38] at
collectAsList at CreateSparkHiveDictStep.java:262)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.RDD$$anonfun$collect$1, name: $outer,
type: class org.apache.spark.rdd.RDD)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.RDD$$anonfun$collect$1, <function0>)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- field (class: org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$15, name:
$outer, type: class org.apache.spark.rdd.RDD$$anonfun$collect$1)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
- object (class org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$15,
<function1>)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
2021-12-23 17:17:43,434 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
2021-12-23 17:17:43,435 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
2021-12-23 17:17:43,435 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
at
org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
2021-12-23 17:17:43,435 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
... 37 more
[the same "Task not serializable" exception and serialization stack are logged a second time at 17:17:43,435; duplicate trace omitted]
2021-12-23 17:17:43,536 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO zookeeper.ZookeeperDistributedLock: 1-12139@cdh-worker-2
purged all locks under /mr_dict_lock/kylin_sales_cube_test_dic_SPARK
2021-12-23 17:17:43,536 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO hive.CreateSparkHiveDictStep: zookeeper unlock path
:/mr_dict_lock/kylin_sales_cube_test_dic_SPARK
2021-12-23 17:17:43,536 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO hive.MRHiveDictUtil:
27ee316a-470b-4633-4a75-8de88142c05f unlock full lock path
:/mr_dict_lock/kylin_sales_cube_test_dic_SPARK success
2021-12-23 17:17:43,550 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO zookeeper.ZookeeperDistributedLock: 1-12139@cdh-worker-2
purged all locks under /mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK
2021-12-23 17:17:43,550 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO hive.CreateSparkHiveDictStep: zookeeper unlock path
:/mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK
2021-12-23 17:17:43,550 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO hive.MRHiveDictUtil:
27ee316a-470b-4633-4a75-8de88142c05f unlock full lock path
:/mr_dict_ephemeral_lock/kylin_sales_cube_test_dic_SPARK success
2021-12-23 17:17:43,551 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO util.ZKUtil: Going to remove 1 cached curator clients
2021-12-23 17:17:43,554 INFO [pool-20-thread-1] spark.SparkExecutable:41 :
21/12/23 17:17:43 INFO spark.SparkContext: Invoking stop() from shutdown hook
```
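For context, the root cause in the trace is that `CreateSparkHiveDictStep` holds a `PatternedLogger` in its `stepLogger` field, and the anonymous `MapPartitionsFunction` (`CreateSparkHiveDictStep$2`) implicitly captures the enclosing step via `this$0`. When `collectAsList()` triggers Spark's closure cleaner, it tries to serialize the whole step, hits the non-serializable logger, and fails. Below is a minimal, hypothetical sketch of that pattern and one common way around it (a `transient` field plus a closure that never references the outer instance); apart from the names quoted from the trace, the class and field names are illustrative and are not the actual Kylin code.

```java
// Minimal sketch (NOT the Kylin code) of the failure pattern behind
// "Task not serializable ... NotSerializableException: PatternedLogger".
// NonSerializableLogger / StepWithLogger / stepLogger are illustrative names.
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.function.MapPartitionsFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;

class NonSerializableLogger {               // stands in for PatternedLogger
    void log(String msg) { System.out.println(msg); }
}

class StepWithLogger implements Serializable {

    // Problem shape: a field of a non-serializable type. Any closure that
    // captures `this` drags this field into Spark's task serialization.
    // Marking it transient (and re-creating it on the executor if needed)
    // is one common fix; another is to avoid capturing `this` at all.
    private transient NonSerializableLogger stepLogger = new NonSerializableLogger();

    Dataset<String> trimKeys(Dataset<String> dictKeys) {
        // BAD (matches the log): an anonymous inner class implicitly holds
        // this$0, so Spark serializes the whole step, including stepLogger,
        // and fails if that field is neither transient nor serializable:
        //
        //   dictKeys.mapPartitions(new MapPartitionsFunction<String, String>() {
        //       @Override public Iterator<String> call(Iterator<String> rows) { ... }
        //   }, Encoders.STRING());

        // SAFER: a lambda that touches no instance state captures only what
        // it uses, so nothing non-serializable is shipped to executors.
        return dictKeys.mapPartitions(
                (MapPartitionsFunction<String, String>) rows -> {
                    List<String> out = new ArrayList<>();
                    while (rows.hasNext()) {
                        out.add(rows.next().trim());
                    }
                    return out.iterator();
                },
                Encoders.STRING());
    }
}
```

If the logger really must be used inside the task, an alternative is to create it locally within the partition function so each executor builds its own instance instead of deserializing one from the driver.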