Hadoop: 3.1.1
Hive: 3.1.0
HBase: 2.0.2
ZooKeeper: 3.4.6
Spark: 2.3.0
Kylin: apache-kylin-3.1.3-bin-hadoop3.tar.gz
Hello everyone, has anyone run into a similar problem? The cube build fails at the "Convert Cuboid Data to HFile" step with a java.lang.NoSuchMethodError on org.apache.hadoop.hbase.util.FSUtils.setStoragePolicy. Is the HBase version inconsistent with what Kylin expects? Is there a solution that does not require replacing the HBase environment? The full YARN application log and the submitted command follow.
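As I understand it, a NoSuchMethodError of this kind means the FSUtils class actually loaded at runtime does not export the exact descriptor Kylin was compiled against: setStoragePolicy(FileSystem, Path, String) returning void. One way to check which setStoragePolicy overloads the deployed HBase jar really provides (the jar path is taken from the --jars list at the end; adjust it if your layout differs) is javap:

    javap -classpath /usr/hdp/3.1.4.0-315/hbase/lib/hbase-server-2.0.2.3.1.4.0-315.jar \
        org.apache.hadoop.hbase.util.FSUtils | grep setStoragePolicy

If the (FileSystem, Path, String) variant is missing, or returns something other than void, that would confirm a signature mismatch between the HDP HBase build and the HBase that Kylin 3.1.3 was compiled against.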
22/04/21 16:49:06 INFO Client: Application report for application_1650425060534_0058 (state: FINISHED)
22/04/21 16:49:06 INFO Client:
    client token: N/A
    diagnostics: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted.
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
    at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)
Caused by: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
    at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
    at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
    ... 6 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 8, c0, executor 5): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.FSUtils.setStoragePolicy(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;Ljava/lang/String;)V
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3.configureStoragePolicy(HFileOutputFormat3.java:468)
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:287)
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:243)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2039)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2060)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092)
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
    ... 16 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.FSUtils.setStoragePolicy(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;Ljava/lang/String;)V
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3.configureStoragePolicy(HFileOutputFormat3.java:468)
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:287)
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:243)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
    ApplicationMaster host: 192.168.70.97
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1650530850727
    final status: FAILED
    tracking URL: http://c0:8088/proxy/application_1650425060534_0058/
    user: hdp
Exception in thread "main" org.apache.spark.SparkException: Application application_1650425060534_0058 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1269)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1627)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/04/21 16:49:06 INFO ShutdownHookManager: Shutdown hook called
22/04/21 16:49:06 INFO ShutdownHookManager: Deleting directory /tmp/spark-9bcb4b28-3375-4449-a996-cee322018ce5
22/04/21 16:49:06 INFO ShutdownHookManager: Deleting directory /tmp/spark-4916cb1f-205f-406c-b8be-68af58250cd2
The command is:
export HADOOP_CONF_DIR=/usr/hdp/3.1.4.0-315/hadoop/conf &&
/usr/hdp/3.1.4.0-315/spark2/bin/spark-submit
    --class org.apache.kylin.common.util.SparkEntry
    --name "Convert Cuboid Data to HFile"
    --conf spark.executor.instances=40
    --conf spark.yarn.archive=hdfs:///kylin/spark/spark-libs.jar
    --conf spark.yarn.queue=default
    --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history
    --conf spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
    --conf spark.master=yarn
    --conf spark.hadoop.yarn.timeline-service.enabled=false
    --conf spark.executor.memory=4G
    --conf spark.eventLog.enabled=true
    --conf spark.eventLog.dir=hdfs:///kylin/spark-history
    --conf spark.yarn.executor.memoryOverhead=1024
    --conf spark.driver.memory=2G
    --conf spark.submit.deployMode=cluster
    --conf spark.shuffle.service.enabled=true
    --jars /usr/hdp/3.1.4.0-315/hbase/lib/hbase-common-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-mapreduce-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-client-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-protocol-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-hadoop-compat-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/htrace-core-3.2.0-incubating.jar,/data/workspace_tools/kylin/kylin-3.1.3/tomcat/webapps/kylin/WEB-INF/lib/metrics-core-2.2.0.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-hadoop-compat-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-hadoop2-compat-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-server-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-shaded-miscellaneous-2.2.0.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-metrics-api-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-metrics-2.0.2.3.1.4.0-315.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-shaded-protobuf-2.2.0.jar,/usr/hdp/3.1.4.0-315/hbase/lib/hbase-protocol-shaded-2.0.2.3.1.4.0-315.jar,
    /data/workspace_tools/kylin/kylin-3.1.3/lib/kylin-job-3.1.3.jar
    -className org.apache.kylin.storage.hbase.steps.SparkCubeHFile
    -partitions hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/cube_demo/rowkey_stats/part-r-00000_hfile
    -counterOutput hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/cube_demo/counter
    -cubename cube_demo
    -output hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/cube_demo/hfile
    -input hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/cube_demo/cuboid/
    -segmentId f98532ac-3725-cfce-d48e-6c6013c50245
    -metaUrl kylin_metadata@hdfs,path=hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/cube_demo/metadata
    -hbaseConfPath hdfs://testcluster/kylin/kylin_metadata/kylin-f0c095d3-d6d7-fc01-b694-c187dc79c8ea/hbase-conf.xml
    at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:405)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
    ... 6 more
2022-04-21 16:49:10,102 INFO [FetcherRunner 376169693-51] threadpool.DefaultFetcherRunner:117 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 2 already succeed, 1 error, 0 discarded, 0 others
2022-04-21 16:49:20,498 INFO [BadQueryDetector] service.BadQueryDetector:148 : Detect bad query
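One more thing I have not been able to rule out (this is my assumption, not something the log proves): a second, different copy of FSUtils earlier on the executor classpath, for example inside the spark.yarn.archive bundle hdfs:///kylin/spark/spark-libs.jar, could shadow the hbase-server jar passed via --jars. A rough sketch for finding every local jar that bundles the class, so the copies can then be compared with javap:

    # list every jar under the HBase and Kylin lib dirs that contains FSUtils
    for j in /usr/hdp/3.1.4.0-315/hbase/lib/*.jar \
             /data/workspace_tools/kylin/kylin-3.1.3/lib/*.jar; do
        if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hbase/util/FSUtils.class'; then
            echo "$j"
        fi
    done

    # the spark-libs bundle lives on HDFS; pull it down first to inspect it the same way
    hdfs dfs -get hdfs:///kylin/spark/spark-libs.jar /tmp/spark-libs.jar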