Hi there,
I'm trying to build a cube with Spark. I've followed the instructions in this tutorial: Build Cube with Spark <http://kylin.apache.org/docs/tutorial/cube_spark.html>. But the job failed at the step "Convert Cuboid Data to HFile", and the log shows "ClassNotFoundException: org.apache.hadoop.hbase.io.MetricsIOWrapper".
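To rule out an obvious packaging problem on my side, this is roughly how I'm checking whether the missing class is present in any of the HBase jars that Kylin passes to spark-submit (just a shell sanity check, using the paths from my installation):

  # scan the HBase lib jars that end up on the --jars list for the missing class
  for j in /usr/local/hbase-1.1.5/lib/hbase-*.jar; do
    unzip -l "$j" | grep -q 'org/apache/hadoop/hbase/io/MetricsIOWrapper.class' \
      && echo "found in $j"
  done

Here is the full output of the failed "Convert Cuboid Data to HFile" step: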
org.apache.kylin.engine.spark.exception.SparkException: OS command error > exit with return code: 1, error message: 2019-12-12 15:10:46 WARN > SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' > has been deprecated as of Spark 2.3 and may be removed in the future. > Please use the new key 'spark.executor.memoryOverhead' instead. > 2019-12-12 15:10:48 INFO RMProxy:98 - Connecting to ResourceManager at > host/10.16.15.15:9032 > 2019-12-12 15:10:48 INFO Client:54 - Requesting a new application from > cluster with 150 NodeManagers > 2019-12-12 15:10:48 INFO Client:54 - Verifying our application has not > requested more than the maximum memory capability of the cluster (49150 MB > per container) > 2019-12-12 15:10:48 INFO Client:54 - Will allocate AM container, with > 2432 MB memory including 384 MB overhead > 2019-12-12 15:10:48 INFO Client:54 - Setting up container launch context > for our AM > 2019-12-12 15:10:48 INFO Client:54 - Setting up the launch environment > for our AM container > 2019-12-12 15:10:48 INFO Client:54 - Preparing resources for our AM > container > 2019-12-12 15:10:50 INFO Client:54 - Source and destination file systems > are the same. Not copying hdfs://ns1/apps/spark2/spark2.3.2-libs.jar > 2019-12-12 15:10:50 INFO Client:54 - Uploading resource > file:/usr/local/kylin/lib/kylin-job-2.6.4.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/kylin-job-2.6.4.jar > 2019-12-12 15:10:51 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-common-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-common-1.1.5.jar > 2019-12-12 15:10:51 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-server-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-server-1.1.5.jar > 2019-12-12 15:10:51 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-client-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-client-1.1.5.jar > 2019-12-12 15:10:51 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-protocol-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-protocol-1.1.5.jar > 2019-12-12 15:10:51 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-hadoop-compat-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-hadoop-compat-1.1.5.jar > 2019-12-12 15:10:52 INFO Client:54 - Uploading resource > file:/usr/local/hive-1.2.1/lib/hive-jdbc-1.2.1-standalone.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hive-jdbc-1.2.1-standalone.jar > 2019-12-12 15:10:52 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/htrace-core-3.1.0-incubating.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/htrace-core-3.1.0-incubating.jar > 2019-12-12 15:10:52 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/metrics-core-2.2.0.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/metrics-core-2.2.0.jar > 2019-12-12 15:10:52 WARN Client:66 - Same path resource > file:///usr/local/hbase-1.1.5/lib/hbase-hadoop-compat-1.1.5.jar added > multiple times to distributed cache. 
> 2019-12-12 15:10:52 INFO Client:54 - Uploading resource > file:/usr/local/hbase-1.1.5/lib/hbase-hadoop2-compat-1.1.5.jar -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/hbase-hadoop2-compat-1.1.5.jar > 2019-12-12 15:10:53 INFO Client:54 - Uploading resource > file:/tmp/spark-68d94435-5aa0-4560-8e1a-9fc9fbe00409/__spark_conf__6376154711081920315.zip > -> > hdfs://ns1/user/hadoop/.sparkStaging/application_1549925837808_10474661/__spark_conf__.zip > 2019-12-12 15:10:54 INFO SecurityManager:54 - Changing view acls to: > hadoop > 2019-12-12 15:10:54 INFO SecurityManager:54 - Changing modify acls to: > hadoop > 2019-12-12 15:10:54 INFO SecurityManager:54 - Changing view acls groups > to: > 2019-12-12 15:10:54 INFO SecurityManager:54 - Changing modify acls groups > to: > 2019-12-12 15:10:54 INFO SecurityManager:54 - SecurityManager: > authentication disabled; ui acls disabled; users with view permissions: > Set(hadoop); groups with view permissions: Set(); users with modify > permissions: Set(hadoop); groups with modify permissions: Set() > 2019-12-12 15:10:54 INFO Client:54 - Submitting application > application_1549925837808_10474661 to ResourceManager > 2019-12-12 15:10:54 INFO YarnClientImpl:273 - Submitted application > application_1549925837808_10474661 > 2019-12-12 15:10:55 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:10:55 INFO Client:54 - > client token: N/A > diagnostics: [Thu Dec 12 15:10:55 +0800 2019] Application is added to the > scheduler and is not yet activated. (Resource request: <memory:3072, > vCores:1> exceeds current queue or its parents maximum resource allowed). > ApplicationMaster host: N/A > ApplicationMaster RPC port: -1 > queue: root.hadoop > start time: 1576134654497 > final status: UNDEFINED > tracking URL: > http://host.example.com:8981/proxy/application_1549925837808_10474661/ > user: hadoop > 2019-12-12 15:10:56 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:10:57 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:10:58 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:10:59 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:00 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:01 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:02 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:03 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:04 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:05 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:06 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:07 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:08 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:09 INFO Client:54 - Application report for > 
application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:10 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:11 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:11 INFO Client:54 - > client token: N/A > diagnostics: N/A > ApplicationMaster host: 10.16.15.244 > ApplicationMaster RPC port: 0 > queue: root.hadoop > start time: 1576134654497 > final status: UNDEFINED > tracking URL: > http://host.example.com:8981/proxy/application_1549925837808_10474661/ > user: hadoop > 2019-12-12 15:11:12 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:13 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:14 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:15 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:16 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:17 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:18 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:19 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:20 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:21 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:22 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:23 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:24 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:25 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:26 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:27 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:28 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:29 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:30 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:31 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:32 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:32 INFO Client:54 - > client token: N/A > diagnostics: AM container is launched, waiting for AM container to > Register with RM > ApplicationMaster host: N/A > ApplicationMaster RPC port: -1 > queue: root.hadoop > start time: 1576134654497 > final status: UNDEFINED > tracking URL: > http://host.example.com:8981/proxy/application_1549925837808_10474661/ > user: hadoop > 2019-12-12 15:11:33 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: 
ACCEPTED) > 2019-12-12 15:11:34 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:35 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:36 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:37 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:38 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:39 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:40 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:41 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: ACCEPTED) > 2019-12-12 15:11:42 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:42 INFO Client:54 - > client token: N/A > diagnostics: N/A > ApplicationMaster host: 10.16.15.59 > ApplicationMaster RPC port: 0 > queue: root.hadoop > start time: 1576134654497 > final status: UNDEFINED > tracking URL: > http://host.example.com:8981/proxy/application_1549925837808_10474661/ > user: hadoop > 2019-12-12 15:11:43 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:44 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:45 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:46 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:47 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:48 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:49 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:50 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:51 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:52 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:53 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:54 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:55 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:56 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:57 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:58 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:11:59 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:00 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:01 INFO Client:54 - Application report for > 
application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:02 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:03 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:04 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:05 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:06 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:07 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:08 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: RUNNING) > 2019-12-12 15:12:09 INFO Client:54 - Application report for > application_1549925837808_10474661 (state: FINISHED) > 2019-12-12 15:12:09 INFO Client:54 - > client token: N/A > diagnostics: User class threw exception: java.lang.RuntimeException: error > execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: > Job aborted. > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721) > Caused by: org.apache.spark.SparkException: Job aborted. > at > org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) > at > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081) > at > org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831) > at > org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238) > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37) > ... 
6 more > Caused by: org.apache.spark.SparkException: Job aborted due to stage > failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task > 0.3 in stage 1.0 (TID 5, host.dns.example.com, executor 15): > org.apache.spark.SparkException: Task failed while writing rows > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:109) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/io/MetricsIOWrapper > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:247) > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:194) > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:152) > at > org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127) > at > org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415) > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139) > ... 8 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.io.MetricsIOWrapper > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 
16 more > > Driver stacktrace: > at org.apache.spark.scheduler.DAGScheduler.org > $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651) > at > org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639) > at > org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at > org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638) > at > org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) > at > org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) > at scala.Option.foreach(Option.scala:257) > at > org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810) > at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) > at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087) > at > org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78) > ... 16 more > Caused by: org.apache.spark.SparkException: Task failed while writing rows > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:109) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/io/MetricsIOWrapper > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:247) > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:194) > at > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:152) > at > org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130) > at > org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127) > at > org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415) > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139) > ... 
8 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.io.MetricsIOWrapper > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 16 more > > ApplicationMaster host: 10.16.15.59 > ApplicationMaster RPC port: 0 > queue: root.hadoop > start time: 1576134654497 > final status: FAILED > tracking URL: > http://host.example.com:8981/proxy/application_1549925837808_10474661/ > user: hadoop > Exception in thread "main" org.apache.spark.SparkException: Application > application_1549925837808_10474661 finished with failed status > at org.apache.spark.deploy.yarn.Client.run(Client.scala:1165) > at > org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520) > at > org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894) > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198) > at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > 2019-12-12 15:12:09 INFO ShutdownHookManager:54 - Shutdown hook called > 2019-12-12 15:12:09 INFO ShutdownHookManager:54 - Deleting directory > /tmp/spark-68d94435-5aa0-4560-8e1a-9fc9fbe00409 > 2019-12-12 15:12:09 INFO ShutdownHookManager:54 - Deleting directory > /tmp/spark-b48e440b-4d74-4ecf-b500-38fe7cc542d1 > The command is: > export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.2//etc/hadoop && > /usr/local/spark/bin/spark-submit --class > org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=40 > --conf spark.network.timeout=600 --conf spark.yarn.queue=default --conf > spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf > spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec > --conf spark.dynamicAllocation.enabled=true --conf spark.master=yarn > --conf spark.dynamicAllocation.executorIdleTimeout=300 --conf > spark.hadoop.yarn.timeline-service.enabled=false --conf > spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf > spark.eventLog.dir=hdfs:///kylin/spark-history --conf > spark.dynamicAllocation.minExecutors=1 --conf spark.executor.cores=1 > --conf spark.hadoop.mapreduce.output.fileoutputformat.compress=false > --conf spark.yarn.executor.memoryOverhead=1024 --conf > spark.hadoop.dfs.replication=2 --conf > spark.dynamicAllocation.maxExecutors=1000 --conf spark.driver.memory=2G > --conf spark.submit.deployMode=cluster --conf > spark.shuffle.service.enabled=true --jars > /usr/local/hbase-1.1.5/lib/hbase-common-1.1.5.jar,/usr/local/hbase-1.1.5/lib/hbase-server-1.1.5.jar,/usr/local/hbase-1.1.5/lib/hbase-client-1.1.5.jar,/usr/local/hbase-1.1.5/lib/hbase-protocol-1.1.5.jar,/usr/local/hbase-1.1.5/lib/hbase-hadoop-compat-1.1.5.jar,/usr/local/hive-1.2.1/lib/hive-jdbc-1.2.1-standalone.jar,/usr/local/hbase-1.1.5/lib/htrace-core-3.1.0-incubating.jar,/usr/local/hbase-1.1.5/lib/metrics-core-2.2.0.jar,/usr/local/hbase-1.1.5/lib/hbase-hadoop-compat-1.1.5.jar,/usr/local/hbase-1.1.5/lib/hbase-hadoop2-compat-1.1.5.jar, > /usr/local/kylin/lib/kylin-job-2.6.4.jar -className > org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions > hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/test1/rowkey_stats/part-r-00000_hfile > -counterOutput > 
hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/test1/counter > -cubename test1 -output > hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/test1/hfile > -input > hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/test1/cuboid/ > -segmentId 999b74f8-672a-37c4-6757-f86ef7db3d60 -metaUrl > kylin_metadata@hdfs,path=hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/test1/metadata > -hbaseConfPath > hdfs://ns1/data/kylin/kylin_metadata/kylin-d80429c4-2b09-9b3b-2fda-329e0fa64b63/hbase-conf.xml > at > org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:347) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >

This error message is very similar to the one covered in the troubleshooting tips of the tutorial I mentioned above. I tried putting hbase-hadoop-compat-1.1.5.jar and hbase-hadoop2-compat-1.1.5.jar under $KYLIN_HOME/spark/jars, but there is no spark folder under $KYLIN_HOME. This is strange, because the tutorial says Kylin comes with a built-in Spark; maybe the docs are outdated? I created the $KYLIN_HOME/spark/jars folder manually and copied the two jars into it, but after resuming the job the same error happened again. I've also tried putting these two jars under $SPARK_HOME/jars, but it still doesn't work. Nothing helps so far (see the additional check I describe below).

My current setup:
Kylin: apache-kylin-2.6.4-bin-hbase1x.tar.gz
Spark: 2.3.2
HBase: 1.1.5
Hadoop: 2.7.2
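One more thing I noticed in the log above: the Spark client prints "Source and destination file systems are the same. Not copying hdfs://ns1/apps/spark2/spark2.3.2-libs.jar", so I suspect (not sure) that the executors get their classpath from that archive on HDFS (presumably configured via spark.yarn.archive or spark.yarn.jars) rather than from the jars I copied into $SPARK_HOME/jars or $KYLIN_HOME/spark/jars locally. To see which HBase jars that archive actually contains, I'm checking it roughly like this:

  # pull down the archive the YARN client refers to and list its HBase-related entries
  hadoop fs -get hdfs://ns1/apps/spark2/spark2.3.2-libs.jar /tmp/spark2.3.2-libs.jar
  unzip -l /tmp/spark2.3.2-libs.jar | grep -i hbase

Thanks,
Liang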
