Hello all, I used Apache Spark to replace MapReduce in the cube build, following the document at http://kylin.apache.org/docs/tutorial/cube_spark.html . But the build job failed at step 8, "Convert Cuboid Data to HFile".
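
For reference, my Spark engine settings in kylin.properties follow that tutorial; roughly like this (these are the tutorial's sample values, not necessarily my exact ones):

    kylin.engine.spark-conf.spark.master=yarn
    kylin.engine.spark-conf.spark.submit.deployMode=cluster
    kylin.engine.spark-conf.spark.yarn.queue=default
    kylin.engine.spark-conf.spark.executor.memory=1G
    kylin.engine.spark-conf.spark.executor.cores=2
    kylin.engine.spark-conf.spark.executor.instances=1
    # The tutorial also suggests uploading a spark-libs.jar to HDFS and pointing
    # spark.yarn.archive at it; I have not set this, which matches the WARN in the log below.
    #kylin.engine.spark-conf.spark.yarn.archive=hdfs://namenode:8020/kylin/spark/spark-libs.jar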
The step's log output is:

    OS command error exit with return code: 1, error message:
    18/12/18 23:31:53 INFO client.RMProxy: Connecting to ResourceManager at iap12m6/10.8.245.41:8032
    18/12/18 23:31:53 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
    18/12/18 23:31:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
    18/12/18 23:31:53 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
    18/12/18 23:31:53 INFO yarn.Client: Setting up container launch context for our AM
    18/12/18 23:31:53 INFO yarn.Client: Setting up the launch environment for our AM container
    18/12/18 23:31:53 INFO yarn.Client: Preparing resources for our AM container
    18/12/18 23:31:54 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.

I also checked the error log in the YARN diagnostics:

    Diagnostics: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile.
    Root cause: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 15, iap12m8, executor 3):
    java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
        at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
        at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
        at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1125)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1353)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1131)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Driver stacktrace:

I think the key line is:

    java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
        at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)

However, I have no idea what causes it, and I have found few useful suggestions on the Internet.

Here is my environment:

    hadoop-2.7.3
    hbase-1.4.9
    hive-1.2.1
    kylin-2.5.2-bin-hbase1x
    jdk1.8.0_144
    spark-2.2.0

Hope you can help. Thanks.
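
P.S. If it helps with diagnosis, I can run some checks. For example (my guess: the HFile class ships in hbase-server, and the WARN above says Spark falls back to uploading the libraries under SPARK_HOME, so the executors would only see jars from there):

    # Is any HBase jar among what Spark uploads to the executors?
    ls $SPARK_HOME/jars | grep -i hbase
    # And is the hbase-server jar (home of o.a.h.hbase.io.hfile.HFile) on the HBase classpath?
    hbase classpath | tr ':' '\n' | grep hbase-server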
