Hello all, 
I used Apache Spark to replace MapReduce in the Build Cube step, as documented at
http://kylin.apache.org/docs/tutorial/cube_spark.html . But the build job failed
at step 8, "Convert Cuboid Data to HFile", and the log output is:

OS command error exit with return code: 1, error message:
18/12/18 23:31:53 INFO client.RMProxy: Connecting to ResourceManager at iap12m6/10.8.245.41:8032
18/12/18 23:31:53 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
18/12/18 23:31:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
18/12/18 23:31:53 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
18/12/18 23:31:53 INFO yarn.Client: Setting up container launch context for our AM
18/12/18 23:31:53 INFO yarn.Client: Setting up the launch environment for our AM container
18/12/18 23:31:53 INFO yarn.Client: Preparing resources for our AM container
18/12/18 23:31:54 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
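
The WARN about spark.yarn.jars / spark.yarn.archive makes me think I may have skipped the tutorial step that uploads the Spark jars to HDFS. For reference, that step looks roughly like this (the HDFS path and namenode below are example values from the tutorial, not my actual cluster):

  jar cv0f spark-libs.jar -C $KYLIN_HOME/spark/jars/ .
  hadoop fs -mkdir -p /kylin/spark/
  hadoop fs -put spark-libs.jar /kylin/spark/

  # then in kylin.properties
  kylin.engine.spark-conf.spark.yarn.archive=hdfs://namenode:8020/kylin/spark/spark-libs.jar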
I also checked the error log in YARN:

Diagnostics:
User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 15, iap12m8, executor 3): java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1125)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1353)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1131)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
I think the most important part is:

java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)

However, I have no idea what causes it, and I have found few useful suggestions on the Internet…
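
From what I understand, "Could not initialize class" (rather than ClassNotFoundException) means HFile's static initializer failed on the executor, which can happen when one of HBase's dependency jars is missing from the Spark classpath. If that is the cause here, one thing that might help is putting the HBase libs on the driver and executor classpath via kylin.properties; a rough, untested sketch (the lib path is a guess for my install, not a verified fix):

  # kylin.properties -- untested sketch, HBase lib path assumed
  kylin.engine.spark-conf.spark.driver.extraClassPath=/usr/local/hbase-1.4.9/lib/*
  kylin.engine.spark-conf.spark.executor.extraClassPath=/usr/local/hbase-1.4.9/lib/*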

Here is my environment:

hadoop-2.7.3
hbase-1.4.9
hive-1.2.1
kylin-2.5.2-bin-hbase1x
jdk1.8.0_144
spark-2.2.0
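
To rule out a plain packaging problem, I can also check on a worker node which jars actually contain the failing class (a quick shell sketch; HBASE_HOME is assumed to point at my hbase-1.4.9 install):

  # list the jars that contain HFile.class -- sketch, HBASE_HOME assumed
  for j in "$HBASE_HOME"/lib/*.jar; do
    unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hbase/io/hfile/HFile.class' && echo "$j"
  done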
Hope you can help, thanks.
