Thanks for your reply. I followed your second suggestion and copied $HBASE_HOME/lib/* to $KYLIN_HOME/spark/jars, overwriting the existing jars. It works for me! Thanks a lot.

I have another question: Kylin embeds a Spark binary (v2.1.0) in $KYLIN_HOME/spark. I had tried replacing $KYLIN_HOME/spark with spark-2.2.0 before copying $HBASE_HOME/lib/* to $KYLIN_HOME/spark/jars, and when I built my cube I got a different error. I will try copying $HBASE_HOME/lib/* into the new Spark to check whether it also works with Spark 2.2.0.
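For reference, the two steps I ran look roughly like the sketch below. It runs in a throwaway sandbox so it is safe to try anywhere; on a real node you would point HBASE_HOME and KYLIN_HOME at your actual installs, and the jar file names here are only illustrative stand-ins.

```shell
# Sandbox stand-ins for the real directories (illustrative names/paths).
HBASE_HOME=$(mktemp -d)/hbase
KYLIN_HOME=$(mktemp -d)/kylin
mkdir -p "$HBASE_HOME/lib" "$KYLIN_HOME/spark/jars"
touch "$HBASE_HOME/lib/hbase-common-1.4.9.jar" \
      "$HBASE_HOME/lib/netty-all-4.0.23.Final.jar"

# Step 1: copy every HBase library jar into the embedded Spark's jar
# folder, overwriting any existing copies.
cp -f "$HBASE_HOME"/lib/*.jar "$KYLIN_HOME/spark/jars/"

# Step 2: if the Spark build then fails with a netty class conflict,
# remove the netty jar that came over from HBase (Spark bundles its own).
rm -f "$KYLIN_HOME"/spark/jars/netty-all-*.jar

ls "$KYLIN_HOME/spark/jars"
```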
Jon Shoberg <[email protected]> 于2018年12月19日周三 上午1:10写道: > I believe the root cause if fixed in a recent jira issue and patch that > will go into a later release. > > Two solutions: > > First, you can look and see which class definitions are missing, locate > their containing JARs, and move them to your spark jars folder. > > For me, this meant moved jar files from /opt/hbase/lib to > /opt/kylin/spark/jars. > > Second, I took an alternate approach which was easier at the time. I moved > -everything- from hbase/lib to spark/jars and then resolved class conflicts > when I got an error message. > > For me, this meant removing an extra *netty* jar which spark had a > conflict but I got a successful spark/kylin build. (remove the netty jar > coming from hbase libs into spark jars) > > I'd say the second approach is extremely sub-optimal but I'm working in a > test-lab setup and it unblocked an issue (got spark builds working) and let > me move forward. > > Also .... > > At the same time I got errors regarding kylin.properties not being found. > Since this lab setup is established from TAR downloads I needed to copy my > kylin configuration to each node (same path/directory structure) > > Not sure if this was before or after the above item but the hbase jars and > distributing the conf files got kylin/spark working for my small data set; > working on optimizing the medium data set now. > > Best of luck! J > > > > > > > > > > > > > > On Tue, Dec 18, 2018 at 9:30 AM smallsuperman <[email protected]> > wrote: > >> Hello all, >> I usesd Apache Spark to replace MapReduce in the build cube step as >> document at http://kylin.apache.org/docs/tutorial/cube_spark.html . 
>> But the build job failed at step 8, "Convert Cuboid Data to HFile", and
>> the log file output is:
>>
>> OS command error exit with return code: 1, error message:
>> 18/12/18 23:31:53 INFO client.RMProxy: Connecting to ResourceManager at iap12m6/10.8.245.41:8032
>> 18/12/18 23:31:53 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
>> 18/12/18 23:31:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
>> 18/12/18 23:31:53 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
>> 18/12/18 23:31:53 INFO yarn.Client: Setting up container launch context for our AM
>> 18/12/18 23:31:53 INFO yarn.Client: Setting up the launch environment for our AM container
>> 18/12/18 23:31:53 INFO yarn.Client: Preparing resources for our AM container
>> 18/12/18 23:31:54 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
>>
>> I also checked the error log in YARN:
>>
>> Diagnostics:
>> User class threw exception: java.lang.RuntimeException: error execute
>> org.apache.kylin.storage.hbase.steps.SparkCubeHFile.
>> Root cause: Job aborted due to stage failure: Task 1 in stage 1.0 failed
>> 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 15, iap12m8, executor 3):
>> java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
>>   at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
>>   at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
>>   at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
>>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1125)
>>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
>>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
>>   at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1353)
>>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1131)
>>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>   at java.lang.Thread.run(Thread.java:748)
>> Driver stacktrace:
>>
>> I think
>>
>> java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
>>   at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
>>
>> is the most important piece of information, but I have no idea what to do
>> and have found few useful suggestions on the Internet...
>>
>> Here is my environment:
>>
>> hadoop-2.7.3
>> hbase-1.4.9
>> hive-1.2.1
>> kylin-2.5.2-bin-hbase1x
>> jdk1.8.0_144
>> spark-2.2.0
>>
>> Hoping for your help, thanks.
>
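As an aside, for the first suggestion (find the JAR that actually contains the missing class): jar files are zip archives, and entry names are typically stored uncompressed inside them, so a plain grep over the jars will usually point at the one providing org/apache/hadoop/hbase/io/hfile/HFile. The sketch below demos this on fake stand-in "jar" files in a temp dir; on a real node you would point LIBDIR at something like /opt/hbase/lib.

```shell
# Fake lib dir with two stand-in "jars" (illustrative file names only).
LIBDIR=$(mktemp -d)
printf 'org/apache/hadoop/hbase/io/hfile/HFile.class' > "$LIBDIR/hbase-server-1.4.9.jar"
printf 'unrelated/Other.class' > "$LIBDIR/other.jar"

# List the jars that mention the class from the NoClassDefFoundError.
grep -l 'org/apache/hadoop/hbase/io/hfile/HFile' "$LIBDIR"/*.jar
```

Whichever jars this prints are candidates to copy into spark/jars; `unzip -l some.jar | grep HFile` gives a more precise confirmation if unzip is installed.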
