It seems the wrong class, HiveInputFormat, is being loaded: the stack trace doesn't match the current Hive code at all. You need to build Spark 1.2 and copy the spark-assembly jar into Hive's lib directory, and that's it.
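Something along these lines, for example (a rough sketch, not taken from this thread; adjust the branch, profiles, and Hadoop version to match your cluster, and note that the assembly must be built without the -Phive profile, otherwise it bundles its own copy of the Hive classes and you can get exactly this kind of class mismatch):

    # check out and build Spark 1.2 against Hadoop 2.4 (no -Phive!)
    git checkout branch-1.2
    mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

    # copy the resulting assembly jar into Hive's lib directory
    cp assembly/target/scala-2.10/spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar \
       $HIVE_HOME/lib/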
--Xuefu

On Mon, Dec 1, 2014 at 6:22 PM, yuemeng1 <yueme...@huawei.com> wrote:
> Hi, I built a Hive on Spark package, and my Spark assembly jar is
> spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar. When I run a query in the
> Hive shell, I first set all the properties Hive needs for Spark, and then
> I execute a join query:
>
>     select distinct st.sno, sname from student st join score sc
>     on (st.sno = sc.sno) where sc.cno IN (11, 12, 13) and st.sage > 28;
>
> but it fails with the following error in the Spark web UI:
>
> Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most
> recent failure: Lost task 0.3 in stage 1.0 (TID 7, datasight18):
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
>     at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:233)
>     at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
>     at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:56)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)
>
> Driver stacktrace:
>
> Can you help me deal with this problem? I think my build was successful!
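For reference, the Hive-side setup referred to above ("set all the properties Hive needs for Spark") typically looks something like the following before running the query (illustrative values only; spark.master may instead be a standalone URL such as spark://<host>:7077, and memory settings should be tuned for your cluster):

    hive> set hive.execution.engine=spark;
    hive> set spark.master=yarn-cluster;
    hive> set spark.executor.memory=2g;
    hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;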