Hi Xuefu Zhang,

I just tried Hive on Spark again after a long time. Queries that do not touch HBase work fine in cluster mode. The problem only occurs when I run a query against the HBase-backed table (through Hive, of course).
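For reference, the session looks roughly like this (the table definition, column mapping and table names below are only placeholders, not my real schema):

    set hive.execution.engine=spark;
    set spark.master=spark://spark-master:7077;

    -- HBase-backed table created via HBaseStorageHandler (illustrative names only)
    CREATE TABLE hbase_backed_table (key string, value string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
    TBLPROPERTIES ("hbase.table.name" = "hbase_backed_table");

    -- a query like this, on a plain Hive table, runs fine in cluster mode
    SELECT count(*) FROM some_plain_hive_table;

    -- a query like this, touching the HBase-backed table, fails with the error below
    SELECT * FROM hbase_backed_table LIMIT 10;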
> On 19 Nov 2015, at 20:54, Xuefu Zhang <xzh...@cloudera.com> wrote:
>
> Are you able to run queries that are not touching HBase? This problem was
> seen before but fixed.
>
> On Tue, Nov 17, 2015 at 3:37 AM, Sofia <sofia.panagiot...@taiger.com> wrote:
> Hello,
>
> I have configured Hive to work with Spark.
>
> I have been trying to run a query on a Hive table managing an HBase table
> (created via HBaseStorageHandler) at the Hive CLI.
>
> When spark.master is "local" it works just fine, but when I set it to my
> Spark master spark://spark-master:7077 I get the following error:
>
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at mapPartitionsToPair at MapTran.java:31)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.1.64, ANY, 1688 bytes)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.64): java.lang.IllegalStateException: unread block data
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2428)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl:     at java.lang.Thread.run(Thread.java:745)
>
> I read something about the guava.jar missing but I am not sure how to fix it.
> I am using Spark 1.4.1, HBase 1.1.2 and Hive 1.2.1.
> Any help is more than appreciated.
>
> Sofia
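P.S. Regarding the guava.jar note in my first mail: I have not verified this, but my understanding is that the "unread block data" error usually means the remote executors are missing, or have a conflicting version of, some of the HBase classes, so I assume the fix would be along these lines (the jar paths and versions below are guesses about a typical layout, not a confirmed recipe):

    -- sketch only: make the HBase handler and client jars visible to the Spark executors
    -- before the Spark session is started; adjust paths/versions to the actual installation
    set spark.executor.extraClassPath=/usr/lib/hbase/lib/*;
    add jar /usr/lib/hive/lib/hive-hbase-handler-1.2.1.jar;
    add jar /usr/lib/hbase/lib/hbase-client-1.1.2.jar;
    add jar /usr/lib/hbase/lib/hbase-common-1.1.2.jar;
    add jar /usr/lib/hbase/lib/hbase-server-1.1.2.jar;
    add jar /usr/lib/hbase/lib/hbase-protocol-1.1.2.jar;
    -- and, if it really is a guava conflict, whichever guava jar HBase expects
    add jar /usr/lib/hbase/lib/guava-12.0.1.jar;

Does that sound like the right direction, or is there a recommended way to ship these jars for Hive on Spark?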