Hello,

Following the Hive wiki page https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started, I get several failures when executing HQL on the Spark engine with YARN. I have Hadoop 2.6.2, YARN 2.6.2, and Spark 1.5.2. The failures occur with both the spark-1.5.2-hadoop2.6 binary distribution and a spark-1.5.2-without-hive build that I compiled myself following the instructions on that wiki page.
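For context, I set up Hive on Spark following the usual steps from that wiki page, roughly like the sketch below (the memory sizes are placeholders rather than my exact values), with the spark-assembly jar linked into HIVE_HOME/lib as the wiki describes:

  hive> set hive.execution.engine=spark;
  hive> set spark.master=yarn-client;
  hive> set spark.executor.memory=512m;
  hive> set spark.driver.memory=512m;
  hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;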
The Hive CLI submits the Spark job, which runs for a short time; the RM web UI shows the application as successful, but the Hive CLI reports that the job failed. Here is a snippet of the Hive CLI debug log. Any suggestions?

15/11/30 07:31:36 [main]: INFO status.SparkJobMonitor: state = SENT
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO yarn.Client: Application report for application_1448886638370_0001 (state: RUNNING)
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO yarn.Client:
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: client token: N/A
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: diagnostics: N/A
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: ApplicationMaster host: 192.168.1.12
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: ApplicationMaster RPC port: 0
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: queue: default
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: start time: 1448886649489
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: final status: UNDEFINED
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: tracking URL: http://namenode.localdomain:8088/proxy/application_1448886638370_0001/
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: user: hadoop
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO cluster.YarnClientSchedulerBackend: Application application_1448886638370_0001 has started running.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51326.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO netty.NettyBlockTransferService: Server created on 51326
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.1.10:51326 with 66.8 MB RAM, BlockManagerId(driver, 192.168.1.10, 51326)
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMaster: Registered BlockManager
state = SENT
15/11/30 07:31:37 [main]: INFO status.SparkJobMonitor: state = SENT
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type java.lang.Integer (2 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=0 payload=java.lang.Integer
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO spark.SparkContext: Added JAR file:/home/hadoop/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar at http://192.168.1.10:41276/jars/hive-exec-1.2.1.jar with timestamp 1448886697575
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$NullMessage (2 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=1 payload=org.apache.hive.spark.client.rpc.Rpc$NullMessage
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO rpc.RpcDispatcher: [DriverProtocol] Closing channel due to exception in pipeline (java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job).
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type java.lang.String (3720 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=ERROR id=2 payload=java.lang.String
15/11/30 07:31:37 [RPC-Handler-3]: WARN rpc.RpcDispatcher: Received error message:
io.netty.handler.codec.DecoderException: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:96)
    at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.hive.spark.client.Job
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 39 more
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 WARN client.RemoteDriver: Shutting down driver because RPC channel was closed.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO client.RemoteDriver: Shutting down remote driver.
15/11/30 07:31:37 [RPC-Handler-3]: WARN client.SparkClientImpl: Client RPC channel closed unexpectedly.

Best regards,
Link Qian
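P.S. The class the remote driver cannot find (org.apache.hive.spark.client.Job) should normally be packaged inside the hive-exec jar that the log shows being added, so as a sanity check I can run something like the following (the hive-exec path is taken from the log above; the spark-assembly path is only a guess for my layout):

  unzip -l /home/hadoop/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar | grep 'spark/client/Job'
  unzip -l $SPARK_HOME/lib/spark-assembly*.jar | grep -c 'org/apache/hive'

The second command just counts whether the spark-assembly used on the cluster side still bundles its own Hive classes, which the without-hive build should not.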
