Jeff,

Thanks a lot for pointing me to where to look. It turns out hive-exec.jar was not getting copied because there were formatting errors in tez.aux.uris.
The tar.gz, however, did get copied, but it was not expanded and thus could not be used. Is it correct to assume that for tez.aux.uris, only "jar" files need to be provided?

Thanks
Raajay

On Fri, Sep 11, 2015 at 4:38 AM, Jianfeng (Jeff) Zhang <[email protected]> wrote:

>
> You may try the following steps to check the jars your Tez job is using
>
>    - Set "yarn.nodemanager.delete.debug-delay-sec" to a value such as
>    1200 so that container launch data won’t be deleted after the app finishes.
>    - Run your Tez job again
>    - Find yarn.nodemanager.local-dirs in yarn-site.xml (if it is not set,
>    the default is ${hadoop.tmp.dir}/nm-local-dir)
>    - Then check whether hive-exec.jar exists in the directory
>    {yarn.nodemanager.local-dirs}/usercache/{user}/appcache/{appid}/{tez_am_containerId}/
>
>
> Best Regard,
> Jeff Zhang
>
>
> From: Raajay <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, September 11, 2015 at 3:56 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Missing libraries.
>
> Yeah. I added the hive-exec.jar that contains HiveSplitGenerator to HDFS. I
> still hit the exception
>
> On Fri, Sep 11, 2015 at 2:43 AM, Jianfeng (Jeff) Zhang <[email protected]> wrote:
>
>>
>> Have you tried using a jar rather than the tar.gz?
>>
>>
>> Best Regard,
>> Jeff Zhang
>>
>>
>> From: Raajay <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Friday, September 11, 2015 at 3:15 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Missing libraries.
>>
>> I am running DAGs generated by Hive for Tez in offline mode; as in I
>> store the DAGs to disk and then run them later using my own Tez Client.
>>
>> I have been able to get this setup going in local mode. However, while
>> running on the cluster, I hit a Processor class not found exception (snippet
>> below). I figure this is because custom processor classes defined in Hive
>> (e.g. HiveSplitGenerator) are not visible while executing a mapper.
>>
>> I have uploaded, hive exec jar (apache-hive-2.0.0-SNAPSHOT-bin.tar.gz) to
>> HDFS and pointed ${tez.aux.uris} to that location. Not sure what more is
>> needed to make hive Classes visible to tez tasks ? "tar.gz" does not work ?
>>
>>
>> 2015-09-11 00:59:02,973 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Recovered Vertex State, vertexId=vertex_1441949856963_0006_1_02 [Map 1],
>> state=NEW, numInitedSourceVertices=0, numStartedSourceVertices=0,
>> numRecoveredSourceVertices=0, recoveredEvents=0, tasksIsNull=false,
>> numTasks=0
>> 2015-09-11 00:59:02,974 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Root Inputs exist for Vertex: Map 4 : {a={InputName=a},
>> {Descriptor=ClassName=org.apache.tez.mapreduce.input.MRInputLegacy,
>> hasPayload=true},
>> {ControllerDescriptor=ClassName=org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator,
>> hasPayload=false}}
>> 2015-09-11 00:59:02,974 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Starting root input initializer for input: a, with class:
>> [org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator]
>> 2015-09-11 00:59:02,974 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Setting vertexManager to RootInputVertexManager for
>> vertex_1441949856963_0006_1_00 [Map 4]
>> 2015-09-11 00:59:02,979 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Num tasks is -1. Expecting VertexManager/InputInitializers/1-1 split to set
>> #tasks for the vertex vertex_1441949856963_0006_1_00 [Map 4]
>> 2015-09-11 00:59:02,979 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Vertex will initialize from input initializer.
>> vertex_1441949856963_0006_1_00 [Map 4]
>> 2015-09-11 00:59:02,980 INFO [Dispatcher thread: Central] impl.VertexImpl:
>> Vertex will initialize via inputInitializers vertex_1441949856963_0006_1_00
>> [Map 4]. Starting root input initializers: 1
>> 2015-09-11 00:59:02,981 ERROR [Dispatcher thread: Central]
>> common.AsyncDispatcher: Error in dispatcher thread
>> org.apache.tez.dag.api.TezUncheckedException: Unable to load class:
>> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
>>     at org.apache.tez.common.ReflectionUtils.getClazz(ReflectionUtils.java:45)
>>     at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:96)
>>     at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:137)
>>     at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:114)
>>
>>
>>
>
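
For anyone repeating the check Jeff describes, the last step (looking for hive-exec.jar under the application's localized directory) can be scripted. What follows is only a minimal sketch, not part of Tez or YARN: the function name find_localized_jars, the glob pattern "hive-exec*.jar", the user name and the container ID are all hypothetical, and the application ID is merely inferred from the vertex IDs in the log above. It assumes yarn.nodemanager.delete.debug-delay-sec was raised beforehand so the directories still exist on the node.

```python
import fnmatch
import os
import tempfile


def find_localized_jars(local_dir, user, app_id, pattern="hive-exec*.jar"):
    """Return sorted paths of files matching `pattern` under
    {local_dir}/usercache/{user}/appcache/{app_id}/ (all container dirs).

    `local_dir` is one entry of yarn.nodemanager.local-dirs. An empty
    result means no matching file was found there (hypothetical helper).
    """
    app_root = os.path.join(local_dir, "usercache", user, "appcache", app_id)
    matches = []
    # os.walk yields nothing if app_root does not exist, so a missing
    # application directory simply produces an empty list.
    for dirpath, _dirnames, filenames in os.walk(app_root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Demo on a throwaway directory tree that mimics the NodeManager layout.
    # The IDs and user below are illustrative, not taken from a real node.
    with tempfile.TemporaryDirectory() as local_dir:
        container_dir = os.path.join(
            local_dir, "usercache", "raajay", "appcache",
            "application_1441949856963_0006",
            "container_1441949856963_0006_01_000001",
        )
        os.makedirs(container_dir)
        open(os.path.join(container_dir, "hive-exec-2.0.0-SNAPSHOT.jar"), "w").close()

        for path in find_localized_jars(
            local_dir, "raajay", "application_1441949856963_0006"
        ):
            print(path)
```

On a real cluster, the function would be pointed at each directory listed in yarn.nodemanager.local-dirs on the node that ran the Tez AM. An empty result there matches the situation in this thread, where the jar was never copied.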
