Re: hive + tez + yarn 2.4

Grandl Robert Wed, 18 Jun 2014 14:06:06 -0700

Hi Hitesh,


I followed the steps mentioned there. The error mentioned above:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

was mainly because Hive was expecting the hive-exec jar to be in /user/hadoop 
instead of /user/hive, as mentioned in some posts. I copied the hive-exec jar 
into that directory, and now the Tez job is launched and succeeds after a long 
while, but the DAG fails.

In the Hive console I get the following:
hive> SELECT COUNT(*) FROM student;
Query ID = hadoop_20140618133737_83b94345-4058-4e25-8528-fa0bfded4b86
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (application id: application_1403117117414_0006)

Map 1: -/-    Reducer 2: 0/1    
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1403117117414_0006_1_01, diagnostics=[Vertex Input: student initializer failed., org.apache.tez.runtime.api.events.RootInputConfigureVertexTasksEvent.<init>(ILjava/util/List;)V]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1403117117414_0006_1_00, diagnostics=[Vertex received Kill in INITED state.]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask


After the job finished (with the failed DAG), looking at the job's AM log, in 
stdout_dag_*, I can see the following exception:
2014-06-18 13:37:15,777 INFO [InputInitializer [Map 1] #0] org.apache.hadoop.hive.ql.exec.tez.SplitGrouper: Original split size is 56 grouped split size is 6, for bucket: 1
2014-06-18 13:37:15,781 INFO [InputInitializer [Map 1] #0] org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator: Number of grouped splits: 6
2014-06-18 13:37:15,788 ERROR [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Vertex Input: student initializer failed
java.lang.NoSuchMethodError: org.apache.tez.runtime.api.events.RootInputConfigureVertexTasksEvent.<init>(ILjava/util/List;)V
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:177)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:92)
        at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:154)
        at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:146)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable.call(RootInputInitializerRunner.java:146)
        at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable.call(RootInputInitializerRunner.java:114)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
2014-06-18 13:37:15,798 INFO [HistoryEventHandlingThread] org.apache.tez.dag.history.logging.impl.SimpleHistoryLoggingService: Writing event VERTEX_FINISHED to history file
2014-06-18 13:37:15,800 INFO [AsyncDispatcher event handler] org.apache.tez.dag.history.HistoryEventHandler: [HISTORY][DAG:dag_1403117117414_0006_1][Event:VERTEX_FINISHED]: vertexName=Map 1, vertexId=vertex_1403117117414_0006_1_01, initRequestedTime=1403123835325, initedTime=0, startRequestedTime=1403123835358, s
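
For anyone reading the signature in that error: `<init>(ILjava/util/List;)V` is a JVM method descriptor for a constructor, and a NoSuchMethodError on it means the class loaded at runtime has no constructor with that parameter list. That commonly happens when the calling jar was compiled against a different version of a library than the one on the classpath, though nothing in the log itself confirms that this is the cause here. A minimal Python sketch for decoding such descriptors (the helper name decode_descriptor is hypothetical, written only for illustration):

```python
def decode_descriptor(desc):
    """Decode a JVM method descriptor into (parameter types, return type)."""
    primitives = {
        "B": "byte", "C": "char", "D": "double", "F": "float",
        "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
    }

    def read_type(i):
        # Each leading '[' adds one array dimension.
        dims = 0
        while desc[i] == "[":
            dims += 1
            i += 1
        if desc[i] == "L":
            # Object type: 'L' + slash-separated class name + ';'
            end = desc.index(";", i)
            name = desc[i + 1:end].replace("/", ".")
            i = end + 1
        else:
            name = primitives[desc[i]]
            i += 1
        return name + "[]" * dims, i

    if not desc.startswith("("):
        raise ValueError("descriptor must start with '('")
    i = 1
    params = []
    while desc[i] != ")":
        type_name, i = read_type(i)
        params.append(type_name)
    return_type, _ = read_type(i + 1)
    return params, return_type


# The descriptor from the stack trace above.
params, return_type = decode_descriptor("(ILjava/util/List;)V")
print(params, return_type)  # ['int', 'java.util.List'] void
```

So the caller (HiveSplitGenerator) expects a RootInputConfigureVertexTasksEvent constructor taking an int and a java.util.List, and the class the AM actually loaded does not provide one.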


I also have the tez-site.xml path in HADOOP_CLASSPATH.

Do you have any idea about it?

robert

On Wednesday, June 18, 2014 12:54 PM, Hitesh Shah <[email protected]> wrote:
 


Hi Robert, 

The 2.0.4 docs are quite old, as they seem to refer to a very old release 
of Tez. The relevant docs should be 
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez.html
 

The minimal changes that you need to make are the following: 
   - follow the basic steps of setting up Tez, such as uploading the jars to 
HDFS, creating tez-site.xml, and updating it to point to the correct path for 
the jars on HDFS
   - change HADOOP_CLASSPATH to have the Tez jars in the classpath on your 
client machine
   - set hive.execution.engine=tez in hive-site.xml or in your hive shell (you 
can skip the step of uploading the hive-exec jar to HDFS for now, as it's optional)
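
The first two steps can be sketched roughly as follows. This is only an illustration in Python, assuming the Tez jars were uploaded under /apps/tez on HDFS and unpacked under /opt/tez on the client, with tez-site.xml in /etc/tez/conf; all three paths and both helper names are made up for the example. The tez.lib.uris property is the Tez setting that points at the jars on HDFS.

```python
import xml.etree.ElementTree as ET


def hadoop_site_xml(props):
    """Render a dict of properties in the Hadoop *-site.xml layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def tez_classpath(conf_dir, jars_dir):
    """Build a HADOOP_CLASSPATH value covering tez-site.xml and the Tez jars."""
    return ":".join([conf_dir, jars_dir + "/*", jars_dir + "/lib/*"])


# Assumed HDFS location of the uploaded Tez jars (hypothetical path).
TEZ_HDFS_DIR = "${fs.defaultFS}/apps/tez"

tez_site = hadoop_site_xml(
    {"tez.lib.uris": TEZ_HDFS_DIR + "/," + TEZ_HDFS_DIR + "/lib/"}
)
print(tez_site)

# Assumed client-side locations (hypothetical paths).
print("export HADOOP_CLASSPATH=" + tez_classpath("/etc/tez/conf", "/opt/tez"))
```

The third step is then just `set hive.execution.engine=tez;` in the hive shell, or the same property in hive-site.xml.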

Also, “yarn-tez” for mapreduce.framework.name should not be needed for running 
Hive-on-Tez. It is mainly a way to run MapReduce jobs using the Tez execution 
engine. 
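
To make the distinction concrete, here is a small Python sketch that reads the two settings from site files: hive.execution.engine decides whether Hive builds Tez DAGs itself, while mapreduce.framework.name only decides how plain MapReduce jobs are run. The helper read_property and the inline XML are illustrative only, and the fallback values ("mr" and "local") are what I believe the stock defaults to be, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET


def read_property(site_xml, name, default=None):
    """Return the value of a property from Hadoop-style site XML text."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return default


# Example contents; in practice these would be read from the real files.
hive_site = """<configuration>
  <property><name>hive.execution.engine</name><value>tez</value></property>
</configuration>"""

mapred_site = """<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>"""

# Assumed defaults when the property is absent: "mr" and "local".
engine = read_property(hive_site, "hive.execution.engine", "mr")
framework = read_property(mapred_site, "mapreduce.framework.name", "local")

print("Hive queries run on:", engine)
print("MapReduce jobs run on:", framework)
```

With these example values, Hive queries go through Tez even though mapreduce.framework.name is left at plain yarn.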

thanks
— Hitesh



On Jun 18, 2014, at 10:24 AM, Grandl Robert <[email protected]> wrote:

> Hi guys,
> 
> I was trying to run Hive atop Tez atop YARN 2.4. Setting 
> mapreduce.framework.name to yarn-tez enables the Tez execution engine, and I 
> can run the orderedwordcount example which ships with Tez. 
> 
> However, I also installed Hive-0.13. Simply running a Hive query still uses 
> Tez (because it is enabled with mapreduce.framework.name). However, I am not 
> sure it is fully utilizing the Tez APIs. In the UI I can see that a 
> Tez application is running instead of MapReduce. 
> 
> But looking on the web, it seems there are other steps to enable Hive to use 
> the Tez or MapReduce framework:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.4.0/bk_installing_manually_book/content/rpm-chap-tez-5-4.html
> 
> like setting some HIVE_AUX_JARS_PATH variable, and some properties such as: 
> set hive.use.tez.natively=true;
> set hive.execution.engine=tez; ?
> 
> However, following the steps mentioned in the link works only for the case 
> of disabling Tez for Hive queries: 
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.4.0/bk_installing_manually_book/content/rpm-chap-tez-5-5.html
> 
> Can someone let me know if simply enabling yarn-tez in mapred-site works 
> fine? Or what is the proper way to enable it? (Hive 0.13 (compiled from 
> trunk), Tez 0.5 (compiled from trunk), and YARN 2.4 (compiled from trunk).) 
> 
> Thanks,
> robert
