Thanks a lot guys for your help.
Fixing the version mismatch solved my problem.

Robert

On Wednesday, June 18, 2014 2:19 PM, Bikas Saha <[email protected]> wrote:

Hive 0.13 is incompatible with Tez-0.5 (trunk). Hive depends on Tez-0.4. You should probably check out branch-0.4 and build that.

Bikas

From: Grandl Robert [mailto:[email protected]]
Sent: Wednesday, June 18, 2014 2:05 PM
To: [email protected]
Subject: Re: hive + tez + yarn 2.4

Hi Hitesh,

I followed the steps mentioned there. The error mentioned above:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

was mainly that Hive was expecting the hive-exec jar to be in /user/hadoop instead of /user/hive as mentioned in some posts. I copied the hive-exec jar there, and now the Tez job is launched and succeeds after a long while, but the DAG fails. In the Hive console I get the following:

hive> SELECT COUNT(*) FROM student;
Query ID = hadoop_20140618133737_83b94345-4058-4e25-8528-fa0bfded4b86
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Status: Running (application id: application_1403117117414_0006)
Map 1: -/- Reducer 2: 0/1
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1403117117414_0006_1_01, diagnostics=[Vertex Input: student initializer failed., org.apache.tez.runtime.api.events.RootInputConfigureVertexTasksEvent.<init>(ILjava/util/List;)V]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1403117117414_0006_1_00, diagnostics=[Vertex received Kill in INITED state.]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

After the job finished (with the failed DAG), looking in the job AM log, in stdout_dag_*, I can see the following exception:

2014-06-18 13:37:15,777 INFO [InputInitializer [Map 1] #0] org.apache.hadoop.hive.ql.exec.tez.SplitGrouper: Original split size is 56 grouped split size is 6 , for bucket: 1
2014-06-18 13:37:15,781 INFO [InputInitializer [Map 1] #0] org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator: Number of grouped splits: 6
2014-06-18 13:37:15,788 ERROR [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Vertex Input: student initializer failed
java.lang.NoSuchMethodError: org.apache.tez.runtime.api.events.RootInputConfigureVertexTasksEvent.<init>(ILjava/util/List;)V
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:177)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:92)
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:154)
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:146)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable.call(RootInputInitializerRunner.java:146)
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable.call(RootInputInitializerRunner.java:114)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
2014-06-18 13:37:15,798 INFO [HistoryEventHandlingThread] org.apache.tez.dag.history.logging.impl.SimpleHistoryLoggingService: Writing event VERTEX_FINISHED to history file
2014-06-18 13:37:15,800 INFO [AsyncDispatcher event handler] org.apache.tez.dag.history.HistoryEventHandler: [HISTORY][DAG:dag_1403117117414_0006_1][Event:VERTEX_FINISHED]: vertexName=Map 1, vertexId=vertex_1403117117414_0006_1_01, initRequestedTime=1403123835325, initedTime=0, startRequestedTime=1403123835358, s

I also have the tez-site.xml path in HADOOP_CLASSPATH.

Do you have any idea about it?

robert

On Wednesday, June 18, 2014 12:54 PM, Hitesh Shah <[email protected]> wrote:

Hi Robert,

The 2.0.4 docs are quite old, as they seem to be referring to a very old release of Tez. The relevant docs should be "http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez.html"

The minimal changes that you need to make are the following:
  - follow the basic steps of setting up Tez, such as uploading the jars to HDFS, creating the tez-site.xml and updating it to point to the correct path for the jars on HDFS
  - change HADOOP_CLASSPATH to have the Tez jars in the class path on your client machine
  - set hive.execution.engine=tez in hive-site.xml or in your hive shell
(you can skip the step of uploading the hive-exec jar to HDFS for now, as it's optional)

Also, "yarn-tez" for mapreduce.framework.name should not be needed for running Hive-on-Tez. It is mainly a way to run MapReduce jobs using the Tez execution engine.

thanks
-- Hitesh

On Jun 18, 2014, at 10:24 AM, Grandl Robert <[email protected]> wrote:

> Hi guys,
>
> I was trying to run hive atop tez atop yarn 2.4. Setting
> mapreduce.framework.name to yarn-tez enables the tez execution engine and I
> can run the orderedwordcount example which comes along with tez.
>
> However, I also installed Hive-0.13. Simply running a hive query still uses
> Tez (because it is enabled with mapreduce.framework.name). However, I am not
> sure it is completely utilizing Tez APIs and stuff. In the UI I can see that
> a Tez application is running instead of MapReduce.
>
> But looking on the web, it seems there are other steps to enable Hive using
> the Tez or MapReduce framework:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.4.0/bk_installing_manually_book/content/rpm-chap-tez-5-4.html
>
> like setting some HIVE_AUX_JARS_PATH variable, and some properties such as:
> set hive.use.tez.natively=true;
> set hive.execution.engine=tez; ?
>
> However, following the steps mentioned in the link works only for the case
> of disabling Tez for Hive queries.
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.4.0/bk_installing_manually_book/content/rpm-chap-tez-5-5.html
>
> Can someone let me know if simply enabling yarn-tez in mapred-site works
> fine? Or what is a proper way to enable it? (Hive-0.13 (compiled from
> trunk), Tez-0.5 (compiled from trunk) and Yarn-2.4 (compiled from trunk).)
>
> Thanks,
> robert
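
For anyone reading this thread later: the NoSuchMethodError in the AM log names the missing method with a JVM descriptor, `<init>(ILjava/util/List;)V`. `<init>` denotes a constructor, and the descriptor says it takes an int and a java.util.List. In other words, Hive was compiled against a RootInputConfigureVertexTasksEvent constructor with that signature, and the Tez jars on the cluster do not provide it, which is consistent with the version mismatch Bikas pointed out. The sketch below is only an illustration of how to read such descriptors; `decode_descriptor` is a hypothetical helper written for this note, not part of Hive or Tez.

```python
# Minimal sketch: decode a JVM method descriptor such as "(ILjava/util/List;)V"
# into readable parameter and return types. Helper names are hypothetical.

PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Read one type starting at desc[pos]; return (type_name, next_pos)."""
    dims = 0
    while desc[pos] == "[":          # each leading '[' adds one array dimension
        dims += 1
        pos += 1
    code = desc[pos]
    if code == "L":                  # object type: L<internal/class/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    elif code in PRIMITIVES:
        name = PRIMITIVES[code]
        pos += 1
    else:
        raise ValueError(f"unexpected type code {code!r} at offset {pos}")
    return name + "[]" * dims, pos


def decode_descriptor(desc):
    """Return (list_of_parameter_types, return_type) for a method descriptor."""
    if not desc.startswith("("):
        raise ValueError("a method descriptor must start with '('")
    pos = 1
    params = []
    while desc[pos] != ")":
        name, pos = _read_type(desc, pos)
        params.append(name)
    ret, pos = _read_type(desc, pos + 1)   # skip ')' and read the return type
    if pos != len(desc):
        raise ValueError("trailing characters after the return type")
    return params, ret


if __name__ == "__main__":
    params, ret = decode_descriptor("(ILjava/util/List;)V")
    print(params, ret)   # ['int', 'java.util.List'] void
```

When a NoSuchMethodError like this shows up, decoding the descriptor and comparing it with the constructors in the jar that is actually on the classpath is a quick way to confirm that two components were built against different versions of the same library.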
