I got this resolved by adding the following properties to hive-site.xml:

<property>
    <name>hive.optimize.correlation</name>
    <value>true</value>
</property>
<property>
    <name>hive.auto.convert.join.use.nonstaged</name>
    <value>true</value>
</property>
<property>
    <name>hive.optimize.bucketmapjoin</name>
    <value>true</value>
</property>
<property>
    <name>hive.optimize.bucketmapjoin.sortedmerge</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.max.created.files</name>
    <value>1000000</value>
</property>
<property>
    <name>hive.exec.max.dynamic.partitions</name>
    <value>100000</value>
</property>

I believe these were needed for some optimization on the Hive side for the JOIN case.
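
To double-check that a given copy of hive-site.xml (for example the one referenced by <job-xml> in the workflow) really carries these properties, something like the following could be used. This is only a sketch: the missing_props helper and the ./hive-site.xml path are assumptions made for illustration, not part of Hive or Oozie.

```shell
#!/bin/sh
# Hypothetical helper: print every expected property name that does not
# appear as a <name>...</name> entry in the given file.
missing_props() {
    file="$1"; shift
    for name in "$@"; do
        # -F makes grep treat the dots in the property name literally.
        grep -qF "<name>${name}</name>" "$file" || echo "$name"
    done
}

# Assumed local path; point this at your own copy of hive-site.xml.
SITE_FILE="./hive-site.xml"
if [ -f "$SITE_FILE" ]; then
    missing_props "$SITE_FILE" \
        hive.optimize.correlation \
        hive.auto.convert.join.use.nonstaged \
        hive.optimize.bucketmapjoin \
        hive.optimize.bucketmapjoin.sortedmerge \
        hive.exec.max.created.files \
        hive.exec.max.dynamic.partitions
fi
```

If nothing is printed, all six names are present in the file. The check only looks at names, not values.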

Thanks,
-Nirmal

-----Original Message-----
From: Oussama Chougna [mailto:[email protected]]
Sent: Tuesday, September 29, 2015 4:13 PM
To: [email protected]
Subject: RE: Unable to run JOIN query from Hive Action in Oozie

Ok, try this:
1. Look up the job ID that was assigned to your Oozie action. It should be 
something like job_XXXXXXX_XXX. The best tool for this is the Hadoop job 
history server. All jobs are listed there.
2. Next, on the command line on one of your nodes, type 
"yarn logs -applicationId <job-id from step 1>"
This will print out all logs.
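
The two steps above can be sketched as a small script. Note that "yarn logs -applicationId" expects an ID with the application_ prefix, so the job_ prefix from step 1 is swapped here. The job ID below is a made-up placeholder and job_to_app_id is a hypothetical helper name; the sketch also assumes log aggregation is enabled on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: turn a MapReduce job ID (job_<timestamp>_<seq>) into
# the matching YARN application ID (application_<timestamp>_<seq>).
job_to_app_id() {
    printf 'application_%s\n' "${1#job_}"
}

# Placeholder job ID for illustration only; use the one from the job
# history server.
JOB_ID="job_1443451614701_0037"
APP_ID="$(job_to_app_id "$JOB_ID")"
echo "$APP_ID"

# Fetch the logs only if the yarn CLI is available on this node.
if command -v yarn >/dev/null 2>&1; then
    yarn logs -applicationId "$APP_ID" > "${APP_ID}.log"
fi
```

Redirecting the output to a file makes it easier to search for the stack trace of the failed local task afterwards.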


> From: [email protected]
> To: [email protected]
> Subject: RE: Unable to run JOIN query from Hive Action in Oozie
> Date: Tue, 29 Sep 2015 10:27:56 +0000
>
> I am using YARN.
>
> -Nirmal
>
> -----Original Message-----
> From: Oussama Chougna [mailto:[email protected]]
> Sent: Tuesday, September 29, 2015 3:32 PM
> To: [email protected]
> Subject: RE: Unable to run JOIN query from Hive Action in Oozie
>
> Hi Nirmal,
> It is often hard to find useful log info when running Oozie. Which processing 
> framework are you using, MRv1 or YARN?
> Best,
> Oussama
> > From: [email protected]
> > To: [email protected]
> > Subject: Unable to run JOIN query from Hive Action in Oozie
> > Date: Tue, 29 Sep 2015 09:18:36 +0000
> >
> > Hi All,
> >
> > I am unable to run a JOIN query via the Hive Action; however, the same JOIN 
> > query runs fine from the Hive CLI.
> > I tried a simple HQL script via the Hive Action and was able to run it from 
> > an Oozie workflow.
> >
> > I am using:
> >
> > *         oozie-4.1.0
> >
> > *         Hive 1.0.0
> >
> > *         Hadoop 2.6.0
> >
> >
> > Here is a snippet of my Hive Action:
> > <action name="WF_4">
> >         <hive xmlns="uri:oozie:hive-action:0.2">
> >             <job-tracker>xxx.xxx.xxx.xxx:8032</job-tracker>
> >             <name-node>hdfs://xxx.xxx.xxx.xxx:8020</name-node>
> >             <job-xml>hdfs://xxx.xxx.xxx.xxx:8020/user/root/db/WorkFlow12333/hive-site.xml</job-xml>
> >             <configuration>
> >                 <property>
> >                     <name>mapred.job.queue.name</name>
> >                     <value>default</value>
> >                 </property>
> >             </configuration>
> >             <script>testhql.q</script>
> >         </hive>
> >         <ok to="WF_5"/>
> >         <error to=" Failure "/>
> > </action>
> >
> >
> > I am getting the following logs on Hadoop:
> >
> > 1899 [main] INFO  org.apache.hadoop.hive.ql.Driver  - Total jobs = 1
> > 1900 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > </PERFLOG method=TimeToSubmit start=1443451614701 end=1443451615873
> > duration=1172 from=org.apache.hadoop.hive.ql.Driver>
> > 1900 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
> > 1900 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > <PERFLOG
> > method=task.MAPREDLOCAL.Stage-5
> > from=org.apache.hadoop.hive.ql.Driver>
> > 1909 [main] INFO  org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> > - Generating plan file
> > file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10005/plan.xml
> > 1909 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > <PERFLOG method=serializePlan
> > from=org.apache.hadoop.hive.ql.exec.Utilities>
> > 1909 [main] INFO  org.apache.hadoop.hive.ql.exec.Utilities  -
> > Serializing MapredLocalWork via kryo
> > 1975 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - </PERFLOG 
> > method=serializePlan start=1443451615882 end=1443451615948 duration=66 
> > from=org.apache.hadoop.hive.ql.exec.Utilities>
> > 2079 [main] INFO  org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask  - 
> > Executing: /opt/IdwPlatform/hadoop/hadoop-2.6.0//bin/hadoop jar 
> > /opt/IdwPlatform/hadoop/hadoopDirs/hadooptmp/nm-local-dir/filecache/761/Clickstream-0.0.1-SNAPSHOT-driver.jar
> >  org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan 
> > file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10005/plan.xml
> >    -jobconffile 
> > file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10006/jobconf.xml
> > 12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Execution
> > failed with exit status: 1
> > 12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Obtaining
> > error information
> > 12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Task failed!
> > Task ID:
> >   Stage-5
> >
> > Logs:
> >
> > 12290 [main] ERROR org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> > - Execution failed with exit status: 1
> > 12291 [main] ERROR org.apache.hadoop.hive.ql.Driver  - FAILED:
> > Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> > 12291 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > </PERFLOG method=Driver.execute start=1443451615872
> > end=1443451626264
> > duration=10392 from=org.apache.hadoop.hive.ql.Driver>
> > 12291 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
> > 12417 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > </PERFLOG method=releaseLocks start=1443451626264 end=1443451626390
> > duration=126 from=org.apache.hadoop.hive.ql.Driver>
> > 12449 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
> > 12449 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> > </PERFLOG method=releaseLocks start=1443451626422 end=1443451626422
> > duration=0 from=org.apache.hadoop.hive.ql.Driver>
> >
> > <<< Invocation of Hive command completed <<<
> >
> > Hadoop Job IDs executed by Hive:
> >
> > Intercepting System.exit(1)
> >
> > <<< Invocation of Main class completed <<<
> >
> > Failing Oozie Launcher, Main class
> > [org.apache.oozie.action.hadoop.HiveMain], exit code [1]
> >
> > Oozie Launcher failed, finishing Hadoop job gracefully
> >
> > Oozie Launcher, uploading action data to HDFS sequence file:
> > hdfs://192.168.145.191:8020/user/root/oozie-root/0000037-150925141744978-oozie-root-W/WF9--hive/action-data.seq
> >
> > Oozie Launcher ends
> >
> > I am not able to get any useful info from the logs.
> > I am not sure whether this is related to some Hive config settings, 
> > because I have not configured any.
> > Any pointers would be great.
> >
> > Thanks,
> > -Nirmal
> >
> >
> > ________________________________
> >
> >
> >
> >
> >
> >
> > NOTE: This message may contain information that is confidential, 
> > proprietary, privileged or otherwise protected by law. The message is 
> > intended solely for the named addressee. If received in error, please 
> > destroy and notify the sender. Any use of this email is prohibited when 
> > received in error. Impetus does not represent, warrant and/or guarantee, 
> > that the integrity of this communication has been maintained nor that the 
> > communication is free of errors, virus, interception or interference.
>
>

