While this is most often caused by network communication or setup issues
between the Oozie and JT hosts, I wonder if we should consider lowering
that default of 1m to something smaller?

On Tue, Sep 24, 2013 at 1:40 PM, Mohammad Islam <[email protected]> wrote:
> Most likely, the Hadoop JT isn't calling back Oozie when the Hadoop job
> finishes, as it is designed to do. As a fallback, Oozie checks the Hadoop
> job status every 5 minutes (by default, but configurable).
>
> There are two independent things you can do:
> 1. Find out why the JT is not sending the callback, or whether Oozie is 
> dropping it. Check the Oozie log for something like "callback for 
> action".
>
> 2. Decrease the auto-retry interval from 5 minutes to, say, 1 minute. Add 
> the following to oozie-site.xml and restart the Oozie service.
>     <property>
>         <name>oozie.service.ActionCheckerService.action.check.delay</name>
>         <value>60</value>
>         <description>
>             The time, in seconds, between an ActionCheck for the same action.
>         </description>
>     </property>
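>
> A minimal sketch of how the checker settings might fit together. The
> companion "interval" property is an assumption based on oozie-default.xml,
> so verify both names and their defaults against your Oozie version. The
> per-action delay can only take effect when the checker service itself
> runs, so it helps to keep the service interval no larger than the delay.
>
> ```xml
> <!-- Assumed companion setting: how often the checker service runs, in seconds. -->
> <property>
>     <name>oozie.service.ActionCheckerService.action.check.interval</name>
>     <value>60</value>
> </property>
> <!-- The setting discussed above: seconds between checks of the same action. -->
> <property>
>     <name>oozie.service.ActionCheckerService.action.check.delay</name>
>     <value>60</value>
> </property>
> ```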
>
>  Regards,
> Mohammad
>
>
> ________________________________
>  From: Cuong Luu <[email protected]>
> To: [email protected]
> Sent: Monday, September 23, 2013 9:30 PM
> Subject: Take long time to finish a action (status code: Running -> Finished)
>
>
> Hi all,
>
> There are 2 simple actions in my workflow. Oozie takes >8 minutes to
> finish the first action (status code: running -> finished), although the
> job takes only 3 seconds according to the Hadoop JobTracker site.
>
> Is there any config variable to fix it?
>
>
> My sample work-flow:
>
> <start to="java-node" />
>     <action name="java-node">
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <prepare>
>                 <delete path="${nameNode}/${hadoop.tmp.dir}" />
>             </prepare>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>org.ltc.command.AllFileUrlsOnText</main-class>
>             <arg>${raw.data.set}</arg>
>         </java>
>         <ok to="log-node-start" />
>         <error to="fail" />
>     </action>
>
>     <action name="log-node-start">
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>org.ltc.command.LogMain</main-class>
>             <arg>Log From Ozzie </arg>
>         </java>
>         <ok to="end" />
>         <error to="fail" />
>         ...
>     </action>



-- 
Harsh J
