While this is most often caused by network communication or setup issues between the Oozie and JT hosts, I wonder if we should consider reducing that default of 1m to something smaller?
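
On the setup side, one thing worth checking is the callback URL Oozie hands out, since it has to resolve and be reachable from the JT host. Below is a minimal sketch of pinning it in oozie-site.xml, assuming the stock oozie.base.url and oozie.service.CallbackService.base.url properties. The hostname oozie-host.example.com and port 11000 are placeholders, not values from this thread.

```xml
<!-- Sketch only: the host and port below are hypothetical placeholders. -->
<property>
  <name>oozie.base.url</name>
  <!-- Use a hostname the JT host can resolve, not localhost. -->
  <value>http://oozie-host.example.com:11000/oozie</value>
</property>
<property>
  <name>oozie.service.CallbackService.base.url</name>
  <!-- Base URL for action callbacks, derived from oozie.base.url. -->
  <value>${oozie.base.url}/callback</value>
</property>
```

If that URL cannot be reached from the JT host, the callback never arrives and Oozie only notices the finished job on its next periodic check.
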
On Tue, Sep 24, 2013 at 1:40 PM, Mohammad Islam <[email protected]> wrote:
> Most likely, the Hadoop JT is not calling Oozie back when the Hadoop job
> finishes, as it is designed to do. As a fallback, Oozie checks the Hadoop
> job status every 5 minutes (by default, but configurable).
>
> There are two independent things you can do:
> 1. Find out why the JT is not sending the callback. It could also be that
> Oozie is dropping it. Please check the Oozie log for something like
> "callback for action".
>
> 2. Decrease the auto-retry interval from 5 minutes to, say, 1 minute. Add
> the following to oozie-site.xml and restart the Oozie service.
> <property>
>   <name>oozie.service.ActionCheckerService.action.check.delay</name>
>   <value>60</value>
>   <description>
>     The time, in seconds, between an ActionCheck for the same action.
>   </description>
> </property>
>
> Regards,
> Mohammad
>
>
> ________________________________
> From: Cuong Luu <[email protected]>
> To: [email protected]
> Sent: Monday, September 23, 2013 9:30 PM
> Subject: Take long time to finish a action (status code: Running -> Finished)
>
>
> Hi all,
>
> There are 2 simple actions in my workflow. Oozie takes more than 8
> minutes to finish the first action (status code: running -> finished),
> although it takes only 3 seconds according to the Hadoop JobTracker site.
>
> Is there any config variable to fix this?
>
>
> My sample workflow:
>
> <start to="java-node" />
> <action name="java-node">
>   <java>
>     <job-tracker>${jobTracker}</job-tracker>
>     <name-node>${nameNode}</name-node>
>     <prepare>
>       <delete path="${nameNode}/${hadoop.tmp.dir}" />
>     </prepare>
>     <configuration>
>       <property>
>         <name>mapred.job.queue.name</name>
>         <value>${queueName}</value>
>       </property>
>     </configuration>
>     <main-class>org.ltc.command.AllFileUrlsOnText</main-class>
>     <arg>${raw.data.set}</arg>
>   </java>
>   <ok to="log-node-start" />
>   <error to="fail" />
> </action>
>
> <action name="log-node-start">
>   <java>
>     <job-tracker>${jobTracker}</job-tracker>
>     <name-node>${nameNode}</name-node>
>     <configuration>
>       <property>
>         <name>mapred.job.queue.name</name>
>         <value>${queueName}</value>
>       </property>
>     </configuration>
>     <main-class>org.ltc.command.LogMain</main-class>
>     <arg>Log From Ozzie </arg>
>   </java>
>   <ok to="end" />
>   <error to="fail" />
>   ...
> </action>

--
Harsh J
