Thanks, Alejandro. For the WF rerun count you mentioned, is it org.apache.oozie.client.WorkflowJob.getReRun()?
However, it always seems to return 0, no matter how many reruns I make with Coordinator Rerun.

Basically, I am using coordinator rerun, OozieClient.coordReRun(), to rerun failed coordinator actions/workflows. But I want to control the number of reruns, say a maximum of 3.

Thanks,
Shanzhong

On Mon, Feb 4, 2013 at 4:17 PM, Alejandro Abdelnur <[email protected]> wrote:

> On Sat, Feb 2, 2013 at 11:05 PM, Shangzhong zhu <[email protected]> wrote:
>
> > Hi All,
> >
> > We are developing a wrapper on top of oozie to automate failed/killed
> > coordinator action rerun.
> >
> > To rerun a coordinator action, it seems I have two options.
> >
> > 1. Using coordinator action rerun:
> > oozie job -rerun <coord_Job_id> <-date XXXX>
> >
> > 2. Since the failed action is a workflow job, I can also rerun that
> > workflow job by setting oozie.wf.rerun.failnodes to rerun from the failed
> > action.
> >
> > Questions:
> > 0. Which option is preferred?
> >
> > 1. For option 1, can I choose to rerun from the failed action like the
> > oozie.wf.rerun.failnodes option in workflow rerun?
>
> If I recall correctly you cannot do this.
>
> > 2. For option 1, it seems I cannot change the job configurations. But for
> > option 2, I have more flexibility in changing the configurations, say I can
> > change the job name so that I know how many reruns have been made for that
> > workflow.
>
> no need for this, there is a WF rerun count.
>
> > 3. If I choose option 2, does it mean that the rerun workflow job is not
> > part of the coordinator actions any more? In other words, if I killed that
> > coordinator job, would that rerun workflow job still be running?
>
> It should get killed as well as the WF job ID is still the same as.
>
> With Option #2 though I'm not sure what will happen with the status of the
> corresponding COORD action.
>
> > Thanks
>
> --
> Alejandro
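
If the workflow rerun count cannot be relied on for coordinator reruns, one option is for the wrapper to keep its own count per coordinator action and refuse to rerun past the cap. The following is only a minimal sketch of that idea. RerunTrigger and RerunLimiter are hypothetical names, not part of Oozie, and the trigger is assumed to be a thin wrapper around whatever client call actually submits the rerun.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical hook: whatever actually submits the coordinator rerun. */
interface RerunTrigger {
    void rerun(String coordJobId, String actionNumber);
}

/** Counts reruns per coordinator action on the wrapper side and enforces a cap. */
class RerunLimiter {
    private final int maxReruns;
    private final RerunTrigger trigger;
    // Key is "<coordJobId>@<actionNumber>"; the key format is an arbitrary choice for this sketch.
    private final Map<String, Integer> counts = new HashMap<>();

    RerunLimiter(int maxReruns, RerunTrigger trigger) {
        if (maxReruns < 0) {
            throw new IllegalArgumentException("maxReruns must be >= 0");
        }
        this.maxReruns = maxReruns;
        this.trigger = trigger;
    }

    /** Triggers a rerun and returns true, or returns false once the cap is reached. */
    synchronized boolean tryRerun(String coordJobId, String actionNumber) {
        String key = coordJobId + "@" + actionNumber;
        int used = counts.getOrDefault(key, 0);
        if (used >= maxReruns) {
            return false;
        }
        trigger.rerun(coordJobId, actionNumber);
        // Count the attempt only after the trigger returned without throwing.
        counts.put(key, used + 1);
        return true;
    }

    /** Number of reruns this limiter has triggered for the given action. */
    synchronized int rerunCount(String coordJobId, String actionNumber) {
        return counts.getOrDefault(coordJobId + "@" + actionNumber, 0);
    }
}
```

With a cap of 3, the first three tryRerun calls for the same action invoke the trigger and the fourth returns false. The counts live in memory only, so a wrapper that restarts would need to persist them somewhere.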
