Hi folks,

Looking into the Coordinator Rerun logic, it looks like rerunning a coordinator action resets its external ID (which maps to the workflow job) and its external status. The rerun therefore starts a fresh workflow job, which explains why the client method getRerun() was returning 0.
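To make that concrete, here is a minimal Java sketch of the two rerun paths. It is a toy in-memory model and not Oozie code: RerunSketch, CoordAction, WorkflowJob, coordRerun, workflowRerun, runCount and rerunWithLimit are all hypothetical names. It assumes that a coordinator rerun points the action at a brand-new workflow job whose run count starts at 0, and that a workflow-level rerun keeps the same job ID and increments the run count.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of the two rerun paths; none of these names are Oozie APIs. */
class RerunSketch {

    /** Hypothetical stand-in for a workflow job: an id plus a run counter. */
    static final class WorkflowJob {
        final String id;
        int run = 0; // assumption: a fresh job starts at run 0

        WorkflowJob(String id) {
            this.id = id;
        }
    }

    /** Hypothetical stand-in for a coordinator action that maps to a workflow job. */
    static final class CoordAction {
        String externalId; // id of the workflow job the action currently points at
    }

    private final Map<String, WorkflowJob> workflows = new HashMap<>();
    private int nextId = 1;

    /** Creates and registers a fresh workflow job. */
    private WorkflowJob newWorkflow() {
        WorkflowJob wf = new WorkflowJob("wf-" + nextId++);
        workflows.put(wf.id, wf);
        return wf;
    }

    /** Coordinator-level rerun: the external id is reset to a brand-new workflow job. */
    void coordRerun(CoordAction action) {
        action.externalId = newWorkflow().id;
    }

    /** Workflow-level rerun: same job id, run counter goes up by one. */
    void workflowRerun(String workflowId) {
        workflows.get(workflowId).run++;
    }

    /** Run count of the workflow job the action currently points at. */
    int runCount(CoordAction action) {
        return workflows.get(action.externalId).run;
    }

    /** Reruns through the workflow id only while the run count is below maxReruns. */
    boolean rerunWithLimit(CoordAction action, int maxReruns) {
        if (runCount(action) >= maxReruns) {
            return false; // limit reached, refuse another rerun
        }
        workflowRerun(action.externalId);
        return true;
    }

    public static void main(String[] args) {
        RerunSketch model = new RerunSketch();
        CoordAction action = new CoordAction();
        model.coordRerun(action); // first run, modelled as a fresh workflow job
        model.coordRerun(action); // coordinator rerun: fresh workflow job again
        System.out.println("after coordinator reruns: " + model.runCount(action)); // 0
        for (int i = 0; i < 5; i++) {
            model.rerunWithLimit(action, 3); // only the first three attempts go through
        }
        System.out.println("after limited workflow reruns: " + model.runCount(action)); // 3
    }
}
```

In this model the count read through the coordinator action stays at 0 across coordinator reruns, while reruns issued against the workflow job ID accumulate and can be capped. Whether a real deployment behaves the same way should be checked against the Oozie version in use.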
To apply the max-rerun limit, you can use the OozieClient.rerun() method itself and supply it the workflow job ID obtained from the coordinator action's externalId.

Thanks,
Mona

On 2/10/13 11:12 PM, "Shangzhong zhu" <[email protected]> wrote:

>Thanks, Alejandro.
>
>For the WF rerun count you mentioned, is it
>org.apache.oozie.client.WorkflowJob.getReRun()?
>
>However, it seems to always return 0 no matter how many reruns I make by using
>Coordinator Rerun.
>
>Basically, I am using coordinator rerun: OozieClient.coordReRun() to rerun
>failed coordinator actions/workflows. But I want to control the number of
>reruns, say maximum 3 reruns.
>
>Thanks,
>Shanzhong
>
>
>On Mon, Feb 4, 2013 at 4:17 PM, Alejandro Abdelnur
><[email protected]> wrote:
>
>> On Sat, Feb 2, 2013 at 11:05 PM, Shangzhong zhu <[email protected]>
>> wrote:
>>
>> > Hi All,
>> >
>> > We are developing a wrapper on top of oozie to automate failed/killed
>> > coordinator action rerun.
>> >
>> > To rerun a coordinator action, seems I have two options.
>> >
>> > 1. Using coordinator action rerun:
>> > oozie job -rerun <coord_Job_id> <-date XXXX>
>> >
>> > 2. Since the failed action is a workflow job, I can also rerun that
>> > workflow job by setting oozie.wf.rerun.failnodes to rerun from the failed
>> > action.
>> >
>> > Questions:
>> > 0. which option is preferred?
>> >
>> > 1. For option 1, can I choose to rerun from the failed action like the
>> > oozie.wf.rerun.failnodes option in workflow rerun?
>>
>> If I recall correctly you cannot do this.
>>
>> > 2. For option 1, seems I cannot change the job configurations. But for
>> > option 2, I have more flexibility in changing the configurations, say I can
>> > change the job name so that I know how many reruns have been made for that
>> > workflow.
>>
>> no need for this, there is a WF rerun count.
>>
>> > 3. If I chose option 2, does it mean that the rerun workflow job is not
>> > part of the coordinator actions any more? In other words, if I killed that
>> > coordinator job, that rerun workflow job will be still running?
>>
>> It should get killed as well as the WF job ID is still the same as.
>>
>> With Option #2 though I'm not sure what will happen with the status of the
>> corresponding COORD action.
>>
>> Thanks
>>
>> --
>> Alejandro
