Hi folks,

Looking into the Coordinator Rerun logic, it looks like rerunning a
coordinator action resets its external id (which maps to the workflow job)
and its external status. This means the rerun starts a fresh workflow job,
which explains why the client method getRerun() was returning '0'.

To enforce a max-rerun limit, you can call the OozieClient.rerun() method
directly and pass it the workflow job ID obtained from the coordinator
action's externalId.
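
A minimal sketch of how a wrapper might enforce such a cap is below. It is
not Oozie code: WorkflowGateway, FakeGateway, RerunLimiter and the job ID
string are hypothetical names made up for illustration. In a real wrapper
the two gateway methods would presumably be backed by the workflow job's
rerun counter and the OozieClient workflow rerun call mentioned above; the
exact method names and signatures are an assumption to check against your
Oozie client version.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: allow a workflow rerun only while its run counter is below a cap. */
final class RerunLimiter {

    /** Hypothetical abstraction over the Oozie client; not part of Oozie. */
    interface WorkflowGateway {
        /** Assumed to return how many times the workflow job has been rerun. */
        int getRunCount(String workflowJobId);

        /** Assumed to submit a rerun of the same workflow job ID. */
        void rerun(String workflowJobId);
    }

    private final WorkflowGateway gateway;
    private final int maxReruns;

    RerunLimiter(WorkflowGateway gateway, int maxReruns) {
        if (maxReruns < 0) {
            throw new IllegalArgumentException("maxReruns must be >= 0");
        }
        this.gateway = gateway;
        this.maxReruns = maxReruns;
    }

    /** Submits a rerun and returns true, or returns false once the cap is reached. */
    boolean rerunIfAllowed(String workflowJobId) {
        int runsSoFar = gateway.getRunCount(workflowJobId);
        if (runsSoFar >= maxReruns) {
            return false;
        }
        gateway.rerun(workflowJobId);
        return true;
    }

    /** In-memory fake used only to exercise the logic without an Oozie server. */
    static final class FakeGateway implements WorkflowGateway {
        private final Map<String, Integer> runs = new HashMap<>();

        @Override
        public int getRunCount(String workflowJobId) {
            return runs.getOrDefault(workflowJobId, 0);
        }

        @Override
        public void rerun(String workflowJobId) {
            // Each rerun of the same job ID bumps its counter by one.
            runs.merge(workflowJobId, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        RerunLimiter limiter = new RerunLimiter(new FakeGateway(), 3);
        String workflowJobId = "example-workflow-job-id"; // hypothetical ID
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + ": "
                    + limiter.rerunIfAllowed(workflowJobId));
        }
    }
}
```

With a cap of 3, the first three attempts are submitted and the last two
are refused. Note that this only works if the counter accumulates on the
same workflow job ID, which is why the rerun has to go through the workflow
job rather than through the coordinator action.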

Thanks,

Mona

On 2/10/13 11:12 PM, "Shangzhong zhu" <[email protected]> wrote:

>Thanks, Alejandro.
>
>For the WF rerun count you mentioned, is it
>org.apache.oozie.client.WorkflowJob.getReRun()?
>
>However, it always seems to return 0 no matter how many reruns I make
>using Coordinator Rerun.
>
>Basically, I am using coordinator rerun, OozieClient.coordReRun(), to rerun
>failed coordinator actions/workflows. But I want to control the number of
>reruns, say, a maximum of 3 reruns.
>
>Thanks,
>Shanzhong
>
>
>On Mon, Feb 4, 2013 at 4:17 PM, Alejandro Abdelnur
><[email protected]> wrote:
>
>> On Sat, Feb 2, 2013 at 11:05 PM, Shangzhong zhu <[email protected]>
>> wrote:
>>
>> > Hi All,
>> >
>> > We are developing a wrapper on top of oozie to automate failed/killed
>> > coordinator action rerun.
>> >
>> > To rerun a coordinator action, it seems I have two options.
>> >
>> > 1. Using coordinator action rerun:
>> >      oozie job -rerun <coord_Job_id> <-date XXXX>
>> >
>> > 2. Since the failed action is a workflow job, I can also rerun that
>> > workflow job by setting oozie.wf.rerun.failnodes to rerun from the
>> > failed action.
>> >
>> > Questions:
>> > 0. Which option is preferred?
>> >
>> > 1. For option 1, can I choose to rerun from the failed action like the
>> > oozie.wf.rerun.failnodes option in workflow rerun?
>> >
>> If I recall correctly, you cannot do this.
>>
>>
>> > 2. For option 1, it seems I cannot change the job configuration. But
>> > for option 2, I have more flexibility in changing the configuration;
>> > say, I can change the job name so that I know how many reruns have
>> > been made for that workflow.
>> >
>> No need for this; there is a WF rerun count.
>>
>>
>> > 3. If I choose option 2, does it mean that the rerun workflow job is
>> > no longer part of the coordinator actions? In other words, if I kill
>> > that coordinator job, will the rerun workflow job keep running?
>>
>>
>> It should get killed as well, since the WF job ID is still the same.
>>
>> With Option #2, though, I'm not sure what will happen to the status of
>> the corresponding COORD action.
>>
>>
>>
>> > Thanks
>>
>>
>>
>> --
>> Alejandro
>>
