Seems like a fair enhancement. Could you please file a JIRA stating these
details?

Thanks,

Mona

On 2/11/13 6:03 PM, "Shangzhong zhu" <[email protected]> wrote:

>Thanks, Mona.
>
>I could rerun the workflow job, but the corresponding coordinator action
>status won't get changed, right? That means the coordinator action will
>still show as failed, even if the associated workflow is rerun successfully.
>
>Can we enhance the coordinator rerun to be consistent with the workflow
>rerun?
>* Keep the same workflow ID.
>* Support rerun from beginning or rerun from the failed WF action.
>* Rerun count reflects the number of tries.
>
>Thanks,
>Shanzhong
>
>On Mon, Feb 11, 2013 at 1:11 PM, Mona Chitnis <[email protected]>
>wrote:
>
>> Hi folks,
>>
>> Looking into the Coordinator Rerun logic, it looks like rerunning a
>> coordinator action resets its external id (which maps to workflow job)
>> and external status. This means it will run a fresh workflow job which
>> explains why the client method getRerun() was returning '0'.
>>
>> To enforce a max-rerun limit, you can use the OozieClient.rerun() method
>> itself and pass it the workflow job ID obtained from the coordinator
>> action's externalId.
>>
>> Thanks,
>>
>> Mona
>>
>> On 2/10/13 11:12 PM, "Shangzhong zhu" <[email protected]> wrote:
>>
>> >Thanks, Alejandro.
>> >
>> >For the WF rerun count you mentioned, is it
>> >org.apache.oozie.client.WorkflowJob.getReRun()?
>> >
>> >However, it seems to always return 0 no matter how many reruns I make
>> >using Coordinator Rerun.
>> >
>> >Basically, I am using coordinator rerun, OozieClient.coordReRun(), to
>> >rerun failed coordinator actions/workflows. But I want to control the
>> >number of reruns, say a maximum of 3 reruns.
>> >
>> >Thanks,
>> >Shanzhong
>> >
>> >
>> >On Mon, Feb 4, 2013 at 4:17 PM, Alejandro Abdelnur
>> ><[email protected]> wrote:
>> >
>> >> On Sat, Feb 2, 2013 at 11:05 PM, Shangzhong zhu <[email protected]>
>> >> wrote:
>> >>
>> >> > Hi All,
>> >> >
>> >> > We are developing a wrapper on top of Oozie to automate the rerun of
>> >> > failed/killed coordinator actions.
>> >> >
>> >> > To rerun a coordinator action, it seems I have two options.
>> >> >
>> >> > 1. Using coordinator action rerun:
>> >> >      oozie job -rerun <coord_Job_id> <-date XXXX>
>> >> >
>> >> > 2. Since the failed action is a workflow job, I can also rerun that
>> >> > workflow job by setting oozie.wf.rerun.failnodes to rerun from the
>> >> > failed action.
>> >> >
>> >> > Questions:
>> >> > 0. Which option is preferred?
>> >> >
>> >> > 1. For option 1, can I choose to rerun from the failed action like
>> >> > the oozie.wf.rerun.failnodes option in workflow rerun?
>> >> >
>> >> If I recall correctly, you cannot do this.
>> >>
>> >>
>> >> > 2. For option 1, it seems I cannot change the job configurations. But
>> >> > for option 2, I have more flexibility in changing the configurations;
>> >> > say, I can change the job name so that I know how many reruns have
>> >> > been made for that workflow.
>> >> >
>> >> No need for this; there is a WF rerun count.
>> >>
>> >>
>> >> > 3. If I choose option 2, does it mean that the rerun workflow job is
>> >> > no longer part of the coordinator actions? In other words, if I kill
>> >> > that coordinator job, will that rerun workflow job still be running?
>> >>
>> >>
>> >> It should get killed as well, as the WF job ID is still the same.
>> >>
>> >> With Option #2, though, I'm not sure what will happen with the status of
>> >> the corresponding COORD action.
>> >>
>> >> Thanks
>> >>
>> >> --
>> >> Alejandro
>> >>
>>
>>
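
A minimal sketch of the capped-rerun wrapper discussed in this thread,
assuming the wrapper reads the workflow job ID from the coordinator action's
externalId and keeps its own rerun count. The names CoordRerunSketch,
WorkflowRerunner, MAX_RERUNS, failedNodesConf and rerunIfAllowed are
hypothetical and not part of the Oozie client API; only the property name
oozie.wf.rerun.failnodes comes from the thread. The actual client call is
left behind an interface on purpose.

```java
import java.util.Properties;

// Sketch only: every type and method name here is hypothetical,
// not part of the Oozie client API.
final class CoordRerunSketch {

    // Cap taken from the example in the thread ("say a maximum of 3 reruns").
    static final int MAX_RERUNS = 3;

    // Hypothetical stand-in for whatever client call submits the workflow rerun.
    interface WorkflowRerunner {
        void rerun(String workflowJobId, Properties conf);
    }

    // Rerun configuration asking Oozie to restart from the failed nodes.
    // Assumption: a real rerun would also carry the job's original properties.
    static Properties failedNodesConf() {
        Properties conf = new Properties();
        conf.setProperty("oozie.wf.rerun.failnodes", "true");
        return conf;
    }

    // Reruns the workflow behind a coordinator action unless the cap is reached.
    // externalId:  workflow job ID read from the coordinator action.
    // rerunsSoFar: count tracked by the caller (an assumption of this sketch).
    static boolean rerunIfAllowed(String externalId, int rerunsSoFar,
                                  WorkflowRerunner rerunner) {
        if (externalId == null || externalId.isEmpty()) {
            return false; // nothing to rerun
        }
        if (rerunsSoFar >= MAX_RERUNS) {
            return false; // cap reached, leave the action as failed
        }
        rerunner.rerun(externalId, failedNodesConf());
        return true;
    }

    public static void main(String[] args) {
        // Placeholder rerunner that only prints what it would submit.
        WorkflowRerunner printer = (id, conf) ->
                System.out.println("rerun " + id + " with " + conf);
        for (int attempt = 0; attempt <= MAX_RERUNS; attempt++) {
            boolean allowed = rerunIfAllowed("wf-job-id-placeholder", attempt, printer);
            System.out.println("attempt " + attempt + " allowed: " + allowed);
        }
    }
}
```

Keeping the count in the wrapper sidesteps the problem described above, where
a coordinator rerun starts a fresh workflow job and the workflow's own rerun
count stays at 0. As noted in the thread, this approach may leave the
coordinator action status unchanged.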
