[ 
https://issues.apache.org/jira/browse/OOZIE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryota Egashira updated OOZIE-1791:
----------------------------------

    Description: 
Data pipeline customers need a clean way to make a bundle job ignore one of 
its coordinator jobs at runtime.
Currently, changing a failed coordinator job after bundle submission requires 
either relaunching the entire bundle after fixing the coord XML or running 
another bundle with the new coord. OOZIE-1769 is intended to support changing 
coord properties at runtime. In many scenarios, though, customers want to 
first ignore the failed coord in the bundle, then fix it and put it back into 
the bundle, all without stopping the entire bundle job.

The suggested approach is to add an IGNORED status to Coordinator Job/Action. 
Once the CLI command to ignore a coord job (oozie job -ignore <coord_job_id>) 
is issued, the status of the coord job changes to IGNORED (as does the 
corresponding bundle action). An ignored coordinator job doesn't impact the 
state of its parent bundle job. E.g., if a coordinator job fails and is then 
ignored (suppose the other coord jobs succeeded), the bundle becomes RUNNING, 
not RUNNINGWITHERROR. An ignored coordinator job is excluded from Bundle 
suspend/kill/rerun operations.
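
As a rough illustration of the bundle status rule above, here is a minimal 
Java sketch. It is not taken from the Oozie code base: the class, enum, and 
method names are hypothetical, and only the two bundle states named in this 
proposal are modeled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Oozie code base.
final class BundleStatusSketch {
    // Simplified subset of coordinator job states, plus the proposed IGNORED.
    enum CoordStatus { RUNNING, SUCCEEDED, KILLED, FAILED, IGNORED }

    // Only the two bundle states mentioned in the proposal are modeled.
    enum BundleStatus { RUNNING, RUNNINGWITHERROR }

    // Derive the bundle status while skipping IGNORED coordinators.
    static BundleStatus derive(List<CoordStatus> coords) {
        for (CoordStatus s : coords) {
            if (s == CoordStatus.IGNORED) {
                continue; // ignored coordinators do not affect the bundle
            }
            if (s == CoordStatus.KILLED || s == CoordStatus.FAILED) {
                return BundleStatus.RUNNINGWITHERROR;
            }
        }
        return BundleStatus.RUNNING;
    }

    public static void main(String[] args) {
        // One coordinator failed, the other succeeded: prints RUNNINGWITHERROR.
        System.out.println(derive(Arrays.asList(
                CoordStatus.SUCCEEDED, CoordStatus.FAILED)));
        // The failed coordinator has since been ignored: prints RUNNING.
        System.out.println(derive(Arrays.asList(
                CoordStatus.SUCCEEDED, CoordStatus.IGNORED)));
    }
}
```

In an actual implementation this decision would presumably live in the status 
transition logic rather than in a standalone helper like this one.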
 
Only a coord job in a terminal state (KILLED or FAILED) can be changed to 
IGNORED. Jobs in PREP, RUNNING, WAITING, SUCCEEDED, or SUSPENDED cannot be 
ignored. An ignored job can also be changed back to a running state by using 
coord rerun.
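
The transition rule above could be sketched in Java roughly as follows. This 
is only an illustration under stated assumptions: the class and method names 
are hypothetical, the status list is the one given in this proposal, and the 
rerun helper models nothing beyond moving an IGNORED job back to RUNNING.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Oozie code base.
final class IgnoreTransitionSketch {
    // Statuses as listed in the proposal, plus the proposed IGNORED.
    enum CoordStatus {
        PREP, RUNNING, WAITING, SUCCEEDED, SUSPENDED, KILLED, FAILED, IGNORED
    }

    // Only these terminal states may be changed to IGNORED.
    private static final Set<CoordStatus> IGNORABLE =
            EnumSet.of(CoordStatus.KILLED, CoordStatus.FAILED);

    static boolean canIgnore(CoordStatus current) {
        return IGNORABLE.contains(current);
    }

    // Returns IGNORED, or rejects the request for non-ignorable states.
    static CoordStatus ignore(CoordStatus current) {
        if (!canIgnore(current)) {
            throw new IllegalStateException("Cannot ignore job in state " + current);
        }
        return CoordStatus.IGNORED;
    }

    // Models only the proposed "coord rerun" path out of IGNORED.
    static CoordStatus rerunIgnored(CoordStatus current) {
        if (current != CoordStatus.IGNORED) {
            throw new IllegalStateException("Sketch only covers rerun of IGNORED jobs");
        }
        return CoordStatus.RUNNING;
    }

    public static void main(String[] args) {
        System.out.println(ignore(CoordStatus.FAILED));        // prints IGNORED
        System.out.println(canIgnore(CoordStatus.SUSPENDED));  // prints false
        System.out.println(rerunIgnored(CoordStatus.IGNORED)); // prints RUNNING
    }
}
```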

The same concept can be applied to a coordinator action in a coordinator job. 
Once a coord action is ignored, it doesn't impact the state of its parent 
coordinator job, and it is excluded from coordinator job operations.

We'd like to discuss this approach before proceeding. Any feedback is 
appreciated. 

  was:
Data pipeline customers need a clean way to remove a coord job from a bundle 
job at runtime. 
Currently, changing a coordinator after bundle submission requires either 
relaunching the entire bundle after fixing the coord XML or running another 
bundle with the new coord. OOZIE-1769 is intended to support changing coord 
properties at runtime. In many scenarios, though, customers want to first 
remove the coord from the bundle, then fix it and put it back, all without 
stopping the entire bundle job.

The suggested approach is to add an IGNORED status to Coord Job/Actions. Once 
the CLI command to remove a coord job from the bundle is issued, the status 
of the coord job/actions changes to IGNORED (probably the bundle action, too). 
If one of the coordinators in the bundle has IGNORED status, then the status 
of the bundle would be RUNNING, not RUNNINGWITHERROR. (StatusTransitService 
needs to be changed for a couple of other cases.)
Only coord jobs/actions in a terminal state (SUSPENDED, KILLED or FAILED) can 
be changed to IGNORED. PREP, RUNNING, WAITING or SUCCEEDED ones cannot be 
changed.

We'd like to discuss this approach before proceeding. Any feedback is 
appreciated. 


> add IGNORED status to Coordinator Job and Action
> ------------------------------------------------
>
>                 Key: OOZIE-1791
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1791
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Ryota Egashira
>            Assignee: Ryota Egashira
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
