Attila Sasvari created OOZIE-2882:
-------------------------------------
Summary: Rerun workflow fails Error: E0404
Key: OOZIE-2882
URL: https://issues.apache.org/jira/browse/OOZIE-2882
Project: Oozie
Issue Type: Improvement
Reporter: Attila Sasvari
Only one of the properties are allowed [oozie.wf.rerun.skip.nodes OR
oozie.wf.rerun.failnodes]
Reproduction:
1. Create a workflow with more than one node, e.g. a fork with three parallel
shell actions. Make sure one of them fails.
2. Rerun with 'oozie.wf.rerun.failnodes' set.
3. Rerun again with 'oozie.wf.rerun.skip.nodes' and check 'Skip all successful
nodes'.
You will get the following error.
Error: E0404 : E0404: Only one of the properties are allowed
[oozie.wf.rerun.skip.nodes OR oozie.wf.rerun.failnodes]
When a user reruns a workflow job with oozie.wf.rerun.failnodes=true and the
job fails in a subsequent step, there is no option to resubmit the workflow
using oozie.wf.rerun.skip.nodes=action1,action2 to allow submission from
predecessor steps.
Currently, once the workflow fails and one of the rerun options is used for a
rerun, that option is merged into the job configuration, and there is no way
to override it as there is for regular Oozie configuration properties or
variables.
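
The effect can be illustrated with a minimal Java sketch. This is not Oozie's
actual code: the class and the merge() and validate() helpers are hypothetical,
and the sketch assumes the stored job configuration is merged with the rerun
configuration before the two properties are checked.

```java
import java.util.HashMap;
import java.util.Map;

public class RerunConfMergeSketch {
    static final String SKIP_NODES = "oozie.wf.rerun.skip.nodes";
    static final String FAIL_NODES = "oozie.wf.rerun.failnodes";

    // Hypothetical merge: values from the rerun request override stored
    // values, but keys already stored with the job are never removed.
    static Map<String, String> merge(Map<String, String> stored,
                                     Map<String, String> rerun) {
        Map<String, String> merged = new HashMap<>(stored);
        merged.putAll(rerun);
        return merged;
    }

    // Hypothetical check that mirrors the wording of the E0404 message.
    static void validate(Map<String, String> conf) {
        if (conf.containsKey(SKIP_NODES) && conf.containsKey(FAIL_NODES)) {
            throw new IllegalStateException(
                "E0404: Only one of the properties are allowed ["
                + SKIP_NODES + " OR " + FAIL_NODES + "]");
        }
    }

    public static void main(String[] args) {
        // Configuration stored with the job after the first rerun.
        Map<String, String> stored = new HashMap<>();
        stored.put(FAIL_NODES, "true");

        // The second rerun supplies only skip.nodes.
        Map<String, String> secondRerun = new HashMap<>();
        secondRerun.put(SKIP_NODES, "action1,action2");

        try {
            validate(merge(stored, secondRerun));
            System.out.println("rerun accepted");
        } catch (IllegalStateException e) {
            // Both keys are present after the merge, so the check fails.
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the second rerun request contains only one of the two
properties, yet the check still fails because the other one was carried over.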
We have a few options:
1. If fail.nodes and skip.nodes are specified at the same time (or one of them
was carried over from a previous wf run), we can use {skip.nodes generated by
discovering nodes that did not fail} union {skip.nodes}
2. Add a way to remove properties (this is also potentially helpful for
other use cases)
3. The "newest" property (oozie.wf.rerun.skip.nodes or
oozie.wf.rerun.failnodes) takes priority and the previous is ignored
4. Make oozie.wf.rerun.skip.nodes or oozie.wf.rerun.failnodes somehow not
persist in the DB
Part of this JIRA would be to figure out which is the best option.
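
To make options 1 and 3 more concrete, here is a rough Java sketch. It is only
an illustration under stated assumptions, not a proposed patch: the class, the
effectiveSkipNodes() and newestWins() helpers, and the map of node name to
"succeeded" flag are all hypothetical and do not correspond to existing Oozie
APIs.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class RerunOptionsSketch {
    static final String SKIP_NODES = "oozie.wf.rerun.skip.nodes";
    static final String FAIL_NODES = "oozie.wf.rerun.failnodes";

    // Option 1: {nodes that did not fail} union {explicit skip.nodes}.
    // nodeSucceeded is a hypothetical view of the previous run's actions.
    static Set<String> effectiveSkipNodes(Map<String, Boolean> nodeSucceeded,
                                          String skipNodesCsv) {
        Set<String> skip = new LinkedHashSet<>();
        for (Map.Entry<String, Boolean> e : nodeSucceeded.entrySet()) {
            if (e.getValue()) {
                skip.add(e.getKey());
            }
        }
        if (skipNodesCsv != null) {
            for (String name : skipNodesCsv.split(",")) {
                if (!name.trim().isEmpty()) {
                    skip.add(name.trim());
                }
            }
        }
        return skip;
    }

    // Option 3: the property in the newest rerun request wins and the
    // carried-over one is dropped. A request that sets both properties
    // still ends up with both, so it can still be rejected.
    static Map<String, String> newestWins(Map<String, String> stored,
                                          Map<String, String> rerun) {
        Map<String, String> merged = new LinkedHashMap<>(stored);
        if (rerun.containsKey(SKIP_NODES)) {
            merged.remove(FAIL_NODES);
        } else if (rerun.containsKey(FAIL_NODES)) {
            merged.remove(SKIP_NODES);
        }
        merged.putAll(rerun);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Boolean> previousRun = new LinkedHashMap<>();
        previousRun.put("shell1", true);
        previousRun.put("shell2", true);
        previousRun.put("shell3", false);
        System.out.println(effectiveSkipNodes(previousRun, "shell1,prep"));

        Map<String, String> stored = new LinkedHashMap<>();
        stored.put(FAIL_NODES, "true");
        Map<String, String> rerun = new LinkedHashMap<>();
        rerun.put(SKIP_NODES, "action1,action2");
        System.out.println(newestWins(stored, rerun));
    }
}
```

Option 1 keeps both properties meaningful at the cost of inspecting the
previous run, while option 3 is simpler but silently discards the older
setting.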