[
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104618#comment-14104618
]
Mona Chitnis commented on OOZIE-1976:
-------------------------------------
Regarding Ryota's comment about priority: I think it complicates the missing
dependencies field, since we would now need a structure such as
{{P0=dep1,dep2#P1=dep3,dep4}}, which in turn is nested under the AND/OR
structure. When dependencies are checked and found to exist, the action would
start only when all P0s are satisfied, and so on. I think this is essentially
the same as putting them in the <AND> block instead of the optional <OR> block.
For the N out of M case, the action will start when _any_ n or more instances
are available, and it will use all M if they are all there rather than limiting
itself to N.
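To make the two points above concrete, here is a minimal Python sketch. It is
not Oozie code: the function names (parse_priorities, p0_satisfied,
select_n_of_m) are hypothetical, and the parsing only assumes the
{{P0=dep1,dep2#P1=dep3,dep4}} layout quoted above. It shows the P0 gate and the
"at least N, but use everything available" rule.

```python
def parse_priorities(spec):
    """Parse 'P0=dep1,dep2#P1=dep3,dep4' into {'P0': [...], 'P1': [...]}."""
    groups = {}
    for part in spec.split("#"):
        label, _, deps = part.partition("=")
        groups[label.strip()] = [d.strip() for d in deps.split(",") if d.strip()]
    return groups


def p0_satisfied(groups, available):
    """True when every P0 dependency is in the set of available dependencies."""
    return all(dep in available for dep in groups.get("P0", []))


def select_n_of_m(instances, available, n):
    """Return all available instances once at least n exist, otherwise None.

    The result is not truncated to n: if all M are present, all M are returned.
    """
    found = [i for i in instances if i in available]
    return found if len(found) >= n else None


groups = parse_priorities("P0=dep1,dep2#P1=dep3,dep4")
ready = p0_satisfied(groups, {"dep1", "dep2"})
chosen = select_n_of_m(["a", "b", "c"], {"a", "b", "c"}, 2)
```

With these inputs, ready is true because both P0 dependencies exist, and chosen
contains all three instances even though only two were required.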
Good pointer about EL functions; that is going to be important, and we will
probably need a few new ones.
> Specifying coordinator input datasets in more logical ways
> ----------------------------------------------------------
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
> Issue Type: New Feature
> Components: coordinator
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to a coordinator currently work on
> AND logic, i.e. ALL of them must be available for the workflow to start. We
> should enhance this to include more logical ways of specifying availability
> criteria, e.g.
> * OR between instances
> * minimum N out of K instances
> * delta datasets (process data incrementally)
> Use-cases for this:
> * Different datasets are BCP, and the workflow can run with either one,
> whichever arrives first.
> * Data is not guaranteed, and while $coord:latest allows skipping to
> available instances, the workflow will never trigger unless the specified
> number of instances is found.
> * The workflow is like a ‘refining’ algorithm which should run once the
> minimum required datasets are ready, and should process only the delta for
> efficiency.
> This JIRA is to discuss the design and then review the implementation for
> some or all of the above features.
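For illustration, the availability criteria in the description (AND, OR,
minimum N out of K) can be modelled as a small nested expression. The Python
below is only a sketch under assumed conventions: the tuple encoding and the
is_ready function are hypothetical and are not taken from the attached design
or from Oozie itself.

```python
def is_ready(expr, available):
    """Evaluate a nested availability expression against available datasets.

    Assumed encoding (hypothetical):
      ("DATA", name)           a single dataset instance
      ("AND", [children])      all children must be ready
      ("OR", [children])       at least one child must be ready
      ("MIN", n, [children])   at least n children must be ready
    """
    op = expr[0]
    if op == "DATA":
        return expr[1] in available
    results = [is_ready(child, available) for child in expr[-1]]
    if op == "AND":
        return all(results)
    if op == "OR":
        return any(results)
    if op == "MIN":
        return sum(results) >= expr[1]
    raise ValueError("unknown operator: %r" % (op,))


# A is mandatory; either of B or C is enough.
expr = ("AND", [("DATA", "A"), ("OR", [("DATA", "B"), ("DATA", "C")])])
can_start = is_ready(expr, {"A", "C"})
```

Here can_start is true because A is present and C satisfies the OR branch.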