Yes, I think that should be added. It's been waiting on RB for a long time
:)

Thanks,
virag

On 2/20/14 2:24 PM, "Alejandro Abdelnur" <[email protected]> wrote:

>This is what I refer to as sharding. It can be seen as a special type of
>fork/join where all shards perform the same actions on different datasets
>and the number of shards depends on the number of datasets.
>
>A while ago I rewrote the workflow lib, cleaning it up a bit and
>adding this capability, but it never got completed. If there is interest we
>could create an umbrella JIRA and complete the integration.
>
>Thanks.
>
>
>On Thu, Feb 20, 2014 at 1:47 PM, Mona Chitnis <[email protected]>
>wrote:
>
>> If you use the sub-workflow construct, then it would do some error
>> reporting for you. If a sub-workflow fails, the parent workflow also
>> gets updated to failed. Also, in Oozie 4.0, JIRA OOZIE-1264 (The "parent"
>> property of a subworkflow should be the ID of the parent workflow) helps
>> get the dependency graph using IDs.
>>
>>
>> On 2/20/14, 12:52 PM, "Heller, Chris" <[email protected]> wrote:
>>
>> >Mona,
>> >
>> >Thanks. That is the road I'm headed down at the moment.
>> >
>> >I'll create a Java action which takes the files (or a path glob -- or
>> >something) as input, creates multiple Oozie tasks based on that input,
>> >and then 'waits' for those tasks to complete.
>> >
>> >A feature like this built into the workflow certainly would be nice,
>> >since it would better integrate error handling, I think.
>> >
>> >-Chris
>> >
>> >On 2/20/14, 3:43 PM, "Mona Chitnis" <[email protected]> wrote:
>> >
>> >>Hi Chris,
>> >>
>> >>There isn't currently a way to define dynamic parallel tasks within a
>> >>single Oozie workflow XML, but you can do so programmatically. Using the
>> >>Oozie Java API, you can start a dynamic number of sub-workflows based on
>> >>the number of outputs.
>> >>
>> >>
>> >>On 2/20/14, 7:05 AM, "Heller, Chris" <[email protected]> wrote:
>> >>
>> >>>Hi,
>> >>>
>> >>>I'm trying to figure out the best way to implement a workflow in Oozie.
>> >>>
>> >>>I am creating a workflow which splits an input into multiple outputs.
>> >>>
>> >>>Then I want to run another process over each output.
>> >>>
>> >>>The trouble is I cannot know a priori how many outputs I will have,
>> >>>so I don't see how to set up a workflow to run the next stage that
>> >>>post-processes each one.
>> >>>
>> >>>Ideally the next stage would be a fork/join type of scenario, since
>> >>>each output can be processed independently. But I can't see any way
>> >>>to set up the fork paths without using some sort of XML generation
>> >>>preprocessor.
>> >>>
>> >>>Does anyone have a suggestion for how to proceed? Am I stuck doing
>> >>>workflow generation? Or is there another way to structure this
>> >>>workflow using the existing primitives?
>> >>>
>> >>>Thanks,
>> >>>Chris
>> >>
>>
>>
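
For reference, the "XML generation preprocessor" route Chris mentions in the original question could look roughly like the sketch below. It is only an illustration, not something posted in the thread: the function name, the node names (fan-out, fan-in, process-N, fail), the inputPath property, and the schema version 0.4 are all assumptions chosen for the example. It emits one sub-workflow action per output and wires them between a fork and a join, so a failure in any branch routes to a kill node.

```python
import xml.etree.ElementTree as ET

# Assumed schema version; adjust to whatever your Oozie server accepts.
NS = "uri:oozie:workflow:0.4"


def build_fork_join_workflow(outputs, sub_workflow_path, name="post-process"):
    """Return workflow XML (as a string) with one sub-workflow per output.

    All node and property names here are hypothetical placeholders.
    """
    # Assumption: a fork is only meaningful (and, as far as I recall,
    # only schema-valid) with at least two paths.
    if len(outputs) < 2:
        raise ValueError("need at least two outputs to build a fork")

    root = ET.Element("workflow-app", {"name": name, "xmlns": NS})
    ET.SubElement(root, "start", {"to": "fan-out"})
    fork = ET.SubElement(root, "fork", {"name": "fan-out"})

    actions = []
    for i, output in enumerate(outputs):
        action_name = f"process-{i}"
        # One fork path per output, each pointing at its own action.
        ET.SubElement(fork, "path", {"start": action_name})

        action = ET.Element("action", {"name": action_name})
        sub = ET.SubElement(action, "sub-workflow")
        ET.SubElement(sub, "app-path").text = sub_workflow_path
        ET.SubElement(sub, "propagate-configuration")
        conf = ET.SubElement(sub, "configuration")
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = "inputPath"  # hypothetical property
        ET.SubElement(prop, "value").text = output
        ET.SubElement(action, "ok", {"to": "fan-in"})
        ET.SubElement(action, "error", {"to": "fail"})
        actions.append(action)

    root.extend(actions)
    ET.SubElement(root, "join", {"name": "fan-in", "to": "end"})
    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "A post-processing step failed"
    ET.SubElement(root, "end", {"name": "end"})
    return ET.tostring(root, encoding="unicode")
```

A preprocessor step would list the outputs of the split stage, call build_fork_join_workflow(paths, "<sub-workflow app path>"), write the result to workflow.xml in HDFS, and submit that as the next workflow. The trade-off compared with Mona's Java API suggestion is that the fan-out is fixed at generation time, but error handling stays inside Oozie's own fork/join semantics.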
