Hi Robert,

Thank you for your reply.
The brackets indicate joins, which I need when one action has several
dependencies. For example, after forking, a delete action has to wait until
all the direct successor actions have finished.
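
To make the notation concrete, here is a rough Python sketch of how I derive
those joins from the DAG. The names (dag, temporary, delete_dependencies) are
only illustrative, and I am assuming that A1, A2 and A3 are the actions with
temporary output, since those are the ones D1, D2 and D3 clean up:

```python
# DAG from my first mail: each action maps to the actions that run directly after it.
dag = {
    "start": ["A1", "A2"],
    "A1": ["A7"],
    "A2": ["A7", "A3", "A4"],
    "A3": ["A6"],
    "A4": ["end"],
    "A6": ["end"],
    "A7": ["end"],
}

# Assumption: only these actions write temporary output (cleaned by D1, D2, D3).
temporary = ["A1", "A2", "A3"]


def delete_dependencies(dag, temporary):
    """Map each producer to the sorted actions its delete action must wait for."""
    return {producer: sorted(dag[producer]) for producer in temporary}


joins = delete_dependencies(dag, temporary)
for producer, waits_for in joins.items():
    # More than one dependency means the delete needs a join in front of it.
    kind = "join" if len(waits_for) > 1 else "plain transition"
    print(f"delete output of {producer} after {waits_for} ({kind})")
```

For A2 this gives A3, A4 and A7, which is the [A3,A4,A7] join in front of D2;
A1 and A3 each have a single successor, so D1 and D3 need no join.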

In my earlier example, if all my error transitions go to join_end, does the
workflow finish, or does it end up in a deadlock?

Can I set oozie.validate.ForkJoin to false in my job.properties file, or
when I start the job from Java?

Thanks,

Étienne


On 16 November 2012 21:59, Robert Kanter <[email protected]> wrote:

> Hi Etienne,
>
>
> I'm not sure I follow your notation exactly; what do the [ and ] brackets
> indicate?
>
>
> You can have multiple joins in your workflow, so instead of having
> everything go to join_end, you can have a hierarchy of joins such that each
> fork in your workflow is paired with a join.  There is likely a way to
> re-work your workflow to be like this.
>
>
> Oozie does some extra checking when you use a fork (e.g. that forks and
> joins are paired, etc).  I'm pretty sure that this check is the only place
> where Oozie will enforce these restrictions.  In other words, if Oozie were
> to actually start executing your workflow, it wouldn't complain if they're
> not paired.  You can disable this extra checking by setting
> oozie.validate.ForkJoin to false in your oozie-site.xml.  This may be risky
> though and something could go wrong, so do not do this in a production
> cluster.
>
>
> - Robert
>
>
> On Fri, Nov 16, 2012 at 8:40 AM, Etienne Dumoulin <
> [email protected]> wrote:
>
> > All,
> >
> > I am trying to create an Oozie XML file from a DAG data structure.
> >
> > My data structure (and the actions to run) looks like this:
> > start----->A1,A2
> > A1----> A7
> > A2----->A7,A3,A4
> > A3----->A6
> > A7,A6,A4---->end
> >
> >
> > Now, if the actions' outputs are only temporary, I would like to delete
> > them as soon as possible. I call the delete actions D:
> > start----->A1,A2
> > A1----> A7
> > A2----->A7,A3,A4
> > A3----->A6,[A3,A4,A7]
> > A7----->D1,[A3,A4,A7]
> > A4----->[A3,A4,A7]
> > A3,A4,A7---->D2
> > A6 ----> D3
> > D1----->join_end
> > D2------>join_end
> > D3------>join_end
> > join_end ----->end
> >
> > I have two questions on forks/joins:
> > The documentation says that forks and joins must come in pairs. Is that
> > still true, or is it only recommended? I just read a post from Virag
> > from the 19th of July:
> > "To me, the nested forks option you are considering looks good. Its also
> > better to have the join in pair."
> >
> > I would like my workflow to finish even if one branch fails. For
> > example, if A7 fails, I would like A6 and A4 to proceed. For this kind of
> > behaviour, can I link all my error transitions to join_end?
> > That will not be possible if I have to pair forks and joins.
> >
> > Regards,
> >
> > Étienne
> >
>