Re: Conditional branching during pipeline execution time

Luke Cwik Tue, 07 Jul 2020 19:39:49 -0700

Keep both DoFn A and DoFn B in the graph at the same time, and add a
separate decider DoFn that reads the side input and outputs each element
to either the PCollection that feeds DoFn A or the one that feeds DoFn B.
The graph would look something like:


PCollectionView<Decision> -\
PCollection<Data> -> ParDo(Decider) -outA-> PCollection<Data> -> ParDo(DoFnA)
                                    \outB-> PCollection<Data> -> ParDo(DoFnB)

See [1] for how to create a DoFn with multiple outputs.

[1]
https://beam.apache.org/documentation/pipelines/design-your-pipeline/#a-single-transform-that-produces-multiple-outputs
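
A minimal sketch of the graph above in the Java SDK, in case it helps.
Everything named here is an assumption made for illustration: String
stands in for Data, a Boolean singleton side input stands in for
Decision, and the class name, the tag ids "outA"/"outB", the route()
helper, and the bodies of DoFnA and DoFnB are hypothetical.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.PCollectionView;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.values.TupleTagList;

class RuntimeBranchSketch {
  // Hypothetical tags; the anonymous subclasses ({}) keep the element type
  // available for coder inference.
  static final TupleTag<String> OUT_A = new TupleTag<String>("outA") {};
  static final TupleTag<String> OUT_B = new TupleTag<String>("outB") {};

  // Hypothetical helper: maps the runtime flag to the output to write to.
  static TupleTag<String> route(boolean flag) {
    return flag ? OUT_A : OUT_B;
  }

  // Decider: reads the side input at execution time and routes each element.
  static class Decider extends DoFn<String, String> {
    private final PCollectionView<Boolean> flagView;

    Decider(PCollectionView<Boolean> flagView) {
      this.flagView = flagView;
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
      boolean flag = c.sideInput(flagView);
      c.output(route(flag), c.element());
    }
  }

  // Placeholder bodies for the two branches (assumed, not from the thread).
  static class DoFnA extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      c.output("A:" + c.element());
    }
  }

  static class DoFnB extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      c.output("B:" + c.element());
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // The flag is only known at run time, so it is passed as a side input.
    PCollectionView<Boolean> flagView =
        p.apply("Flag", Create.of(true)).apply(View.<Boolean>asSingleton());
    PCollection<String> data = p.apply("Data", Create.of("x", "y"));

    // Both branches exist in the graph; the Decider picks one per element.
    PCollectionTuple routed =
        data.apply(
            "Decide",
            ParDo.of(new Decider(flagView))
                .withSideInputs(flagView)
                .withOutputTags(OUT_A, TupleTagList.of(OUT_B)));

    routed.get(OUT_A).apply("RunA", ParDo.of(new DoFnA()));
    routed.get(OUT_B).apply("RunB", ParDo.of(new DoFnB()));

    p.run().waitUntilFinish();
  }
}
```

Running this needs a runner on the classpath (for example the direct
runner). Whichever output the flag does not select simply stays empty, so
the DoFn attached to it processes no elements.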

On Tue, Jul 7, 2020 at 7:31 PM Praveen K Viswanathan <
[email protected]> wrote:

> Hello Everyone,
>
> Apache Beam allows conditional branching during pipeline construction
> time, but I have to decide whether to execute DoFn A or DoFn B during run
> time (based upon a PCollection flag).
>
> My DoFns A and B are inside a custom transformation class and I am passing
> my flag as PCollectionView to the transformation class. However, Beam does
> not wait for the actual value of the PCollectionView and decides which DoFn
> to call during DAG preparation itself (it always takes the else branch).
>
> class CustomTx {
>    public CustomTx(flag) {
>     this.flag = flag;
>    }
>
>  public expand {
>   if (flag)
>      DoFn A
>   else
>      DoFn B
>   }
> }
>
> class DoFn A {
> }
>
> class DoFn B {
> }
>
> If I have a DoFn inside my transformation's expand method and pass the
> flag as a side input, it gives the correct value, but then I cannot call
> a DoFn inside a DoFn. I would appreciate any pointers on the best way to
> approach this branching case.
>
> --
> Thanks,
> Praveen K Viswanathan
>
