We also have an active JIRA issue, BEAM-68, to support limiting parallelism
on a per-step basis:

https://issues.apache.org/jira/browse/BEAM-68

As Kenn noted, this is not equivalent to controls over bundling, which is
entirely determined by the runner.
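
To make the distinction concrete, here is a minimal sketch in plain Python. It
is not Beam code and not how any particular runner works: the names
split_into_bundles, process_bundle and run, the fixed bundle size, and the
thread pool are all assumptions made up for illustration. It only shows that
the number of bundles and the number of workers are separate knobs, and that
the bundle count caps how much can run at once.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_bundles(elements, bundle_size):
    # Hypothetical runner policy: chop the input into fixed-size bundles.
    # A real runner is free to choose any division it likes.
    return [elements[i:i + bundle_size]
            for i in range(0, len(elements), bundle_size)]


def process_bundle(bundle):
    # A bundle is one item of work that succeeds or fails as a unit.
    # Here the "ParDo" simply doubles every element.
    return [x * 2 for x in bundle]


def run(elements, bundle_size, max_workers):
    # bundle_size controls bundling; max_workers controls parallelism.
    bundles = split_into_bundles(elements, bundle_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in bundle order, whatever the worker count.
        results = list(pool.map(process_bundle, bundles))
    flat = [y for bundle_result in results for y in bundle_result]
    return len(bundles), flat


if __name__ == "__main__":
    data = list(range(10))
    # Same bundling, different parallelism: the bundle count does not change.
    print(run(data, bundle_size=3, max_workers=1)[0])
    print(run(data, bundle_size=3, max_workers=4)[0])
```

In this sketch at most min(max_workers, number of bundles) bundles can be in
flight at a time, which is the sense in which bundling limits parallelism
without determining it.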

On Fri, Jun 24, 2016 at 1:25 PM, Shen Li <[email protected]> wrote:

> Hi Kenn,
>
> Thanks for the explanation.
>
> Regards,
>
> Shen
>
> On Fri, Jun 24, 2016 at 4:09 PM, Kenneth Knowles <[email protected]>
> wrote:
>
> > Hi Shen,
> >
> > It is completely up to the runner how to divide things into bundles: it is
> > one item of work that should fail or succeed atomically. Bundling limits
> > parallelism, but does not determine it. For example, a streaming execution
> > may have many bundles over time as elements arrive, regardless of
> > parallelism.
> >
> > Kenn
> >
> > On Fri, Jun 24, 2016 at 12:13 PM, Shen Li <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > The document says "when a ParDo transform is executed, the elements of the
> > > input PCollection are first divided up into some number of bundles".
> > >
> > > How do users control the number of bundles/parallelism? Or is it completely
> > > up to the runner?
> > >
> > > Thanks,
> > >
> > > Shen
> > >
> >
>
