We do also have an active JIRA issue to support limiting parallelism on a per-step basis, BEAM-68:
https://issues.apache.org/jira/browse/BEAM-68

As Kenn noted, this is not equivalent to controls over bundling, which is entirely determined by the runner.

On Fri, Jun 24, 2016 at 1:25 PM, Shen Li <[email protected]> wrote:

> Hi Kenn,
>
> Thanks for the explanation.
>
> Regards,
>
> Shen
>
> On Fri, Jun 24, 2016 at 4:09 PM, Kenneth Knowles <[email protected]>
> wrote:
>
> > Hi Shen,
> >
> > It is completely up to the runner how to divide things into bundles: it is
> > one item of work that should fail or succeed atomically. Bundling limits
> > parallelism, but does not determine it. For example, a streaming execution
> > may have many bundles over time as elements arrive, regardless of
> > parallelism.
> >
> > Kenn
> >
> > On Fri, Jun 24, 2016 at 12:13 PM, Shen Li <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > The document says "when a ParDo transform is executed, the elements of the
> > > input PCollection are first divided up into some number of bundles".
> > >
> > > How do users control the number of bundles/parallelism? Or is it completely
> > > up to the runner?
> > >
> > > Thanks,
> > >
> > > Shen
