The approach looks good, but I have one question.

My understanding is that this will still schedule, for example, 8 operators
across the workers, but only one of them will actually process elements while
the others remain idle. Do the idle ones consume resources in some way? I'm
assuming the cost is not significant.

Thanks.

On Fri, Jun 7, 2024 at 3:56 PM, Robert Bradshaw via user <
[email protected]> wrote:

> You can always limit the parallelism by assigning a single key to
> every element and then doing a grouping or reshuffle[1] on that key
> before processing the elements. Even if the operator parallelism for
> that step is technically, say, eight, your effective parallelism will
> be exactly one.
>
> [1]
> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/Reshuffle.html
>
> On Fri, Jun 7, 2024 at 2:13 PM Ruben Vargas <[email protected]>
> wrote:
> >
> > Hello guys
> >
> > One question: I have a side input which fetches an endpoint every 30
> > min. I pretty much copied the example here:
> > https://beam.apache.org/documentation/patterns/side-inputs/ but added
> > some logic to fetch the endpoint and parse the payload.
> >
> > My question is: is it possible to control the parallelism of this
> > single ParDo that does the fetch/transform? I don't think I need a lot
> > of parallelism for that one. I'm currently using the Flink runner and I
> > see the parallelism is 8 (which is the general parallelism for my
> > Flink cluster).
> >
> > Is it possible to set it to 1 for example?
> >
> >
> > Regards.
>
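
The single-key suggestion above (assign one key to every element, then shuffle
on that key before the fetch step) could be sketched roughly as follows. This is
only an illustration and assumes the Beam Python SDK; the thread itself links
the Java Reshuffle transform. The names SINGLE_KEY, to_single_key, drop_key and
limit_to_single_key are hypothetical helpers, not Beam APIs. The sketch uses
ReshufflePerKey from apache_beam.transforms.util because, in the Python SDK,
beam.Reshuffle() assigns its own random keys and so would not funnel elements
onto one key.

```python
SINGLE_KEY = 0  # arbitrary constant; any fixed value would do


def to_single_key(element):
    """Pair every element with the same key so they all fall into one group."""
    return (SINGLE_KEY, element)


def drop_key(keyed):
    """Recover the original element once the shuffle is done."""
    return keyed[1]


def limit_to_single_key(pcoll):
    """Route a PCollection through a shuffle on one constant key.

    Hypothetical helper, not part of Beam. The step applied to the result
    may still be deployed with the runner's full operator parallelism, but
    every element arrives under the same key, so only one instance is
    expected to do the work.
    """
    # Imported here so the pure helpers above stay usable without Beam installed.
    import apache_beam as beam
    from apache_beam.transforms.util import ReshufflePerKey

    return (
        pcoll
        | "AssignSingleKey" >> beam.Map(to_single_key)
        | "ShuffleOnKey" >> ReshufflePerKey()  # groups by the existing key
        | "DropKey" >> beam.Map(drop_key)
    )
```

The fetch/parse ParDo would then be applied to the output of
limit_to_single_key(...) rather than directly to the periodic trigger source.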
