Beam does support parallelism for the job which applies to all the
transforms in the job when executing on Flink using the "--parallelism"
flag.

>From the usecase you mentioned, Kafka read operations will be over
parallelised but it should be ok as they will only have a small amount of
memory impact in loading some state for kafka client etc.
Also flink can run multiple operations for the same Job in a single task
slot so having higher parallelism for lightweight operations should not be
a problem.

On Wed, Apr 29, 2020 at 6:28 PM Luke Cwik <[email protected]> wrote:

> Beam doesn't expose such a thing directly but the FlinkRunner may be able
> to take some pipeline options to configure this.
>
> On Wed, Apr 29, 2020 at 5:51 PM Eleanore Jin <[email protected]>
> wrote:
>
>> Hi Kyle,
>>
>> I am using Flink Runner (v1.8.2)
>>
>> Thanks!
>> Eleanore
>>
>> On Wed, Apr 29, 2020 at 10:33 AM Kyle Weaver <[email protected]> wrote:
>>
>>> Which runner are you using?
>>>
>>> On Wed, Apr 29, 2020 at 1:32 PM Eleanore Jin <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I just wonder can Beam allow to set parallelism for each operator
>>>> (PTransform) separately? Flink provides such feature.
>>>>
>>>> The usecase I have is the source is kafka topics, which has less
>>>> partitions, while we have heavy PTransform and would like to scale it with
>>>> more parallelism.
>>>>
>>>> Thanks a lot!
>>>> Eleanore
>>>>
>>>

Reply via email to