If you want tight control over parallelism, you can 'reshuffle' the elements
into a fixed number of shards. See
https://stackoverflow.com/questions/46116443/dataflow-streaming-job-not-scaleing-past-1-worker
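To make the sharding idea concrete, here is a minimal sketch in plain Python (not Beam code). It assumes the approach is to pair each element with a random key drawn from a fixed range and then group by that key, so that at most NUM_SHARDS groups exist downstream. The names NUM_SHARDS, assign_shard and group_by_shard are hypothetical and chosen only for illustration; in a real pipeline the grouping would be performed by the runner after a keyed shuffle.

```python
import random
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical value: the upper bound on downstream parallelism


def assign_shard(element, num_shards=NUM_SHARDS, rng=random):
    """Pair an element with a random shard key in [0, num_shards)."""
    return (rng.randrange(num_shards), element)


def group_by_shard(keyed_elements):
    """Group (key, element) pairs by key, standing in for the runner's shuffle."""
    shards = defaultdict(list)
    for key, element in keyed_elements:
        shards[key].append(element)
    return dict(shards)


elements = list(range(100))
rng = random.Random(0)  # seeded only so the sketch is reproducible
keyed = [assign_shard(e, NUM_SHARDS, rng) for e in elements]
shards = group_by_shard(keyed)

# Each shard could now be processed by at most one worker at a time.
for key in sorted(shards):
    print(key, len(shards[key]))
```

Because the keys come from a fixed range, the number of groups can never exceed NUM_SHARDS, no matter how many elements arrive.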
On Tue, May 15, 2018 at 11:21 AM Harshvardhan Agrawal <
harshvardhan.ag...@gmail.com> wrote:
>
Thanks Ismaël.
Your answers were quite useful for a novice user.
I guess this answer will help many like me.
*Regarding your answer to point 2 :*
*"Checkpointing is supported, Kafka offset management (if I understand
what you mean) is managed by the KafkaIO connector + the runner"*
Beam provides
The exact mechanism for controlling parallelism is runner-specific. It looks like
Flink allows users to specify the amount of parallelism (per job) using the
following option.
https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L65
How do we control the parallelism of a particular step then? Is there a
recommended approach to solve this problem?
On Wed, May 16, 2018 at 20:45 Chamikara Jayalath
wrote:
> I don't think this can be specified through Beam API but Flink runner
> might have additional configurations that I'm not awar
I don't think this can be specified through the Beam API, but the Flink runner
might have additional configurations that I'm not aware of. Also, many runners
fuse steps to improve execution performance, so simply specifying the
parallelism of a single step will not work.
Thanks,
Cham
On Tue, May 15, 2
Makes sense.
On Tue, May 15, 2018 at 7:22 PM Harshvardhan Agrawal <
harshvardhan.ag...@gmail.com> wrote:
> It’s more like I have a window and I create a side input for that window.
> Once I am done with processing the window I want to discard that side input
> and create new ones for subsequent
Hello,
Answers to the questions inline:
> 1. Are there any limitations in terms of implementations, functionalities
or performance if we want to run streaming on Beam with Spark runner vs
streaming on Spark-Streaming directly ?
At this moment the Spark runner does not support some parts of the B