Re: Splittable-Dofn not distributing the work to multiple workers

2020-08-21 Thread Luke Cwik
Yes, it does. There should be a reshuffle between the initial splitting and the processing portion.
On Fri, Aug 21, 2020 at 11:04 AM Jiadai Xia wrote:
> I am using v1. Does v1 support the initial splitting and distribution?
> since I expect it to distribute the initial splitting to multiple

Re: Splittable-Dofn not distributing the work to multiple workers

2020-08-21 Thread Jiadai Xia
I am using v1. Does v1 support the initial splitting and distribution? I ask because I expected it to distribute the initial splits to multiple workers.
On Fri, Aug 21, 2020 at 11:00 AM Luke Cwik wrote:
> Are you using Dataflow runner v2[1] since the default for Beam Java still
> uses Dataflow runner

Re: Splittable-Dofn not distributing the work to multiple workers

2020-08-21 Thread Luke Cwik
Are you using Dataflow runner v2 [1]? The default for Beam Java is still Dataflow runner v1. Dataflow runner v2 is the only one that supports autoscaling and dynamic splitting of splittable DoFns in bounded pipelines.
1:

Splittable-Dofn not distributing the work to multiple workers

2020-08-21 Thread Jiadai Xia
Hi, As stated in the title, I tried to implement an SDF for reading Parquet files, and I am trying to run it with the Dataflow runner. The initial split outputs a bunch of ranges, but the number of workers is not scaled up and the work is not distributed. Any suggestions on what the problem might be?
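
For readers unfamiliar with what "the initial split outputs a bunch of ranges" means, here is a minimal sketch in plain Java. It is not the poster's code and does not use the Beam API: the class `RangeSplitSketch`, its nested `Range` type, and the `split` method are hypothetical names chosen for illustration. It only shows the general idea of cutting one half-open range into fixed-size sub-ranges, which is roughly what an SDF's initial splitting step produces before the runner distributes those sub-ranges to workers.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitSketch {
    /** Half-open range [from, to). Hypothetical stand-in, not a Beam class. */
    public static final class Range {
        public final long from;
        public final long to;

        public Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + ")";
        }
    }

    /**
     * Cuts [from, to) into consecutive sub-ranges of at most desiredSize
     * elements each. The last sub-range may be shorter.
     */
    public static List<Range> split(long from, long to, long desiredSize) {
        if (desiredSize <= 0) {
            throw new IllegalArgumentException("desiredSize must be positive");
        }
        List<Range> out = new ArrayList<>();
        for (long start = from; start < to; start += desiredSize) {
            out.add(new Range(start, Math.min(start + desiredSize, to)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Splitting 10 units of work into chunks of 4 gives three sub-ranges.
        for (Range r : split(0, 10, 4)) {
            System.out.println(r);
        }
    }
}
```

Producing many sub-ranges is only half of the picture. As the replies in this thread point out, the sub-ranges still have to be redistributed (a reshuffle) before processing, and whether workers scale up depends on the runner.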