Hi Pablo,

This is the input data we are testing:
Elements added: 38,792,932
Estimated size: 3.14 GB

On Wed, Mar 24, 2021 at 5:09 PM Pablo Estrada <[email protected]> wrote:

> Hi David,
> Thanks for sharing. I've been investigating something like this recently.
> What's the size of your data?
> Best
> -P.
>
> On Wed, Mar 24, 2021, 7:52 AM David Sánchez <[email protected]> wrote:
>
>> Hi folks!
>>
>> I'm testing the Dataflow v2 runner in a batch pipeline (Apache Beam
>> Python 3.7 SDK 2.27.0) that reads many millions of rows from BigQuery and
>> writes to PubSub and BigQuery using the flag "--experiments=use_runner_v2".
>>
>> The same job used to scale up immediately to over 50 workers, but in v2
>> it never scales up further than 5-6 workers, so it's way slower. I can
>> see, however, that the total vCPU and memory are about half of what they
>> were before, which is promising. Any clue about why the scaling is
>> behaving differently?
>>
>> Many thanks
