Hi!

I have a Cloud Dataflow job that is not scaling.

The job sequence is the following (a rough Python sketch after the list):
1 - [IO] Read from a file in the bucket (1 element out)
2 - [ParDo] With the file information, run a query against a database (10,000 elements out)
3 - [ParDo] Process the elements
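
Roughly, the pipeline looks like this (a sketch in the Python SDK; the bucket path and run_query are placeholders, not my actual code):

    import apache_beam as beam

    def run_query(file_info):
        # Placeholder for the real database call; yields ~10,000 rows.
        return ({'id': i} for i in range(10000))

    class QueryDatabase(beam.DoFn):
        def process(self, file_info):
            # One input element in, ~10,000 elements out.
            for row in run_query(file_info):
                yield row

    with beam.Pipeline() as p:
        (p
         | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input-file')  # 1 element out
         | 'Query' >> beam.ParDo(QueryDatabase())                       # 10,000 elements out
         | 'Work' >> beam.Map(lambda element: element))                 # stand-in for the real processing

This version stays on very few workers even though the second step fans out to 10,000 elements.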

But when I read from a file that already contains the same database query results, it scales to 60+ workers (sketch after the list):
1 - [IO] Read from a file in the bucket (10,000 elements out)
2 - [ParDo] Process the elements
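
That variant is essentially (again with a made-up path, and a stand-in for the real processing):

    import apache_beam as beam

    with beam.Pipeline() as p:
        (p
         | 'Read' >> beam.io.ReadFromText('gs://my-bucket/query-results')  # ~10,000 elements out
         | 'Work' >> beam.Map(lambda element: element))                    # stand-in for the real processing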

Do I have to develop a custom I/O connector so that Apache Beam knows how many elements it's dealing with?

Best regards
André Rocha Silva
