Hi, You may want to use more than one element in your Create to start the FlatMap process as with a runner that does Fusion <https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline#fusion-optimization>, the code will end up only being able to parallelize to 1. So make use of a Create with say O(10's) elements and have each one of those then do a partition of the for loop work.
Cheers Reza On Wed, 4 Nov 2020 at 08:57, André Rocha Silva < [email protected]> wrote: > Fellow users > > I am not very used to making streaming pipelines, but I have a batch to > write to pub/sub. > > My pipeline starts with a 'fake' element only to trigger the next step. > Then in a FlatMap I use a For that yields many elements inside a for. But > in the last step I've got only 100 elements coming in. > Should I work with windowing or something like that? > my_pipeline = ( > p > | 'Creating pipeline' >> beam.Create(['1']) > | 'Get things' >> beam.FlatMap(GetThings) > | 'Post on Pub/Sub' >> beam.io.WriteToPubSub(topic > =user_options.topic.get()) > ) > > I am working on python. Apache beam 2.17, Python 3.7 > > Thank you for helping me! > > -- > > *ANDRÉ ROCHA SILVA* > * DATA ENGINEER* > (48) 3181-0611 > > <https://www.linkedin.com/in/andre-rocha-silva/> /andre-rocha-silva/ > <http://portaltelemedicina.com.br/> > <https://www.youtube.com/channel/UC0KH36-OXHFIKjlRY2GyAtQ> > <https://pt-br.facebook.com/PortalTelemedicina/> > <https://www.linkedin.com/company/9426084/> > >
