If you do a GroupByKey followed by a fan-out right before your write steps, 
you'll prevent the write steps from starting until all the data has been 
grouped.
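
For what it's worth, here is a rough sketch of what that could look like on a single branch, using the Beam Java SDK. The class name, the barrier() helper, the 64-bucket key and the /tmp paths are all made up for illustration and are not from your pipeline. Treat it as a starting point rather than a tested recipe.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.transforms.Values;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

// Hypothetical class name, for illustration only.
public class BarrierSketch {

  // GroupByKey followed by a fan-out, meant to sit right before a write step.
  static PCollection<String> barrier(PCollection<String> input) {
    return input
        // Assumption: any key works here; 64 buckets is an arbitrary choice.
        .apply("AddKey",
            WithKeys.of((String line) -> Math.floorMod(line.hashCode(), 64))
                .withKeyType(TypeDescriptors.integers()))
        // The grouping step that has to finish before downstream steps run.
        .apply("Group", GroupByKey.<Integer, String>create())
        // Drop the keys, keeping one Iterable<String> per key.
        .apply("DropKey", Values.<Iterable<String>>create())
        // Fan out: turn each Iterable<String> back into individual lines.
        .apply("FanOut", Flatten.<String>iterables());
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Placeholder paths, not taken from the original pipeline.
    PCollection<String> lines =
        p.apply("Read", TextIO.read().from("/tmp/input-*.txt"));

    barrier(lines).apply("Write", TextIO.write().to("/tmp/output"));

    p.run().waitUntilFinish();
  }
}
```

Note that in this form each write is only gated on the grouping directly upstream of it, so you would need to apply it in front of each of the 4 writes.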

I'd recommend reading up about fusion: 
https://cloud.google.com/dataflow/service/dataflow-service-desc#preventing-fusion

> On Sep 5, 2017, at 11:21, Jacob Marble <[email protected]> wrote:
> 
> Good morning-
> 
> Given a batch pipeline with 3 file inputs and 4 file outputs, is there a way 
> to prevent the 4 TextIO.write() steps from starting until all of the 
> TextIO.write() steps are ready?
> 
> The idea here is to fail on any exceptions before persisting any output data, 
> making cleanup easier.
> 
> Thanks!
> 
> Jacob
