Re: Staging a PCollection in Beam | Dataflow Runner

2022-10-19 Thread Reuven Lax via user
PCollections are usually persistent within a pipeline, so you can reuse them in other parts of the same pipeline with no problem. There is no notion of state across pipelines; every pipeline is independent. If you want state across pipelines, you can write the PCollection out to a set of files which

Re: Staging a PCollection in Beam | Dataflow Runner

2022-10-19 Thread Israel Herraiz via user
I think that would be a Reshuffle, but only within the context of the same job (e.g. if there is a failure and a retry, the retry would start from the checkpoint created by the reshuffle). In Dataflow,