Hi Stephan,

I am not sure this is the best way to achieve it, but I have seen parallelism limited by using state on keyed (KV) elements and restricting the number of keys. In your case, you could give both non-concurrency-safe operations the same key; with a stateful DoFn, the Beam model guarantees that elements sharing a key (and window) are not processed concurrently.

This blog post may be helpful: https://beam.apache.org/blog/stateful-processing/

On Mon, Jun 12, 2023 at 2:21 PM Stephan Hoyer via dev <dev@beam.apache.org> wrote:

> Can the Beam data model (specifically the Python SDK) support executing
> functions that are idempotent but not concurrency-safe?
>
> I am thinking of a task like setting up a database (or in my case, a Zarr
> <https://zarr.dev/> store in Xarray-Beam
> <https://github.com/google/xarray-beam>) where it is not safe to run
> setup concurrently, but if the whole operation fails it is safe to retry.
>
> I recognize that a better model would be to use entirely atomic
> operations, but sometimes this can be challenging to guarantee for tools
> that were not designed with parallel computing in mind.
>
> Cheers,
> Stephan