Josef is proposing to make ObjectStoragePath construction environment-agnostic by storing the provider and base path in Airflow connections, so that switching environments only requires changing the connection configuration.
This does indeed create a tighter coupling, and it makes me wonder about the deployment model. When creating ObjectStoragePath I had standard CI/CD practices in mind, where these things typically get set through environment variables. That obviously assumes a deployment-wide setting and is not selectable at runtime.

So to understand the case better, I'd like to know more about the need for runtime selection. Care to clarify?

Cheers
Bolke

On Sun, 3 Aug 2025 at 22:28, Jarek Potiuk <ja...@potiuk.com> wrote:

> Bolke (or others) - maybe you can add something here and (re) ignite the
> discussion ?
>
> On Tue, Jul 22, 2025 at 8:40 PM Josef Šimánek <josef.sima...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > I've been experimenting with `ObjectStoragePath` and recently opened a
> > [PR](https://github.com/apache/airflow/pull/52002) aiming to simplify
> > its construction using Airflow connections, especially in cases where
> > environments (e.g., dev, staging, prod) differ primarily in object
> > storage provider (e.g., S3, GCS, file) and base path.
> >
> > The goal was to construct a reusable root path from a connection like this:
> >
> > ```python
> > storage = ObjectStoragePath.from_conn(BaseHook.get_connection("storage"))
> > object = storage / "my_file.txt"
> > ```
> >
> > ...without needing to hardcode schemes like `s3://` or `gs://` and
> > base paths (usually "buckets") into the DAG code. The idea was to
> > infer provider and base path from connection `extra` fields (e.g.,
> > `provider`, `base_path`), allowing the same DAG code to work across
> > environments by simply reconfiguring the connection.
> >
> > The PR sparked a great discussion (linked above), and I realized this
> > might be a good opportunity to collect **broader community
> > experience** around the use of `ObjectStoragePath` and object storage
> > in general.
> >
> > A few questions I'd like to raise:
> >
> > * How are you configuring access to object storage across environments?
> > * Do you find it useful to extract `scheme` and `base_path` from
> >   connections (or any other configuration)?
> > * Are there existing best practices or patterns for making
> >   `ObjectStoragePath` construction generic and environment-agnostic?
> > * Would it make sense to define a common utility or convention (e.g.
> >   via extras, `get_fs`, provider's `filesystems`, or a connection
> >   helper)?
> >
> > I’m primarily looking for the best pattern, if any exists, or hoping we
> > can come together to define and document one as a community.
> >
> > Best regards,
> >
> > Josef Šimánek (https://github.com/simi)
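
For reference, the deployment-wide, environment-variable approach mentioned in the reply at the top of this message could look roughly like the sketch below. It is only an illustration under stated assumptions: the variable name `STORAGE_ROOT`, the default value, and the helper names `storage_root` and `object_url` are all hypothetical and are not part of Airflow or of the PR. The sketch uses only the standard library so it runs without Airflow installed.

```python
import os
from urllib.parse import urlsplit

# Hypothetical variable name, chosen for this sketch only.
ENV_VAR = "STORAGE_ROOT"


def storage_root(default: str = "file:///tmp/airflow-data") -> str:
    """Return the deployment-wide storage root URL without a trailing slash."""
    root = os.environ.get(ENV_VAR, default)
    # Require an explicit scheme (s3://, gs://, file://, ...) so the DAG code
    # never has to hardcode or guess the provider.
    if not urlsplit(root).scheme:
        raise ValueError(f"{ENV_VAR} must include a scheme, got {root!r}")
    return root.rstrip("/")


def object_url(name: str) -> str:
    """Join an object name onto the configured root."""
    return f"{storage_root()}/{name.lstrip('/')}"


# Assumption: with Airflow installed, the resulting URL could be handed to
# ObjectStoragePath, e.g. ObjectStoragePath(storage_root(), conn_id="storage"),
# where "storage" is a connection ID that only holds credentials.
```

With this shape the value is fixed per deployment (set by CI/CD), which is exactly the limitation raised above: it cannot be switched at runtime the way a connection lookup could.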