Josef is proposing to make ObjectStoragePath construction
environment-agnostic by storing the provider and base path
in Airflow connections, so that switching environments only requires
changing the connection configuration.

This does indeed create a tighter coupling, and it makes me wonder about
the deployment model. When creating ObjectStoragePath I had standard CI/CD
practices in mind, where these things are typically set through
environment variables. That obviously assumes a deployment-wide setting
and is not selectable at runtime. So, to understand the case better, I
would like to know more about the need for runtime selection.
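
To make the environment-variable approach concrete, here is a minimal
sketch. It is an illustration only, not something Airflow ships: the
variable name `MY_STORAGE_BASE` and both helper functions are made up
for this example, and the default value is an arbitrary local path.

```python
import os
from urllib.parse import urlsplit

# Hypothetical variable name; Airflow does not read it on its own.
ENV_VAR = "MY_STORAGE_BASE"


def storage_base(default: str = "file:///tmp/airflow-data") -> str:
    """Return the deployment-wide base URL, e.g. "s3://prod-bucket/data"."""
    base = os.environ.get(ENV_VAR, default)
    # Require an explicit scheme so the provider is never guessed.
    if not urlsplit(base).scheme:
        raise ValueError(f"{ENV_VAR} must include a scheme, got {base!r}")
    return base.rstrip("/")


def object_url(key: str) -> str:
    """Join a relative key onto the configured base URL."""
    return f"{storage_base()}/{key.lstrip('/')}"


# In a DAG the base string would then be handed to ObjectStoragePath,
# for example:
#   ObjectStoragePath(storage_base(), conn_id="storage") / "my_file.txt"
```

With this, CI/CD sets `MY_STORAGE_BASE` once per deployment and the DAG
code stays the same, but the value is fixed for the whole deployment
rather than chosen per run.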

Care to clarify?

Cheers
Bolke




On Sun, 3 Aug 2025 at 22:28, Jarek Potiuk <ja...@potiuk.com> wrote:

> Bolke (or others) - maybe you can add something here and (re) ignite the
> discussion ?
>
> On Tue, Jul 22, 2025 at 8:40 PM Josef Šimánek <josef.sima...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > I've been experimenting with `ObjectStoragePath` and recently opened a
> > [PR](https://github.com/apache/airflow/pull/52002) aiming to simplify
> > its construction using Airflow connections — especially in cases where
> > environments (e.g., dev, staging, prod) differ primarily in object
> > storage provider (e.g., S3, GCS, file) and base path.
> >
> > The goal was to construct a reusable root path from a connection like this:
> >
> > ```python
> > storage = ObjectStoragePath.from_conn(BaseHook.get_connection("storage"))
> > object = storage / "my_file.txt"
> > ```
> >
> > ...without needing to hardcode schemes like `s3://` or `gs://` and
> > base paths (usually "buckets") into the DAG code. The idea was to
> > infer provider and base path from connection `extra` fields (e.g.,
> > `provider`, `base_path`), allowing the same DAG code to work across
> > environments by simply reconfiguring the connection.
> >
> > The PR sparked a great discussion (linked above), and I realized this
> > might be a good opportunity to collect **broader community
> > experience** around the use of `ObjectStoragePath` and object storage
> > in general.
> >
> > A few questions I'd like to raise:
> >
> > * How are you configuring access to object storage across environments?
> > * Do you find it useful to extract `scheme` and `base_path` from
> > connections (or any other configuration)?
> > * Are there existing best practices or patterns for making
> > `ObjectStoragePath` construction generic and environment-agnostic?
> > * Would it make sense to define a common utility or convention (e.g.
> > via extras, `get_fs`, provider's `filesystems`, or a connection
> > helper)?
> >
> > I’m primarily looking for the best pattern—if any exists—or hoping we
> > can come together to define and document one as a community.
> >
> > Best regards,
> > Josef Šimánek (https://github.com/simi)
> >
>
