> This seems like
> organisation-wide policy that simply all DAG authors in the organization
> should be made aware of

This is one more thing that the admin expects users to remember. We
should reduce that list, not grow it.
From my point of view this setting adds a blind spot. I am not happy with
this.
I have similar feelings towards cluster policies: yet another blind spot
that DAG authors should be aware of, but with no actual tools provided to
see the override on their side.

I initially shared my thoughts on 31 March in
https://github.com/apache/airflow/pull/45931#discussion_r2021018760
So far I haven't seen any comments that explain why we can't implement such
a mechanism. Is it technically complicated? Is it high effort? Or is the
assumption that it provides little value?
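
As a rough illustration of what such a mechanism could look like, here is a
minimal sketch. It is not an Airflow API: `Backend`, `resolve_with_trace`, and
the sample connection values are hypothetical names invented for this example,
and plain dictionaries stand in for real secrets backends. It walks the
backends in the configured order and reports which one wins for a connection
id and which later ones are shadowed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Backend:
    """Hypothetical stand-in for a secrets backend (not an Airflow class)."""

    name: str
    lookup: Callable[[str], Optional[str]]  # returns None when the id is unknown


def resolve_with_trace(
    conn_id: str, backends: List[Backend]
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Return (value, winning backend name, names of shadowed backends)."""
    value: Optional[str] = None
    winner: Optional[str] = None
    shadowed: List[str] = []
    for backend in backends:
        found = backend.lookup(conn_id)
        if found is None:
            continue
        if winner is None:
            # First backend in the configured order that knows the id wins.
            value, winner = found, backend.name
        else:
            # Later backends also define the id, but their value is never used.
            shadowed.append(backend.name)
    return value, winner, shadowed


# Assumed sample data: the same connection id defined in two places.
env = Backend("environment_variable", {"my_db": "postgres://env-host/db"}.get)
metastore = Backend(
    "metastore", {"my_db": "postgres://db-host/db", "other": "http://x"}.get
)

order = [env, metastore]  # the order an admin might have configured
value, winner, shadowed = resolve_with_trace("my_db", order)
print(f"my_db resolved by {winner}; also defined (shadowed) in: {shadowed}")
```

Note that a diagnostic like this deliberately queries every backend, which a
normal lookup would not do, since it stops at the first hit.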


On Sun, Jul 6, 2025 at 3:12 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> > I am missing the part of how can DAG Author be aware of the backend order
> > the cluster admin chooses?
> > This is a crucial part
>
> I am not sure there is a special need for it. This seems like an
> organisation-wide policy that all DAG authors in the organization simply
> should be made aware of; it has zero impact on how DAGs are written.
> If it were different for different DAGs you would surely need to
> communicate this, but I am not sure any other indication is needed. It's
> largely transparent for `DAG authors` if you ask me: they want a
> connection by id and the "organizational policy" decides how this happens.
>
> J.
>
>
> On Sun, Jul 6, 2025 at 2:06 PM Elad Kalif <elad...@apache.org> wrote:
>
> > I am missing the part of how can DAG Author be aware of the backend order
> > the cluster admin chooses?
> > This is a crucial part.
> >
> > On Thu, Jul 3, 2025 at 12:14 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > Sorry for typos - that was my mobile auto complete... I hope it is
> > > understandable anyway
> > >
> > > On Thu, Jul 3, 2025 at 11:13 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > >
> > > >
> > > >
> > > > On Thu, Jul 3, 2025 at 10:14 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > >
> > > >> Thanks for that angle, Jarek.
> > > >>
> > > > >> Let's say DB lookup has higher precedence than, say, the ENV backend.
> > > > >> Wouldn't this be shooting ourselves in the foot by compromising
> > > > >> performance here? DB lookup will be more expensive than ENV.
> > > >>
> > > >>
> > > > Oh absolutely. I think if we have this possibility of managing order,
> > > > those kinds of scenarios should also be explained in the docs so that
> > > > users do not shoot themselves in the foot.
> > > >
> > > > Also, following my mail about multi-team: I started to think recently,
> > > > looking at some other OSS software, that we sometimes take too much
> > > > responsibility for our users and then suffer because we have to defend
> > > > our opinionated choices when there are use cases that our choices do
> > > > not enable.
> > > >
> > > > This is the reason why we have so many 'options' and config values:
> > > > sometimes we do not want to make decisions for our users. Where we can
> > > > make something an option and configuration and clearly explain it to
> > > > our users (mostly I am talking about the Deployment Manager role from
> > > > our security model), it's their responsibility to read all the
> > > > information we provide and follow it when they make decisions on how to
> > > > configure Airflow, knowing the consequences. And we should be 'harsh'
> > > > with them, in the sense that if they did not read the docs and did not
> > > > understand them, any time they ask us about something not working that
> > > > is explained in the docs, we should send them to the docs with 'Read
> > > > The Friendly Manual' advice, simply because this is the only job they
> > > > have. And we should not do the job for them.
> > > >
> > > > Similarly, having options like that allows our managed service
> > > > providers to make their opinionated choices: make some configuration
> > > > options possible, and pre-select some for their users in the context of
> > > > the managed service. But again, it's their responsibility to manage and
> > > > understand what the options are and what they mean. Same for individual
> > > > deployment managers: they can make their own decisions, and if it does
> > > > not cost us a lot we should make it possible for them to make those
> > > > choices (and take responsibility for their choices).
> > > >
> > > > With great powers (of choice) come great responsibilities (for the
> > > > consequences of your choices), and as long as we are aware of those
> > > > consequences and communicate them to deployment managers, it's on their
> > > > shoulders to make the choices and bear the consequences.
> > > >
> > > > J.
> > > >
> > > >
> > > >
> > > > >> There could also be a few more side effects that we will have to
> > > > >> fully uncover and come up with a detailed plan to allow this to be
> > > > >> configurable.
> > > >>
> > > >> Thanks & Regards,
> > > >> Amogh Desai
> > > >>
> > > >>
> > > > >> On Wed, Jul 2, 2025 at 6:43 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >>
> > > > >> > I think this is a good idea, but as Ash mentioned, it has to be
> > > > >> > executed well with a lot of bells and whistles, so that users will
> > > > >> > not shoot themselves in the foot. For example, we recently had
> > > > >> > discussions on the new UI about whether/how to explain to users
> > > > >> > that the connections in the UI and API **only** show the DB
> > > > >> > connections (for good reasons), and that is already difficult to
> > > > >> > explain. This change will also make it behave differently. For
> > > > >> > example, currently when you edit a connection via the UI it might
> > > > >> > **not** take effect if you have the same connection defined in a
> > > > >> > secret/env var. But if you make the DB first, this changes, and
> > > > >> > there are a few edge cases where it might have some unexpected
> > > > >> > effect.
> > > >> >
> > > > >> > But there is one undeniable benefit of this approach that I like:
> > > > >> > the ability to turn the Airflow DB into an effective "shield" for
> > > > >> > secret usage. The big drawback of the current "sequence" is that
> > > > >> > Airflow generates a LOT of queries to the secrets manager, even if
> > > > >> > your connection is defined in the DB, because it will query
> > > > >> > secrets first. So currently it is not possible to say "for this
> > > > >> > highly frequently used connection I want to keep it in the DB to
> > > > >> > save on secrets manager queries, both performance- and cost-wise",
> > > > >> > because defining the connection in the DB does not limit the
> > > > >> > number of secrets manager queries. So in a number of scenarios,
> > > > >> > being able to reverse the order and query the DB first might be
> > > > >> > very good for cost and network optimisation.
> > > >> >
> > > > >> > I think if we describe it well in the docs (as Ash wrote), explain
> > > > >> > those scenarios, and also clearly communicate it in the UI of
> > > > >> > Airflow (we likely need some way of explaining to users what their
> > > > >> > currently configured sequence is and what they should expect to
> > > > >> > happen if they remove/add a connection), then I see it as a really
> > > > >> > useful feature.
> > > >> >
> > > >> > J.
> > > >> >
> > > > >> > On Wed, Jul 2, 2025 at 2:54 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> > > >> >
> > > > >> > > At a high level I’m good with allowing this to be fully
> > > > >> > > configurable, as long as we document the possible warts
> > > > >> > > (“Doctor, it hurts when I do this” “well don’t do that then!”
> > > > >> > > etc), though as Amogh mentioned it is slightly complicated by
> > > > >> > > the distinction between API Server/Scheduler and the execution
> > > > >> > > time on the worker.
> > > >> > >
> > > >> > > (I haven’t looked at the specific implementation yet)
> > > >> > >
> > > >> > > -ash
> > > >> > >
> > > > >> > > > On 2 Jul 2025, at 11:56, Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > >> > > >
> > > >> > > > Hello Anton,
> > > >> > > >
> > > >> > > > Thanks for kicking off this discussion. I’d love to understand
> > > your
> > > >> > > > motivations a bit more on this front.
> > > > >> > > > From your PR, I see that you are not just allowing the
> > > > >> > > > addition of multiple custom backends but also changing the
> > > > >> > > > *default_backend* order. I am a bit torn on that part.
> > > >> > > >
> > > > >> > > > The current design intentionally places the metadata DB
> > > > >> > > > backend at the lowest precedence in the order, since it’s
> > > > >> > > > meant to serve as the ultimate fallback source of truth. Any
> > > > >> > > > additional configured backends are prioritized higher than it
> > > > >> > > > by design.
> > > >> > > >
> > > >> > > > With your changes, we now allow configurations like:
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > > >> > > >     @conf_vars({("secrets", "backends_order"): "metastore,environment_variable,unsupported"})
> > > > >> > > >     def test_backends_order_unsupported(self):
> > > > >> > > >         with pytest.raises(AirflowConfigException):
> > > > >> > > >             ensure_secrets_loaded()
> > > >> > > >
> > > > >> > > > I don’t fully understand the motivation behind supporting this
> > > > >> > > > level of override, especially since it could allow unsupported
> > > > >> > > > or unintended configurations. Additionally, with Airflow 3.0+,
> > > > >> > > > we already support a multi-layered secret backend resolution
> > > > >> > > > capability with the introduction of secrets backend for
> > > > >> > > > workers.
> > > > >> > > > The order goes as:
> > > >> > > >
> > > > >> > > > secrets backend on worker directly (optional) > env vars on
> > > > >> > > > worker > reach out to api server [secrets backend defined here
> > > > >> > > > (optional) > env vars on api server > metadata DB].
> > > >> > > >
> > > >> > > > You will have to consider this angle too.
> > > >> > > >
> > > > >> > > > In my opinion, a more practical and realistic use case would
> > > > >> > > > be the ability to define multiple custom backends on both the
> > > > >> > > > worker and the API server.
> > > >> > > >
> > > >> > > > Looking forward to hearing more from you.
> > > >> > > >
> > > >> > > > Thanks & Regards,
> > > >> > > > Amogh Desai
> > > >> > > >
> > > >> > > >
> > > > >> > > > On Wed, Jul 2, 2025 at 3:59 PM Anton Nitochkin <ant.nitoch...@gmail.com> wrote:
> > > >> > > >
> > > >> > > >> Hello,
> > > >> > > >>
> > > > >> > > >> I'd like to discuss a new option that can be added via this PR:
> > > > >> > > >> https://github.com/apache/airflow/pull/45931.
> > > > >> > > >>
> > > > >> > > >> Recently, I asked developers in Slack for their thoughts on
> > > > >> > > >> the new variable [secrets]backend_order. Long story short:
> > > > >> > > >> this option will introduce the ability to configure the
> > > > >> > > >> backend order and control it using this variable. The default
> > > > >> > > >> value will remain the same as in the current version, so for
> > > > >> > > >> users who don't need it, things will stay as they are now.
> > > > >> > > >>
> > > > >> > > >> Jarek Potiuk advised starting a conversation and discussing
> > > > >> > > >> the PR to reach a consensus with the community.
> > > > >> > > >>
> > > > >> > > >> Can you please share your thoughts on the option and its
> > > > >> > > >> implementation?
> > > >> > > >>
> > > >> > > >> Anton Nitochkin
> > > >> > > >>
> > > >> > >
> > > >> > >
> > > >> > >
> > > ---------------------------------------------------------------------
> > > >> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > >> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>
