Thanks for that angle, Jarek. Let's say the DB lookup has higher precedence than, say, the ENV backend. Wouldn't we be shooting ourselves in the foot by compromising performance here? A DB lookup will be more expensive than an ENV lookup.
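To make the trade-off concrete, here is a minimal sketch of first-match-wins resolution over an ordered list of backends. It is not Airflow's actual implementation: `CountingBackend`, `resolve`, and `simulate` are hypothetical names used only for illustration, and the in-memory dictionaries stand in for the real ENV and metadata DB backends.

```python
from typing import Dict, List, Optional, Tuple


class CountingBackend:
    """Hypothetical in-memory backend that counts how often it is queried."""

    def __init__(self, name: str, store: Dict[str, str]) -> None:
        self.name = name
        self.store = store
        self.calls = 0

    def get(self, key: str) -> Optional[str]:
        self.calls += 1
        return self.store.get(key)


def resolve(
    key: str, backends: List[CountingBackend]
) -> Tuple[Optional[str], Optional[str]]:
    """Walk the backends in order; the first one that has the key wins."""
    for backend in backends:
        value = backend.get(key)
        if value is not None:
            return backend.name, value
    return None, None


def simulate(order: List[str]) -> Dict[str, int]:
    """Resolve two connections in the given order; report calls per backend."""
    backends = {
        # conn_a is defined in both places, conn_b only in the DB (assumed data).
        "environment_variable": CountingBackend(
            "environment_variable", {"conn_a": "env://a"}
        ),
        "metastore": CountingBackend(
            "metastore", {"conn_a": "db://a", "conn_b": "db://b"}
        ),
    }
    chain = [backends[name] for name in order]
    for key in ("conn_a", "conn_b"):
        resolve(key, chain)
    return {name: backend.calls for name, backend in backends.items()}


if __name__ == "__main__":
    print(simulate(["environment_variable", "metastore"]))
    print(simulate(["metastore", "environment_variable"]))
```

In this toy setup, the ENV-first order costs two ENV reads and one DB query for the two lookups, while the DB-first order costs two DB queries and no ENV reads. The same mechanism that adds DB load here is what would spare an external secrets manager in the scenario Jarek describes.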
There could also be a few more side effects that we will have to fully uncover, and we will need a detailed plan before allowing this to be configurable.

Thanks & Regards,
Amogh Desai

On Wed, Jul 2, 2025 at 6:43 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> I think this is a good idea - but as Ash mentioned, it has to be executed
> well with a lot of bells and whistles, so that users will not shoot
> themselves in the foot. For example, we recently had discussions on the
> new UI about whether/how to explain to users that their connections in the
> UI and API **only** show the DB connections (for good reasons) - and that is
> already difficult to explain to users. This change will also make it
> behave differently. For example, currently when you edit a connection via
> the UI it might **not** take effect if you have the same connection defined
> in the secret/env var. But if you make the DB first, this changes, and there
> are a few edge cases where it might have some unexpected effect.
>
> But there is one undeniable benefit of this approach that I like - the
> ability to turn the Airflow DB into an effective "shield" for secret usage.
> The big drawback of the current "sequence" is that Airflow generates a LOT
> of queries to the secrets manager, even if your connection is defined in the
> DB - because it will query secrets first. So currently it is not possible
> to say "for this highly frequently used connection I want to keep it in the
> DB to save on the secrets manager queries - both performance- and cost-wise" -
> because defining a connection in the DB does not limit the number of secrets
> manager queries. So in a number of scenarios, being able to reverse the order
> and query the DB first might be very good for cost and network optimisation.
>
> I think if we describe it well in the docs (as Ash wrote), explain those
> scenarios, and also clearly communicate it in the Airflow UI (we likely need
> some way of explaining to users what their currently configured sequence is
> and what they should expect to happen if they remove/add a connection) - then
> I see it as a really useful feature.
>
> J.
>
> On Wed, Jul 2, 2025 at 2:54 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > At a high level I’m good with allowing this to be fully configurable, as
> > long as we document the possible warts (“Doctor, it hurts when I do this”
> > “well don’t do that then!” etc) - though as Amogh mentioned it is slightly
> > complicated by the distinction between API Server/Scheduler and the
> > execution time on the worker.
> >
> > (I haven’t looked at the specific implementation yet)
> >
> > -ash
> >
> > > On 2 Jul 2025, at 11:56, Amogh Desai <amoghdesai....@gmail.com> wrote:
> > >
> > > Hello Anton,
> > >
> > > Thanks for kicking off this discussion. I’d love to understand your
> > > motivations a bit more on this front.
> > > From your PR, I see that you are not just allowing the addition of
> > > multiple custom backends but also changing the *default_backend* order.
> > > I am a bit torn on that part.
> > >
> > > The current design intentionally places the metadata DB backend at the
> > > lowest precedence in the order, since it’s meant to serve as the ultimate
> > > fallback source of truth. Any additional configured backends are
> > > prioritized higher than it by design.
> > >
> > > With your changes, we now allow configurations like:
> > >
> > >     @conf_vars({("secrets", "backends_order"): "metastore,environment_variable,unsupported"})
> > >     def test_backends_order_unsupported(self):
> > >         with pytest.raises(AirflowConfigException):
> > >             ensure_secrets_loaded()
> > >
> > > I don’t fully understand the motivation behind supporting this level of
> > > override, especially since it could allow unsupported or unintended
> > > configurations. Additionally, with Airflow 3.0+, we already support
> > > a multi-layered secrets backend resolution capability with the
> > > introduction of a secrets backend for workers.
> > > The order is:
> > >
> > > *secrets backend on worker directly (optional) > env vars on worker >*
> > > *reach out to api server [secrets backend defined here (optional) > env
> > > vars on api server > metadata DB].*
> > >
> > > You will have to consider this angle too.
> > >
> > > In my opinion, a more practical and realistic use case would be to have
> > > the ability to define multiple custom backends, both on the worker and
> > > on the API server.
> > >
> > > Looking forward to hearing more from you.
> > >
> > > Thanks & Regards,
> > > Amogh Desai
> > >
> > >
> > > On Wed, Jul 2, 2025 at 3:59 PM Anton Nitochkin <ant.nitoch...@gmail.com>
> > > wrote:
> > >
> > >> Hello,
> > >>
> > >> I'd like to discuss a new option that can be added via this PR:
> > >> https://github.com/apache/airflow/pull/45931.
> > >>
> > >> Recently, I asked developers in Slack for their thoughts on the new
> > >> variable [secrets]backend_order. Long story short: this option will
> > >> introduce the ability to configure the backend order and control it using
> > >> this variable. The default value will remain the same as in the current
> > >> version, so for users who don't need it, things will stay as they are now.
> > >>
> > >> Jarek Potiuk advised starting a conversation and discussing the PR to
> > >> reach a consensus with the community.
> > >>
> > >> Can you please share your thoughts on the option and its implementation?
> > >>
> > >> Anton Nitochkin
> > >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>