Sorry for typos - that was my mobile auto complete... I hope it is understandable anyway
On Thu, 3 Jul 2025 at 11:13, Jarek Potiuk <ja...@potiuk.com> wrote:
>
> On Thu, 3 Jul 2025 at 10:14, Amogh Desai <amoghdesai....@gmail.com> wrote:
>
>> Thanks for that angle, Jarek.
>>
>> Let's say the DB lookup has higher precedence than, say, the ENV backend.
>> Wouldn't this be shooting ourselves in the foot by compromising the
>> performance here? A DB lookup will be more expensive than an ENV lookup.
>>
>
> Oh absolutely. I think if we have this possibility of managing the order,
> those kinds of scenarios should also be explained in the docs so that
> users do not shoot themselves in the foot.
>
> Also, following my mail about multi team: I started to think recently -
> looking at some other OSS software - that we sometimes take too much
> responsibility for our users and then suffer because we have to defend our
> opinionated choices when there are use cases that our choices do not
> enable.
>
> This is the reason why we have so many 'options' and config values:
> sometimes we do not want to make decisions for our users - but where we
> can make it an option and configuration and clearly explain it to our
> users (and mostly I am talking about the Deployment Manager role from our
> security model) - it's their responsibility to read all the information we
> provide and follow it when they make decisions on how to configure Airflow
> - knowing the consequences. And we should be 'harsh' with them - in the
> sense that if they did not read the docs and did not understand them - any
> time they ask us about something not working that is explained in the docs
> - we should send them to the docs with 'Read The Friendly Manual' advice -
> simply because this is the only job they have. And we should not do the
> job for them.
>
> Similarly, having options like that allows our managed service providers
> to make their opinionated choices and make some configuration options
> possible, some selected for their users in the context of the service
> managed.
> But again - it's their responsibility to manage and understand what the
> options are and what they mean. Same as individual deployment managers -
> they can make their own decisions - and if it does not cost us a lot we
> should make it possible for them to make those choices (and take
> responsibility for their choices).
>
> With great powers (of choice) you also have great responsibilities (of
> consequences of your choices) - and as long as we are aware of those
> consequences and communicate them to deployment managers - it's on their
> shoulders to make the choices and bear the consequences.
>
> J.
>
>> There could also be a few more side effects that we will have to fully
>> uncover, and we will have to come up with a detailed plan to allow this
>> to be configurable.
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>>
>> On Wed, Jul 2, 2025 at 6:43 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > I think this is a good idea - but as Ash mentioned, it has to be
>> > executed well with a lot of bells and whistles, so that users will not
>> > shoot themselves in the foot. For example, we recently had discussions
>> > on the new UI about whether/how to explain to users that their
>> > connections in the UI and API **only** show the DB connections (for
>> > good reasons) - and it is already difficult to explain to users. Now
>> > this change will also make it behave differently (for example -
>> > currently when you edit a connection via the UI it might **not** take
>> > effect if you have the same connection defined in the secret/env var).
>> > But if you make the DB first - this changes, and there are a few
>> > edge-cases where it might have some unexpected effect.
>> >
>> > But there is one undeniable benefit of this approach that I like - the
>> > ability of turning the Airflow DB into an effective "shield" for secret
>> > usage.
>> > The big drawback of the current "sequence" is that Airflow generates a
>> > LOT of queries to the secrets manager, even if your connection is
>> > defined in the DB - because it will query secrets first. So currently
>> > it is not possible to say "for this highly frequently used connection
>> > I want to keep it in the DB to save on secrets manager queries - both
>> > performance and cost wise" - because defining a connection in the DB
>> > does not limit the number of secrets manager queries. So in a number
>> > of scenarios, being able to reverse it and query the DB first might be
>> > very good for cost and network optimisation.
>> >
>> > I think if we describe it well in the docs (as Ash wrote), explain
>> > those scenarios, and also clearly communicate it in the Airflow UI (we
>> > likely need some way of explaining to the user what their currently
>> > configured sequence is and what they should expect to happen if they
>> > remove/add a connection) - then I see it as a really useful feature.
>> >
>> > J.
>> >
>> > On Wed, Jul 2, 2025 at 2:54 PM Ash Berlin-Taylor <a...@apache.org>
>> > wrote:
>> >
>> > > At a high level I’m good with allowing this to be fully configurable,
>> > > as long as we document the possible warts (“Doctor, it hurts when I
>> > > do this” “well don’t do that then!” etc) - though as Amogh mentioned
>> > > it is slightly complicated by the distinction between API
>> > > Server/Scheduler and the execution time on the worker.
>> > >
>> > > (I haven’t looked at the specific implementation yet)
>> > >
>> > > -ash
>> > >
>> > > > On 2 Jul 2025, at 11:56, Amogh Desai <amoghdesai....@gmail.com>
>> > > > wrote:
>> > > >
>> > > > Hello Anton,
>> > > >
>> > > > Thanks for kicking off this discussion. I’d love to understand your
>> > > > motivations a bit more on this front.
>> > > > From your PR, I am seeing that you are not just allowing the
>> > > > addition of multiple custom backends but also changing the
>> > > > *default_backend* order. I am a bit torn on that part.
>> > > >
>> > > > The current design intentionally places the metadata DB backend at
>> > > > the lowest precedence in the order, since it’s meant to serve as
>> > > > the ultimate fallback source of truth. Any additional configured
>> > > > backends are prioritized higher than it by design.
>> > > >
>> > > > With your changes, we now allow configurations like:
>> > > >
>> > > >     @conf_vars({("secrets", "backends_order"): "metastore,environment_variable,unsupported"})
>> > > >     def test_backends_order_unsupported(self):
>> > > >         with pytest.raises(AirflowConfigException):
>> > > >             ensure_secrets_loaded()
>> > > >
>> > > > I don’t fully understand the motivation behind supporting this
>> > > > level of override, especially since it could allow unsupported or
>> > > > unintended configurations. Additionally, with Airflow 3.0+, we
>> > > > already support a multi-layered secrets backend resolution
>> > > > capability with the introduction of the secrets backend for
>> > > > workers. The order goes as:
>> > > >
>> > > > *secrets backend on worker directly (optional) > env vars on worker >*
>> > > > *reach out to api server [secrets backend defined here (optional) >
>> > > > env vars on api server > metadata DB].*
>> > > >
>> > > > You will have to consider this angle too.
>> > > >
>> > > > In my opinion, a more practical and realistic use case would be to
>> > > > have the ability to define multiple custom backends on both the
>> > > > worker and the API server.
>> > > >
>> > > > Looking forward to hearing more from you.
>> > > >
>> > > > Thanks & Regards,
>> > > > Amogh Desai
>> > > >
>> > > >
>> > > > On Wed, Jul 2, 2025 at 3:59 PM Anton Nitochkin
>> > > > <ant.nitoch...@gmail.com> wrote:
>> > > >
>> > > >> Hello,
>> > > >>
>> > > >> I'd like to discuss a new option that can be added via this PR:
>> > > >> https://github.com/apache/airflow/pull/45931.
>> > > >>
>> > > >> Recently, I asked developers in Slack for their thoughts on the
>> > > >> new variable [secrets]backend_order. Long story short: this option
>> > > >> will introduce the ability to configure the backend order and
>> > > >> control it using this variable. The default value will remain the
>> > > >> same as in the current version, so for users who don't need it,
>> > > >> things will stay as they are now.
>> > > >>
>> > > >> Jarek Potiuk advised starting a conversation and discussing the PR
>> > > >> to reach a consensus with the community.
>> > > >>
>> > > >> Can you please share your thoughts on the option and its
>> > > >> implementation?
>> > > >>
>> > > >> Anton Nitochkin
>> > > >>
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> > > For additional commands, e-mail: dev-h...@airflow.apache.org
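
To make the lookup-order trade-off discussed in this thread a bit more concrete, here is a minimal Python sketch. It is not Airflow's implementation and does not use any Airflow API: `CountingBackend`, `resolve`, `demo`, the `secrets_manager` backend name and the `hot_conn` connection are all hypothetical, made up for illustration (only the names `metastore` and `environment_variable` appear in the test quoted above). It simply counts how many lookups each backend receives when the same DB-defined connection is resolved repeatedly under two different orders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CountingBackend:
    """Hypothetical stand-in for a secrets backend that counts its lookups."""

    name: str
    connections: Dict[str, str] = field(default_factory=dict)
    lookups: int = 0

    def get_conn_value(self, conn_id: str) -> Optional[str]:
        self.lookups += 1
        return self.connections.get(conn_id)


def resolve(conn_id: str, chain: List[CountingBackend]) -> Optional[str]:
    """Return the value from the first backend in the chain that has conn_id."""
    for backend in chain:
        value = backend.get_conn_value(conn_id)
        if value is not None:
            return value
    return None


def demo(order: List[str], calls: int = 100) -> Dict[str, int]:
    """Resolve one DB-defined connection `calls` times; return lookups per backend."""
    backends = {
        # Assumption: the external secrets manager and env vars do not define hot_conn.
        "secrets_manager": CountingBackend("secrets_manager"),
        "environment_variable": CountingBackend("environment_variable"),
        # The frequently used connection lives only in the metadata DB.
        "metastore": CountingBackend("metastore", {"hot_conn": "postgres://db-defined"}),
    }
    chain = [backends[name] for name in order]
    for _ in range(calls):
        resolve("hot_conn", chain)
    return {name: backend.lookups for name, backend in backends.items()}


if __name__ == "__main__":
    # Current-style order: external backends first, metadata DB as the fallback.
    print(demo(["secrets_manager", "environment_variable", "metastore"]))
    # DB-first order: the DB "shields" the secrets manager for this connection.
    print(demo(["metastore", "environment_variable", "secrets_manager"]))
```

In this toy model, the current-style order sends every lookup of the DB-defined connection through the secrets manager first, while the DB-first order never reaches the secrets manager for that connection. The flip side is the concern Amogh raised: with the DB first, every lookup pays for a DB query before any cheaper backend is consulted, and a DB entry now shadows a same-named entry in the other backends.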