Sorry for typos - that was my mobile auto complete... I hope it is
understandable anyway

On Thu, 3 Jul 2025 at 11:13, Jarek Potiuk <ja...@potiuk.com> wrote:

>
>
>
> On Thu, 3 Jul 2025 at 10:14, Amogh Desai <amoghdesai....@gmail.com>
> wrote:
>
>> Thanks for that angle, Jarek.
>>
>> Let's say the DB lookup has higher precedence than, say, the ENV backend.
>> Wouldn't this be shooting ourselves in the foot by compromising
>> performance here? A DB lookup will be more expensive than an ENV lookup.
>>
>>
> Oh, absolutely. I think if we make the order configurable, those kinds of
> scenarios should also be explained in the docs so that users do not shoot
> themselves in the foot.
>
> Also, following my mail about multi-team: I started to think recently,
> looking at some other OSS software, that we sometimes take too much
> responsibility for our users and then suffer because we have to defend our
> opinionated choices when there are use cases that our choices do not
> enable.
>
> This is the reason why we have so many 'options' and config values:
> sometimes we do not want to make decisions for our users, but where we can,
> we make it an option and configuration and clearly explain it to our users
> (mostly I am talking about the Deployment Manager role from our security
> model). It's their responsibility to read all the information we provide
> and follow it when they make decisions on how to configure Airflow, knowing
> the consequences. And we should be 'harsh' with them, in the sense that if
> they did not read the docs and did not understand them, any time they ask
> us about something not working that is explained in the docs, we should
> send them to the docs with 'Read The Friendly Manual' advice, simply
> because this is the only job they have. And we should not do the job for
> them.
>
> Similarly, having options like that allows our managed service providers
> to make their opinionated choices: to make some configuration options
> available and to preselect others for their users in the context of the
> managed service. But again, it's their responsibility to manage and
> understand what the options are and what they mean. Same as individual
> deployment managers: they can make their own decisions, and if it does not
> cost us a lot we should make it possible for them to make those choices
> (and take responsibility for their choices).
>
> With great power (of choice) comes great responsibility (for the
> consequences of your choices), and as long as we are aware of those
> consequences and communicate them to deployment managers, it's on their
> shoulders to make the choices and bear the consequences.
>
> J.
>
>
>
>> There could also be a few more side effects that we will have to fully
>> uncover, and we will need to come up with a detailed plan to allow this
>> to be configurable.
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>>
>> On Wed, Jul 2, 2025 at 6:43 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > I think this is a good idea, but as Ash mentioned, it has to be executed
>> > well, with a lot of bells and whistles, so that users will not shoot
>> > themselves in the foot. For example, we recently had discussions on the
>> > new UI about whether and how to explain to users that their connections
>> > in the UI and API **only** show the DB connections (for good reasons),
>> > and it is already difficult to explain to users. Now this change will
>> > also make it behave differently. For example, currently when you edit a
>> > connection via the UI it might **not** take effect if you have the same
>> > connection defined in the secrets backend or an env var. But if you make
>> > the DB first, this changes, and there are a few edge cases where it might
>> > have some unexpected effect.
>> >
>> > But there is one undeniable benefit of this approach that I like: the
>> > ability to turn the Airflow DB into an effective "shield" for secrets
>> > usage. The big drawback of the current "sequence" is that Airflow
>> > generates a LOT of queries to the secrets manager, even if your
>> > connection is defined in the DB, because it will query secrets first. So
>> > currently it is not possible to say "for this highly frequently used
>> > connection, I want to keep it in the DB to save on the secrets manager
>> > queries, both performance- and cost-wise", because defining a connection
>> > in the DB does not limit the number of secrets manager queries. So in a
>> > number of scenarios, being able to reverse it and query the DB first
>> > might be very good for cost and network optimisation.
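
To make the cost argument above concrete, here is a minimal sketch in plain Python. It is not Airflow's implementation: `CountingBackend`, `resolve` and `count_secrets_manager_calls` are hypothetical names, and both backends are in-memory dicts. It only counts how often the secrets manager is queried for a connection that is defined in the DB, under each lookup order.

```python
from typing import Dict, List, Optional


class CountingBackend:
    """Hypothetical stand-in for a secrets backend (not Airflow's API)."""

    def __init__(self, name: str, store: Dict[str, str]) -> None:
        self.name = name
        self.store = dict(store)
        self.calls = 0  # number of lookups this backend has served

    def get(self, conn_id: str) -> Optional[str]:
        self.calls += 1
        return self.store.get(conn_id)


def resolve(conn_id: str, backends: List[CountingBackend]) -> Optional[str]:
    """Query backends in order and return the first value found."""
    for backend in backends:
        value = backend.get(conn_id)
        if value is not None:
            return value
    return None


def count_secrets_manager_calls(order: List[str], lookups: int = 100) -> int:
    """Resolve a DB-defined connection `lookups` times in the given order."""
    secrets_manager = CountingBackend("secrets_manager", {})
    metastore = CountingBackend("metastore", {"hot_conn": "postgres://db-defined"})
    by_name = {b.name: b for b in (secrets_manager, metastore)}
    chain = [by_name[name] for name in order]
    for _ in range(lookups):
        resolve("hot_conn", chain)
    return secrets_manager.calls


secrets_first = count_secrets_manager_calls(["secrets_manager", "metastore"])
db_first = count_secrets_manager_calls(["metastore", "secrets_manager"])
```

With the secrets manager first, every lookup hits it before falling through to the DB; with the DB first, it is never queried for this connection.
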
>> >
>> > I think if we describe it well in the docs (as Ash wrote), explain those
>> > scenarios, and also clearly communicate it in the Airflow UI (we likely
>> > need some way of explaining to users what their currently configured
>> > sequence is and what they should expect to happen if they remove or add
>> > a connection), then I see it as a really useful feature.
>> >
>> > J.
>> >
>> > On Wed, Jul 2, 2025 at 2:54 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>> >
>> > > At a high level I’m good with allowing this to be fully configurable,
>> > > as long as we document the possible warts (“Doctor, it hurts when I
>> > > do this” “well don’t do that then!” etc), though as Amogh mentioned
>> > > it is slightly complicated by the distinction between API
>> > > Server/Scheduler and the execution time on the worker.
>> > >
>> > > (I haven’t looked at the specific implementation yet)
>> > >
>> > > -ash
>> > >
>> > > > On 2 Jul 2025, at 11:56, Amogh Desai <amoghdesai....@gmail.com> wrote:
>> > > >
>> > > > Hello Anton,
>> > > >
>> > > > Thanks for kicking off this discussion. I’d love to understand your
>> > > > motivations a bit more on this front.
>> > > > From your PR, I see that you are not just allowing the addition of
>> > > > multiple custom backends but also changing the *default_backend*
>> > > > order. I am a bit torn on that part.
>> > > >
>> > > > The current design intentionally places the metadata DB backend at
>> > > > the lowest precedence in the order, since it’s meant to serve as the
>> > > > ultimate fallback source of truth. Any additional configured backends
>> > > > are prioritized higher than it by design.
>> > > >
>> > > > With your changes, we now allow configurations like:
>> > > >
>> > > >
>> > > >
>> > > >     @conf_vars({("secrets", "backends_order"): "metastore,environment_variable,unsupported"})
>> > > >     def test_backends_order_unsupported(self):
>> > > >         with pytest.raises(AirflowConfigException):
>> > > >             ensure_secrets_loaded()
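
A rough sketch of what validating such a value could look like, assuming a comma-separated option. `SUPPORTED_BACKENDS` and `parse_backends_order` are hypothetical and not taken from the PR; the `custom` entry in particular is an assumption. The PR's test expects `AirflowConfigException`, while this sketch raises a plain `ValueError` to stay self-contained.

```python
from typing import List

# Assumption: the set of backend names the option would accept.
# "metastore" and "environment_variable" appear in the quoted test;
# "custom" is a placeholder for any user-configured backend.
SUPPORTED_BACKENDS = frozenset({"custom", "environment_variable", "metastore"})


def parse_backends_order(raw: str) -> List[str]:
    """Split a comma-separated order string and reject bad entries."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    if not names:
        raise ValueError("backend order must not be empty")
    unknown = [name for name in names if name not in SUPPORTED_BACKENDS]
    if unknown:
        raise ValueError(f"unsupported backend(s) in order: {unknown}")
    if len(set(names)) != len(names):
        raise ValueError(f"duplicate backend(s) in order: {names}")
    return names


order = parse_backends_order("metastore, environment_variable")
```

Failing at load time keeps a typo in the option from silently dropping a backend out of the lookup chain.
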
>> > > >
>> > > > I don’t fully understand the motivation behind supporting this level
>> > > > of override, especially since it could allow unsupported or
>> > > > unintended configurations. Additionally, with Airflow 3.0+, we
>> > > > already support a multi-layered secrets backend resolution capability
>> > > > with the introduction of secrets backends for workers.
>> > > > The order goes as:
>> > > >
>> > > > *secrets backend on worker directly (optional) > env vars on worker >*
>> > > > *reach out to api server [secrets backend defined here (optional) >
>> > > > env vars on api server > metadata DB].*
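
The two-tier order above could be sketched like this, assuming each layer is a simple key-value store. All names here (`first_hit`, `resolve_from_worker`, the layer labels) are hypothetical, and in a real deployment the worker reaches the API server over the network rather than through a function call.

```python
from typing import List, Mapping, Optional, Tuple

# A layer is a (label, key-value store) pair; labels are illustrative only.
Layer = Tuple[str, Mapping[str, str]]


def first_hit(key: str, layers: List[Layer]) -> Optional[Tuple[str, str]]:
    """Return (layer label, value) for the first layer that has the key."""
    for name, store in layers:
        if key in store:
            return name, store[key]
    return None


def resolve_from_worker(
    key: str, worker_layers: List[Layer], api_server_layers: List[Layer]
) -> Optional[Tuple[str, str, str]]:
    """Try the worker's own layers first, then fall back to the API server."""
    hit = first_hit(key, worker_layers)
    if hit is not None:
        return ("worker", *hit)
    hit = first_hit(key, api_server_layers)
    if hit is not None:
        return ("api_server", *hit)
    return None


worker_layers: List[Layer] = [
    ("secrets_backend", {}),  # optional, empty in this example
    ("env_vars", {"conn_a": "from-worker-env"}),
]
api_server_layers: List[Layer] = [
    ("secrets_backend", {}),  # optional, empty in this example
    ("env_vars", {}),
    ("metadata_db", {"conn_a": "from-db", "conn_b": "from-db"}),
]

hit_a = resolve_from_worker("conn_a", worker_layers, api_server_layers)
hit_b = resolve_from_worker("conn_b", worker_layers, api_server_layers)
```

Here `conn_a` is answered by the worker's env vars and never reaches the API server, while `conn_b` falls all the way through to the metadata DB. A configurable order would have to say which of the two tiers it applies to.
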
>> > > >
>> > > > You will have to consider this angle too.
>> > > >
>> > > > In my opinion, a more practical and realistic use case would be the
>> > > > ability to define multiple custom backends on both the worker and the
>> > > > API server.
>> > > >
>> > > > Looking forward to hearing more from you.
>> > > >
>> > > > Thanks & Regards,
>> > > > Amogh Desai
>> > > >
>> > > >
>> > > > On Wed, Jul 2, 2025 at 3:59 PM Anton Nitochkin <ant.nitoch...@gmail.com>
>> > > > wrote:
>> > > >
>> > > >> Hello,
>> > > >>
>> > > >> I'd like to discuss a new option that can be added via this PR:
>> > > >> https://github.com/apache/airflow/pull/45931.
>> > > >>
>> > > >> Recently, I asked developers in Slack for their thoughts on the new
>> > > >> variable [secrets]backend_order. Long story short: this option will
>> > > >> introduce the ability to configure and control the backend order
>> > > >> using this variable. The default value will remain the same as in
>> > > >> the current version, so for users who don't need it, things will
>> > > >> stay as they are now.
>> > > >>
>> > > >> Jarek Potiuk advised starting a conversation and discussing the PR
>> > > >> to reach a consensus with the community.
>> > > >>
>> > > >> Can you please share your thoughts on the option and its
>> > > >> implementation?
>> > > >>
>> > > >> Anton Nitochkin
>> > > >>
