Sure. Then I am fine with removing it now. We can always re-add it before
now and later if we see a need. Or maybe it will just be added
"automatically " if we decide that context is the right way to pass the
information.

On Tue, Dec 10, 2024 at 7:43 PM Kaxil Naik <kaxiln...@gmail.com> wrote:

> Regardless of any option, there isn't a strong case to keep it in the
> Context dict -- which is why I want to separate the discussion of configs
> to worker/parser etc to https://github.com/apache/airflow/issues/44352
>
> On Wed, 11 Dec 2024 at 00:04, Kaxil Naik <kaxiln...@gmail.com> wrote:
>
> > imo it is related but a separate discussion: the focus of this one is
> very
> > limited to only the Task Context dictionary. This is actually good since
> > there will be only one way to get configuration i.e. importing
> > airflow.configuration object instead of this other way -- more so because
> > there is still some way to go in determining (2) or (3)
> >
> > For Task SDK: we have a spike task about it here:
> > https://github.com/apache/airflow/issues/44352. I have copy/pasted your
> > three options there.
> >
> > On Tue, 10 Dec 2024 at 23:57, Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >> I think that one should be preceded with another decision which we have
> >> not
> >> discussed and have not agreed to (and a bit of follow-up on Constance's
> >> question).
> >>
> >> Do we expect airflow configuration to be present also at the worker and
> >> should it be the same or different than the one in
> >> "scheduler/webserver/triggerer" ?
> >>
> >> So far (Airflow 2) we assumed (and even recommended to the users -
> >>
> >>
> https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html
> >> )
> >> that configuration should be the same for all components.
> >>
> >> Which is not good for security reasons because we have sensitive
> >> configuration, and users should not copy around those  config files and
> >> variables.
> >> IMHO we should completely remove this recommendation. And we have three
> >> possible options to tell our users:.
> >>
> >> 1) either copy to workers only the configuration that is needed (but
> which
> >> ones are needed and how to separate those config entries in the docs?)
> >> 2) or NO configuration should be present in the worker - and all
> >> configuration that is needed should be passed by Task SDK to the worker
> >> 3) or we assess that task and dag file processors do not need any
> >> configuration at all
> >>
> >> I think if we assess that 3) is the case - there is no other option but
> to
> >> remove conf. That would also be really nice as we would not have to have
> >> "conf" reading code in TaskSDK at all.
> >> But if we think that tasks and dag parsing will actually need some
> >> configuration, then we can go either 1) or 2) route
> >>
> >> I think option 2) is much better from less confusion, security and
> "single
> >> source of truth" point of view. And if we go for 2) then `namespace =
> >> conf.get("kubernetes", "NAMESPACE")` will not work because configuration
> >> will not be present there and in this case what would make more sense is
> >> to
> >> leave "conf" as part of the context but only pass the "needed"
> >> configuration there.
> >> If we go for option 1) then likely yes we could remove it from context.
> >>
> >> J.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Dec 10, 2024 at 7:03 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >>
> >> > Yeah, this PR only remove it from accessing from task context dict
> i.e.
> >> if
> >> > you try to do: BashOperator(bash_command="{{ conf.get('webserver',
> >> > 'base_url') }}" -- for example
> >> >
> >> > On Tue, 10 Dec 2024 at 23:27, Constance Martineau
> >> > <consta...@astronomer.io.invalid> wrote:
> >> >
> >> > > Will it still be possible to import and use configuration directly?
> >> For
> >> > > example:
> >> > >
> >> > > ```
> >> > > from airflow.configuration import conf
> >> > > from airflow.providers.cncf.kubernetes.operators.kubernetes_pod
> import
> >> > > KubernetesPodOperator
> >> > >
> >> > >
> >> > > namespace = conf.get("kubernetes", "NAMESPACE")
> >> > >
> >> > > KubernetesPodOperator(
> >> > >     namespace=namespace,
> >> > >     image="<your-docker-image>",
> >> > >     cmds=["<commands-for-image>"],
> >> > >     arguments=["<arguments-for-image>"],
> >> > >     labels={"<pod-label>": "<label-name>"},
> >> > >     name="<pod-name>",
> >> > >     task_id="<task-name>",
> >> > >     get_logs=True,
> >> > >     in_cluster=True,
> >> > > )
> >> > > ```
> >> > >
> >> > > On Tue, Dec 10, 2024 at 12:52 PM Kaxil Naik <kaxiln...@gmail.com>
> >> wrote:
> >> > >
> >> > > > Hey team,
> >> > > >
> >> > > > I have a PR (https://github.com/apache/airflow/pull/44820) to
> >> remove
> >> > > conf
> >> > > > object from the Task context dictionary.
> >> > > >
> >> > > > This was initially added (in 2015) in response to
> >> > > > https://github.com/apache/airflow/issues/168. However, we now
> have
> >> > > > `ti.log_url` that is used for that; example usages:
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/airflow/models/taskinstance.py#L1362
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/providers/src/airflow/providers/smtp/notifications/templates/email.html#L28
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/docs/apache-airflow/howto/email-config.rst?plain=1#L76
> >> > > >
> >> > > > So, to simplify what we need to pass from the API server to the
> Task
> >> > SDK
> >> > > > (for AF 3.0), I want to simplify and remove things that aren't
> >> needed.
> >> > In
> >> > > > this case, this is good so we don't pass/expose secrets
> >> unnecessarily
> >> > via
> >> > > > `conf` object in the execution environment.
> >> > > >
> >> > > > To that end, I am calling for a LAZY CONSENSUS to remove it. If no
> >> one
> >> > > > objects, the vote will pass three days from now, at 6:30 pm UTC,
> 13
> >> Dec
> >> > > > 2024.
> >> > > >
> >> > > > Regards,
> >> > > > Kaxil
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to