I think that one should be preceded with another decision which we have not
discussed and have not agreed to (and a bit of follow-up on Constance's
question).

Do we expect airflow configuration to be present also at the worker and
should it be the same or different than the one in
"scheduler/webserver/triggerer" ?

So far (Airflow 2) we assumed (and even recommended to the users -
https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html)
that configuration should be the same for all components.

Which is not good for security reasons because we have sensitive
configuration, and users should not copy around those  config files and
variables.
IMHO we should completely remove this recommendation. And we have three
possible options to tell our users:.

1) either copy to workers only the configuration that is needed (but which
ones are needed and how to separate those config entries in the docs?)
2) or NO configuration should be present in the worker - and all
configuration that is needed should be passed by Task SDK to the worker
3) or we assess that task and dag file processors do not need any
configuration at all

I think if we assess that 3) is the case - there is no other option but to
remove conf. That would also be really nice as we would not have to have
"conf" reading code in TaskSDK at all.
But if we think that tasks and dag parsing will actually need some
configuration, then we can go either 1) or 2) route

I think option 2) is much better from less confusion, security and "single
source of truth" point of view. And if we go for 2) then `namespace =
conf.get("kubernetes", "NAMESPACE")` will not work because configuration
will not be present there and in this case what would make more sense is to
leave "conf" as part of the context but only pass the "needed"
configuration there.
If we go for option 1) then likely yes we could remove it from context.

J.








On Tue, Dec 10, 2024 at 7:03 PM Kaxil Naik <kaxiln...@gmail.com> wrote:

> Yeah, this PR only remove it from accessing from task context dict i.e. if
> you try to do: BashOperator(bash_command="{{ conf.get('webserver',
> 'base_url') }}" -- for example
>
> On Tue, 10 Dec 2024 at 23:27, Constance Martineau
> <consta...@astronomer.io.invalid> wrote:
>
> > Will it still be possible to import and use configuration directly? For
> > example:
> >
> > ```
> > from airflow.configuration import conf
> > from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import
> > KubernetesPodOperator
> >
> >
> > namespace = conf.get("kubernetes", "NAMESPACE")
> >
> > KubernetesPodOperator(
> >     namespace=namespace,
> >     image="<your-docker-image>",
> >     cmds=["<commands-for-image>"],
> >     arguments=["<arguments-for-image>"],
> >     labels={"<pod-label>": "<label-name>"},
> >     name="<pod-name>",
> >     task_id="<task-name>",
> >     get_logs=True,
> >     in_cluster=True,
> > )
> > ```
> >
> > On Tue, Dec 10, 2024 at 12:52 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > > Hey team,
> > >
> > > I have a PR (https://github.com/apache/airflow/pull/44820) to remove
> > conf
> > > object from the Task context dictionary.
> > >
> > > This was initially added (in 2015) in response to
> > > https://github.com/apache/airflow/issues/168. However, we now have
> > > `ti.log_url` that is used for that; example usages:
> > >
> > >
> > >
> >
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/airflow/models/taskinstance.py#L1362
> > >
> > >
> > >
> >
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/providers/src/airflow/providers/smtp/notifications/templates/email.html#L28
> > >
> > >
> > >
> >
> https://github.com/apache/airflow/blob/dcd41f60f1c9b5583b49bfb49b6d85c640a2892c/docs/apache-airflow/howto/email-config.rst?plain=1#L76
> > >
> > > So, to simplify what we need to pass from the API server to the Task
> SDK
> > > (for AF 3.0), I want to simplify and remove things that aren't needed.
> In
> > > this case, this is good so we don't pass/expose secrets unnecessarily
> via
> > > `conf` object in the execution environment.
> > >
> > > To that end, I am calling for a LAZY CONSENSUS to remove it. If no one
> > > objects, the vote will pass three days from now, at 6:30 pm UTC, 13 Dec
> > > 2024.
> > >
> > > Regards,
> > > Kaxil
> > >
> >
>

Reply via email to