potiuk commented on PR #35210:
URL: https://github.com/apache/airflow/pull/35210#issuecomment-1830836013
> UPDATE: Okay I was wrong. Just tested with an example DAG and if you use
> the module name `<dagfile>.<class>` then it is possible to inject code from the
> DAG directly.
Yep, it is.
> If the dags folder is added to PYTHONPATH, then yes, but I need to check
> if there is a protection for dag file processor, which processes these files
> and sometimes it's running in the scheduler service.
Nope. There is no protection. Whatever is on PYTHONPATH can be used, and
Airflow will AUTOMATICALLY add the `dags` folder to PYTHONPATH too.
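
To illustrate why that matters, here is a minimal sketch. It is not Airflow's actual loader code: `import_string`, the temporary folder, and the `my_dag_file` / `Injected` names are all made up for the example. It only shows that once a folder is on the import path, a generic "import this dotted name" helper will happily load a class defined in a file from that folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_string(dotted_path):
    """Resolve 'module.attr' to an object (hypothetical generic loader)."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Hypothetical stand-in for a DAGs folder containing a single DAG file.
dags_folder = Path(tempfile.mkdtemp(prefix="dags_"))
(dags_folder / "my_dag_file.py").write_text(
    "class Injected:\n"
    "    origin = 'defined inside the DAG file'\n"
)

# Simulates the folder being on PYTHONPATH.
sys.path.insert(0, str(dags_folder))
importlib.invalidate_caches()

# The <dagfile>.<class> form from the quoted comment above.
loaded = import_string("my_dag_file.Injected")
print(loaded.origin)
```

Nothing in the helper knows or cares that the module came from a DAG file, which is the whole problem.
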
Some more context on that one.
Historically, when the DAG file processor was not standalone, this was even more
important.
This is also due to historical reasons. The DAGFileProcessor in likely 9X% of
Airflow installations is not "standalone". It is a newly forked process, so it
is not really the "same" context as the scheduler: those are different processes.
But they share everything else (memory, filesystem), and they use the same
PYTHONPATH, because there is only one process to set the original PYTHONPATH for
(`airflow scheduler`). That basically means that both the `airflow scheduler`
process and the DAG file processor subprocess have to have access to the
same PYTHONPATH, and the DAG folder will be on the PYTHONPATH by definition. This is
the reason why we cannot let the scheduler do `import("arbitrary import provided by
DAG author")`.
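
A small sketch of the inheritance part, under loud assumptions: it uses a plain `subprocess` child rather than Airflow's forked DAG file processor, and `child_sys_path` and the temporary folder are hypothetical names for the example. It only shows that a PYTHONPATH set for the parent is what the child process ends up with too. A forked child inherits even more, since it starts from a copy of the parent's already-populated `sys.path`.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Start a child interpreter with extra_dir on PYTHONPATH; return its sys.path."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = (
        extra_dir if not existing else extra_dir + os.pathsep + existing
    )
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case a site hook printed something earlier.
    return json.loads(result.stdout.strip().splitlines()[-1])


# Hypothetical stand-in for the DAG folder.
folder = tempfile.mkdtemp(prefix="dags_")
visible = any(
    os.path.realpath(p) == os.path.realpath(folder)
    for p in child_sys_path(folder)
)
print("child sees the DAG folder:", visible)
```
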
It's only with the standalone DAG processor that we can (and plan to announce
it more prominently when ready) actually isolate the scheduler from the DAG folder
and user code. Once we have it, it's theoretically possible to run `airflow
scheduler` without the scheduler even SEEING the DAG folder. Simply speaking, when
a standalone DAG file processor is configured, `airflow scheduler` is essentially
DAG-less. This is a far more secure setup (but almost no one uses it yet), and it
will be a base for multi-tenancy separation. And then, the `plugin` option is a
bit less important (but still quite important, because a DAG author could make
Airflow import ANY Python package). We do not want that because it could
later allow crossing boundaries between tenants.