On Thu, Dec 16 2021 at 16:19:45 -0800, Ping Zhang <[email protected]>
wrote:
> To run Airflow tasks, Airflow needs to parse the DAG file twice: once in
> the local "airflow run" process, and once in the raw "airflow run" process.
This isn't true in most cases anymore, thanks to a change from spawning a
new process (os.exec(["airflow", ...])) to forking instead.
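
For context, a minimal sketch of the difference (this is not Airflow's
actual task-runner code; the helper names and the exact CLI arguments are
only illustrative): exec'ing a fresh process forces a second parse of the
DAG file, while forking lets the child reuse the DAG objects the local
process has already parsed.

    import os

    def run_task_by_exec(dag_id, task_id, run_id):
        # Pre-fork behaviour (roughly): the child replaces itself with a
        # brand new "airflow" CLI process, which must re-import and
        # re-parse the DAG file from scratch before it can run the task.
        pid = os.fork()
        if pid == 0:
            os.execvp("airflow",
                      ["airflow", "tasks", "run", dag_id, task_id, run_id])
        os.waitpid(pid, 0)

    def run_task_by_fork(parsed_dag, task_id):
        # Fork behaviour (roughly): the child inherits the parent's memory,
        # so the DAG the local process already parsed is reused and no
        # second parse is needed.
        pid = os.fork()
        if pid == 0:
            task = parsed_dag.get_task(task_id)  # already in memory
            task.execute(context={})  # simplified; real code builds a full context
            os._exit(0)
        os.waitpid(pid, 0)
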
The serialized_dag table doesn't (currently) contain enough information to
actually execute every DAG, especially in the case of PythonOperator, so the
actual DAG file on disk needs to be loaded to get the code to run. Perhaps
it would be possible to do this for some operators, but not all.
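
To make that concrete, here is a hedged sketch (the field names such as
python_callable_ref are made up for illustration, not Airflow's real
serialization format) of why the serialized form is enough to schedule a
PythonOperator but not to execute it: the callable's code lives only in the
DAG file.

    import json

    def business_logic(**context):
        # Arbitrary user code; it only exists in the DAG file on disk.
        print("doing the real work")

    # Roughly the kind of thing serialized_dag can hold: metadata plus a
    # reference to the callable, not the callable's code itself.
    serialized_task = {
        "task_id": "do_work",
        "operator": "PythonOperator",
        "python_callable_ref": f"{business_logic.__module__}.{business_logic.__name__}",
    }
    print(json.dumps(serialized_task, indent=2))

    # To execute the task, a worker has to import the module named above,
    # i.e. load the DAG file from disk, to get business_logic back; the
    # scheduler, by contrast, can do its job from the metadata alone.
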
Still, it might be worth looking at, and I'm looking forward to the
proposal!
-ash