Hi Kaxil,

Thanks for the comment. The serialized_dag isn't used to run the task in the `airflow run --raw` process. It is used in the `airflow run --local` process to perform `check_and_change_state_before_execution`:
https://github.com/apache/airflow/blob/main/airflow/jobs/local_task_job.py#L88-L99
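
To check my understanding of the fork condition you described, here is a minimal sketch. The helper names `should_fork` and `start_strategy` are made up for illustration and are not Airflow's actual code; the only assumption taken from this thread is that forking needs `os.fork` and no `run_as_user`.

```python
import os
from typing import Optional


def should_fork(run_as_user: Optional[str]) -> bool:
    """Hypothetical helper: fork only if os.fork exists and no run_as_user is set."""
    return hasattr(os, "fork") and not run_as_user


def start_strategy(run_as_user: Optional[str]) -> str:
    """Hypothetical helper: name the way the raw task process would be started."""
    # "fork": the child inherits the DAG already parsed by the local process.
    # "exec": a fresh `airflow` process is spawned and parses the DAG file again.
    return "fork" if should_fork(run_as_user) else "exec"


if __name__ == "__main__":
    print(start_strategy(None))
    print(start_strategy("airflow"))
```

So with run_as_user set, the task still goes through the spawn path, and the DAG file is parsed a second time.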

Thanks,

Ping


On Mon, Dec 20, 2021 at 4:51 AM Kaxil Naik <[email protected]> wrote:

> Yup, forking only applies when os.fork is available and run_as_user isn't
> specified. We had only added enough details in Serialized DAGs that are
> needed for the Webserver and to make any scheduling decisions in the
> Scheduler.
>
> So it does not contain all the information (all the args, kwargs including
> callables) required to run the task.
>
> Looking forward for the AIP.
>
> Regards,
> Kaxil
>
> On Fri, Dec 17, 2021 at 11:04 PM Ping Zhang <[email protected]> wrote:
>
>> Hi Ash,
>>
>> Thanks for the inputs about the fork approach. I have checked the code.
>> The fork only applies when there is no run_as_user. I think the run_as_user
>> is an important feature.
>>
>> I will create an AIP with more details.
>>
>> Best wishes
>>
>> Ping Zhang
>>
>>
>> On Fri, Dec 17, 2021 at 9:59 AM Jarek Potiuk <[email protected]> wrote:
>>
>>> Yeah. I would also love to see some details in the meeting I proposed
>>> :). I am particularly interested about the current limitation of the
>>> solution in "general" case.
>>>
>>> J,
>>>
>>> On Fri, Dec 17, 2021 at 11:16 AM Ash Berlin-Taylor <[email protected]> wrote:
>>> >
>>> > On Thu, Dec 16 2021 at 16:19:45 -0800, Ping Zhang <[email protected]> wrote:
>>> >
>>> > To run airflow tasks, airflow needs to parse dag file twice, once in
>>> > airflow run local process, once in airflow run raw
>>> >
>>> >
>>> > This isn't true in most cases anymore thanks to a change from spawning
>>> > a new process (os.exec(["airflow",...]) to fork instead.
>>> >
>>> > The serialized_dag table doesn't (currently) contain enough
>>> > information to actually execute every dag, especially in the case of
>>> > PythonOperator, so the actual dag file on disk needs to be loaded to get
>>> > code to run, so perhaps it would be possible to do this for some operators,
>>> > but not all.
>>> >
>>> > Still might be worth looking at it and I'm looking forward to the
>>> > proposal!
>>> >
>>> > -ash
>>>
>>
