The answer to both questions is "because the DAGs are Python files" and
"because that's how it is now / we haven't written it yet".
Historically Airflow needed the actual Python code in the DAGs to do
anything with them (show them in the UI, schedule them, or execute
them). With Airflow 2.0, where DAG serialization becomes mandatory, the
UI no longer needs the files, and neither does the "main" scheduler;
but the DAG parsing process still requires DAGs on disk, and actual
task execution will always need the DAG files.
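To make that concrete, here is a rough sketch (my illustration, using
Airflow 2.x internals whose exact module paths may differ between
versions): a DAG round-trips through a JSON-able dict, and that dict,
stored in the DB, is all the UI and scheduler need to work from:

    # Sketch only: SerializedDAG is an internal Airflow 2.x helper.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.serialization.serialized_objects import SerializedDAG

    with DAG(dag_id="demo", start_date=datetime(2021, 1, 1),
             schedule_interval="@daily") as dag:
        BashOperator(task_id="hello", bash_command="echo hello")

    # This dict is JSON-serializable and is what gets stored in the DB;
    # the UI and scheduler work from this representation, not the file.
    payload = SerializedDAG.to_dict(dag)
    dag_again = SerializedDAG.from_dict(payload)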
The main reason execution needs DAG files is to support Python
operators (which call Python functions defined in your DAG file) and
custom operators, which may also be defined on disk.
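For example, a DAG like this (a hypothetical sketch, not something
from this thread) can only be executed by a worker that has the same
file on disk, because the callable is an ordinary Python function
defined in that file:

    # example_dag.py -- hypothetical illustration.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Ordinary Python defined in the DAG file; a worker can only
        # resolve it by importing this exact file.
        print("extracting...")

    with DAG(dag_id="example", start_date=datetime(2021, 1, 1),
             schedule_interval=None) as dag:
        PythonOperator(task_id="extract", python_callable=extract)

The serialized form of this DAG records only a reference to the
callable, so a worker without the file has no way to run the task.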
We could extend Airflow to support "submitting" DAGs via an API, on
the condition that no Python operators and no custom operators are
used. Or Python operators could be allowed so long as the callable has
no closure, no advanced scoping, etc. But then we would have to start
worrying about all the edge cases, and the security of the API would
become _much_ more important.
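The closure restriction is the sort of edge case meant here. A
hypothetical sketch: a callable that captures state from an enclosing
scope cannot be reconstructed from its source text alone, so an API
submission would silently lose the captured value:

    # Hypothetical sketch of the "closure" edge case.
    def build_check(threshold):
        def check():
            # `threshold` is bound in build_check's scope, not in the
            # body of check; shipping check's source text alone over
            # an API would lose this binding.
            print(f"threshold is {threshold}")
        return check

    check_callable = build_check(42)  # fine locally, hard to serialize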
In short, because it's complicated and has some nasty edge cases.
We'll likely get there eventually.
-ash
On Thu, May 6 2021 at 16:22:28 +0800, 落雨留音
<[email protected]> wrote:
1. Why does an Airflow DAG not support being read directly from the
DB, instead of from a local file?
The current way of discovering DAGs is to scan local files and then
synchronize them to the DB. If I want to create a DAG, I need to
create a DAG file in the scheduler's dags_folder and then synchronize
that file to the web server and workers. Why can't I store the DAG
file directly in the DB, so that the web server, scheduler, and
workers all obtain it through the DB?
2. Why is there no createDag API?
Why is there no API to create a DAG, so that as soon as I call the
API, the DAG information is synchronized to the DB and to the local
files of the web server, scheduler, and workers?