A DagRun created before this DAG discovered by scheduler

2018-05-17 Thread Song Liu
Hi, I am developing an API on top of Airflow for a specific project, and I have run into the following situation: - A new DAG is added to Airflow via the `/dag/create` API (which adds the DAG file to DAGS_FOLDER) - The DAG is started via the `/dag/run` API (which actually creates a new DagRun) If the `run` API

Re: How Airflow import modules as it executes the tasks

2018-05-15 Thread Song Liu
I think there might be two ways: 1. Set up the connections via the Airflow UI: http://airflow.readthedocs.io/en/latest/configuration.html#connections, I guess this could be done in your code also. 2. Put your connection setup into an operator at the beginning of your dag

Re: Re: How to know the DAG is starting to run

2018-05-14 Thread Song Liu
DAG is starting to run I would question that hooking into DAG.run is "more gracefully" than having a root task node that does the pipeline environment setup. IMO it'd be easier and cleaner to catch setup errors when it's done in a separate task. Best regards, Jiening -----

About how to let scheduler discover right now after a DAG added

2018-05-13 Thread Song Liu
Hi, Scheduler discovery is timer based, but if I have added a new DAG to dags_folder, is it possible to call a scheduler API (from my plugin) to scan dags_folder immediately? In summary, about the scheduler: - when does it discover? besides the time interval, might support

Re: About the DAG discovering not synced between scheduler and webserver

2018-05-13 Thread Song Liu
s from metadb, is there any requirement or design consideration besides this ? Many thanks for any information. Thanks, Song From: Song Liu <song...@outlook.com> Sent: Saturday, May 12, 2018 7:58:43 PM To: dev@airflow.incubator.apache.org Subject: A

About the DAG discovering not synced between scheduler and webserver

2018-05-12 Thread Song Liu
Hi, When adding a new DAG, we sometimes see: ``` This DAG isn't available in the web server's DagBag object. It shows up in this list because the scheduler marked it as active in the metadata database. ``` In views.py, it collects the DAGs under "DAGS_FOLDER" by instantiating a DagBag

Re: Re: How to know the DAG is starting to run

2018-05-12 Thread Song Liu
, it would help others understand the problem you are trying to solve. Chris On Fri, May 11, 2018 at 10:59 AM, Song Liu <song...@outlook.com> wrote: > Overriding the "DAG.run" sounds like a workaround, so that if it's running > a first operation of DAG then do some setup etc

Re: How to know the DAG is starting to run

2018-05-11 Thread Song Liu
want to override. You may want to call the original "super().run()" then do what you need to do afterwards. Let's see if that works for you. > On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote: > > Yes, I have thought of this approach, but more elegant wa

Re: Airflow REST API proof of concept.

2018-05-11 Thread Song Liu
So this Java REST API server talks to the meta db directly? From: Luke Diment Sent: May 11, 2018 12:22 To: dev@airflow.incubator.apache.org Subject: Fwd: Airflow REST API proof of concept. FYI. Sent from my iPhone Begin forwarded

Re: How to know the DAG is starting to run

2018-05-11 Thread Song Liu
on stock market trading days. -James M. On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote: > Hi, > > I have something that I want done only once, when the DAG is constructed, > but it seems that the DAG is instantiated every time each > operator runs.

Re: Define folder for task of dag

2018-05-11 Thread Song Liu
It seems that this temporary folder name can't be retrieved. The folder you saw is a TemporaryFile created by BashOperator to hold and run your bash command. I think you could have your own working space and run your logic there by simply using "cd {your_work_space}; do something".
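
The "cd {your_work_space}; do something" idea can be tried outside Airflow. The following is a minimal sketch that uses plain `subprocess` rather than BashOperator itself: it starts the shell in a throwaway directory (standing in for the operator's temporary folder) and changes into a chosen working space before running the command. The helper name `run_in_workspace` is hypothetical, and `&&` is used instead of `;` so the command is skipped if the `cd` fails.

```python
import subprocess
import tempfile


def run_in_workspace(work_space, command):
    """Run `command` in `work_space`, starting the shell in a scratch directory.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    # Prefix the command with a cd, as suggested in the message above.
    full_command = 'cd "{}" && {}'.format(work_space, command)
    # The scratch directory plays the role of the operator's temporary folder.
    with tempfile.TemporaryDirectory() as scratch_dir:
        completed = subprocess.run(
            full_command,
            shell=True,
            cwd=scratch_dir,
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    return completed.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_space:
        # Prints the working space path, not the scratch directory.
        print(run_in_workspace(work_space, "pwd"))
```

With BashOperator, the same prefixed string would presumably be passed as the `bash_command` argument.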

How to know the DAG is starting to run

2018-05-11 Thread Song Liu
Hi, I have something that I want done only once, when the DAG is constructed, but it seems that the DAG is instantiated every time each operator runs. So is there a function in DAG that tells us it is starting to run? Thanks, Song

Re: About the difference between xcom and variable

2018-05-10 Thread Song Liu
It seems that Variable doesn't remove duplicates, if there are any. Anyway, XCom.get / XCom.set looks more flexible and meets the requirement. Thanks, Song From: Song Liu <song...@outlook.com> Sent: May 9, 2018 7:22 To: dev@airflow.incubator.apache.org Subject:

Interesting things about how to know it's a DAG file

2018-05-10 Thread Song Liu
Hi, I just created a custom DAG class, named for example "MyPipeline", by extending the "DAG" class, but Airflow fails to identify the file as a DAG file. After digging into the Airflow implementation around the dag_processing.py file: ``` # Heuristic that guesses whether a Python file contains an #
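
The quoted snippet is cut off before the heuristic itself appears, so the following is only a sketch of the kind of content check the comment describes, not the code from dag_processing.py. It assumes the check looks for the byte strings `DAG` and `airflow` in the file; those markers, the helper names, and the `my_lib` module are assumptions for illustration.

```python
import os
import tempfile
from pathlib import Path

# ASSUMPTION: these marker strings are illustrative. The quoted Airflow
# snippet is truncated before it shows what the heuristic looks for.
ASSUMED_MARKERS = (b"DAG", b"airflow")


def might_contain_dag(file_path):
    """Return True only if every assumed marker occurs in the file's bytes."""
    content = Path(file_path).read_bytes()
    return all(marker in content for marker in ASSUMED_MARKERS)


def write_temp_module(source):
    """Write source text to a temporary .py file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(source)
    return handle.name


if __name__ == "__main__":
    # A file that names both markers directly.
    plain = write_temp_module("from airflow import DAG\ndag = DAG('example')\n")
    # A file that only uses a subclass from a hypothetical module `my_lib`.
    custom = write_temp_module(
        "from my_lib import MyPipeline\npipeline = MyPipeline('example')\n"
    )
    print(might_contain_dag(plain))
    print(might_contain_dag(custom))
    os.remove(plain)
    os.remove(custom)
```

Under this assumption, a file that only references the "MyPipeline" subclass contains neither marker and would be skipped, which would be consistent with the behaviour described in the message.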

About the difference between xcom and variable

2018-05-09 Thread Song Liu
Hi, For cross-task communication, there are two options: - xcom - variable If I make sure the variable key is globally unique (such as a UUID), are there any other limitations or downsides to using variable instead of xcom? I ask because xcom requires specifying the task_id, which is used not very

About how to pause the running task

2018-04-26 Thread Song Liu
Hi, A DAG is composed of many tasks. Once the DAG has started, how can I pause the currently running task? Thanks, Song

About the project support in Airflow

2018-04-24 Thread Song Liu
Hi, DAGs are basically created for a particular project, so if I have many different projects, will Airflow support a Project concept and organize them separately? Is this a known requirement, or is there already a plan for it? Thanks, Song

Re: Re: About multi user support in airflow

2018-02-27 Thread Song Liu
Subject: Re: Re: About multi user support in airflow @song, you might also be interested in the work that Joy Gao is doing to add more access controls to the UI: https://github.com/wepay/airflow-webserver On Thu, Feb 1, 2018 at 8:12 PM, Song Liu <song...@outlook.com> wrote: > Hi Trent, >

Re: About multi user support in airflow

2018-02-01 Thread Song Liu
scheduler will do? Will it parse only the newly added DAG? Thanks again. Thanks, Song From: Trent Robbins <robbi...@gmail.com> Sent: February 2, 2018 4:19 To: Song Liu Cc: dev@airflow.incubator.apache.org Subject: Re: About multi user support in airflow Hi Song

About multi user support in airflow

2018-02-01 Thread Song Liu
Hi, In production, multi-user support is needed, which means that every user can log in to the Airflow platform and manage their own DAGs separately. But currently it seems that all the DAGs are managed in a single dags folder, so what do I need to do for multi-user support? Many thanks

How to trigger the dag to run immediately

2018-02-01 Thread Song Liu
Hi, It seems that the Airflow scheduler is based on the start_date and interval, but is it possible to trigger a DAG immediately, on demand, only by user request? For example, when a user clicks a "Run" button, the DAG starts right away and stops when it finishes. Thanks for any