Hi,
I am developing an API based on Airflow for a specific project, and I have met
one situation:
- A new DAG is added into Airflow via a `/dag/create` API (which adds the DAG
file into DAGS_FOLDER)
- This DAG is then run via a `/dag/run` API (which actually creates a new DagRun)
If the `run` API
I think there might be two ways:
1. Set up the connections via the Airflow UI:
http://airflow.readthedocs.io/en/latest/configuration.html#connections — I guess
this could be done in your code as well.
2. Put your connection setup into an operator at the beginning of your DAG, so
it runs when the DAG is starting to run.
I would question whether hooking into DAG.run is "more graceful" than having a
root task node that does the pipeline environment setup.
IMO it'd be easier and cleaner to catch setup errors when the setup is done in a
separate task.
Best regards,
Jiening
-----
Hi,
Since the scheduler's DAG discovery is timer-based, if I have added a new DAG
into dags_folder, is it possible to call a scheduler API (in my plugin) to
discover the dags_folder immediately?
In summary, about the scheduler:
- When does it discover? Besides the time interval, is there any other
requirement or design consideration?
Many thanks for any information.
Thanks,
Song
From: Song Liu <song...@outlook.com>
Sent: Saturday, May 12, 2018 7:58:43 PM
To: dev@airflow.incubator.apache.org
Subject: A
Hi,
When adding a new DAG, sometimes we can see:
```
This DAG isn't available in the web server's DagBag object. It shows up in this
list because the scheduler marked it as active in the metadata database.
```
In views.py, the web server collects the DAGs under "DAGS_FOLDER" by
instantiating a DagBag.
, it would help others understand
the problem you are trying to solve.
Chris
On Fri, May 11, 2018 at 10:59 AM, Song Liu <song...@outlook.com> wrote:
> Overriding "DAG.run" sounds like a workaround, so that if it's running
> the first operation of the DAG it can do some setup, etc.
want to override. You may
want to call the original "super().run()" then do what you need to do
afterwards.
Let's see if that works for you.
> On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote:
>
> Yes, I have thought of this approach, but a more elegant way
So this Java REST API server is talking to the meta DB directly?
From: Luke Diment
Sent: May 11, 2018 12:22
To: dev@airflow.incubator.apache.org
Subject: Fwd: Airflow REST API proof of concept.
FYI.
Sent from my iPhone
Begin forwarded message:
on stock market trading days.
-James M.
On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:
> Hi,
>
> I have something I want done only once, when the DAG is constructed,
> but it seems that the DAG will be instantiated every time each
> operator runs.
It seems that this temporary folder name can't be obtained.
The folder you saw is a temporary file created by BashOperator to hold your bash
command and run it; I think you could have your own workspace and run your
logic there by simply "cd {your_work_space}; do something".
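A pure-Python sketch of what BashOperator does with `bash_command` (per the description above): the command string is written to a temporary file and executed with bash, so the script itself lives at a throwaway temp path, while the working directory is whatever you `cd` into:

```python
import subprocess
import tempfile

# Stand-in for "cd {your_work_space}; do something"
bash_command = "cd /tmp && pwd"

with tempfile.NamedTemporaryFile("w", suffix=".sh") as f:
    f.write(bash_command)
    f.flush()
    result = subprocess.run(["bash", f.name],
                            capture_output=True, text=True)

# The cwd is whatever the script cd'ed into, not the temp folder.
print(result.stdout.strip())
```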
Hi,
I have something I want done only once, when the DAG is constructed, but it
seems that the DAG will be instantiated every time each operator runs.
So is there a function in DAG that tells us it is starting to run now?
Thanks,
Song
It seems that Variable doesn't remove duplicates, if there are any.
Anyway, XCom.get / XCom.set looks more flexible and meets the
requirement.
Thanks,
Song
From: Song Liu <song...@outlook.com>
Sent: May 9, 2018 7:22
To: dev@airflow.incubator.apache.org
Subject:
Hi,
I just created a custom DAG class named "MyPipeline" by extending the
"DAG" class, but Airflow fails to identify the file as a DAG file.
After digging into the Airflow implementation around the dag_processing.py file:
```
# Heuristic that guesses whether a Python file contains an
# Airflow DAG definition.
```
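The heuristic in question (paraphrased from the 1.x source) is a plain substring check on the file's bytes, which is why a file that never literally mentions `DAG` and `airflow` is skipped in safe mode. A workaround is simply to mention both strings, e.g. in a comment:

```python
def might_contain_dag(file_contents: bytes) -> bool:
    """Safe-mode check: only files containing both literal strings
    b"DAG" and b"airflow" are parsed for DAG objects."""
    return b"DAG" in file_contents and b"airflow" in file_contents


# A file that only uses a custom subclass fails the check...
skipped = might_contain_dag(b"from mylib import MyPipeline\np = MyPipeline()")
# ...but a throwaway comment naming both keywords makes it pass.
parsed = might_contain_dag(b"# airflow DAG\nfrom mylib import MyPipeline")
```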
Hi,
For the cross-task communication, there are two options:
- xcom
- variable
If I make sure the key of the Variable is globally unique (such as a UUID), are
there any other limitations or cons to using Variable instead of XCom? Since
XCom requires specifying the task_id, which is not very convenient.
Hi,
A DAG is composed of many tasks; when this DAG has started, how can I pause the
currently running task?
Thanks,
Song
Hi,
Basically the DAGs are created per project, so if I have many different
projects, will Airflow support a Project concept and organize them separately?
Is this a known requirement, or is there any plan for it already?
Thanks,
Song
Subject: Re: Re: About multi user support in airflow
@song, you might also be interested in the work that Joy Gao is doing to
add more access controls to the UI:
https://github.com/wepay/airflow-webserver
On Thu, Feb 1, 2018 at 8:12 PM, Song Liu <song...@outlook.com> wrote:
> Hi Trent,
>
what will the scheduler do? Will it parse only the newly added DAG?
Thanks again.
Thanks,
Song
From: Trent Robbins <robbi...@gmail.com>
Sent: February 2, 2018 4:19
To: Song Liu
Cc: dev@airflow.incubator.apache.org
Subject: Re: About multi user support in airflow
Hi Song
Hi,
In production, multi-user support is needed, which means that every user could
log in to the Airflow platform and manage their own DAGs separately.
But currently it seems that all the DAGs are managed in a single dags folder,
so what do I need to do for multi-user support?
Many thanks
Hi,
It seems that the Airflow scheduler is based on the start_date and interval, but
is it possible for a DAG to be triggered only by user request, on demand,
immediately? Such as: when the user clicks a "Run" button, this DAG starts
right away and stops when finished.
Thanks for any information.