A DagRun created before the DAG is discovered by the scheduler

2018-05-17 Thread Song Liu
Hi,

I am developing an API on top of Airflow for a specific project, and I have 
run into the following situation:

- A new Dag is added to Airflow via the `/dag/create` API (which adds the Dag 
file into DAGS_FOLDER)

- This Dag is started via the `/dag/run` API (which actually creates a new DagRun)

If the `run` API is called immediately after the `create` API, I would expect 
the `run` API not to work until the scheduler has discovered this Dag.
But my result shows that this new DagRun can end up running even though it was 
created before the DAG was discovered by the scheduler.

Does this match the current scheduler behavior?

Thanks,
Song


Re: How Airflow imports modules as it executes the tasks

2018-05-15 Thread Song Liu
I think there might be two ways:

1. Set up the connections via the Airflow UI: 
http://airflow.readthedocs.io/en/latest/configuration.html#connections. I 
guess this could be done in your code as well (see the sketch below).

2. Put your connection setup into an operator at the beginning of your dag.
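For option 1 in code, a minimal sketch using the Airflow 1.x ORM, assuming a 
Postgres target; the conn_id and credentials are placeholders:

```
from airflow import settings
from airflow.models import Connection

# Register a connection programmatically instead of through the UI.
conn = Connection(conn_id='my_db',            # placeholder id
                  conn_type='postgres',
                  host='db.example.com',      # placeholder credentials
                  schema='mydb',
                  login='user',
                  password='secret',
                  port=5432)
session = settings.Session()
session.add(conn)
session.commit()
```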


From: alireza.khoshkbari@ 
Sent: May 16, 2018 1:21
To: d...@airflow.apache.org
Subject: How Airflow imports modules as it executes the tasks

To start off, here is my project structure:
├── dags
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── operators
│   │   │   ├── __init__.py
│   │   │   └── first_operator.py
│   │   └── util
│   │       ├── __init__.py
│   │       └── db.py
│   ├── my_dag.py

Here are the versions and details of the Airflow docker setup:

In my dag I'm connecting to a db (not the Airflow db) in different tasks. I've 
set up db connection pooling, so I expected my db.py to be loaded once across 
the DagRun. However, in the log I can see that each task imports the module 
and new db connections are made by each and every task. I can see that db.py 
is loaded in each task by having the line below in db.py:

logging.info("I was loaded {}".format(random.randint(0, 100)))

I understand that each operator can technically be run on a separate machine, 
and it does make sense that each task runs sort of independently. However, I'm 
not sure whether this applies in the case of the LocalExecutor. Now the 
question is: how can I share resources (db connections) across tasks when 
using the LocalExecutor?
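For reference, a sketch of how db.py is structured (the connection URL is a 
placeholder); the engine and its pool are created at module import time, which 
is why every task process rebuilds them:

```
# db.py (sketch): module-level engine, built once per Python process.
# Under LocalExecutor each task instance runs in its own forked process,
# so each task import creates a fresh pool; truly sharing one pool across
# tasks would need an external pooler such as pgbouncer.
import logging
import random

from sqlalchemy import create_engine

logging.info("I was loaded {}".format(random.randint(0, 100)))

engine = create_engine("postgresql://user:secret@db.example.com/mydb",
                       pool_size=5)
```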


Re: Re: How to know the DAG is starting to run

2018-05-14 Thread Song Liu
A dedicated task at the front of the DAG could handle this pipeline-level environment setup.

# About the "pipeline environment setup"

In my case I am trying to expose some pipeline variables through XCom so that 
tasks can read them when running, and I want this "expose some pipeline 
variables" step to be done only once, at the pipeline level.

A task could do it (see the sketch below), but something like this would be 
better built into the pipeline.
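For example, a minimal sketch of the dedicated setup task (assuming a `dag` 
object is already defined; the ids, key, and value are placeholders, Airflow 1.x):

```
from airflow.operators.python_operator import PythonOperator

def setup_pipeline(**context):
    # Runs once per DagRun; downstream tasks pull these values.
    context['ti'].xcom_push(key='workspace', value='/data/run-001')

setup = PythonOperator(
    task_id='pipeline_setup',
    python_callable=setup_pipeline,
    provide_context=True,
    dag=dag)

def use_workspace(**context):
    workspace = context['ti'].xcom_pull(task_ids='pipeline_setup',
                                        key='workspace')
    print(workspace)

consumer = PythonOperator(
    task_id='consumer',
    python_callable=use_workspace,
    provide_context=True,
    dag=dag)

setup >> consumer
```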

From: Jiening Wen <jiening...@optiver.com>
Sent: May 14, 2018 11:13
To: dev@airflow.incubator.apache.org
Subject: RE: Re: How to know the DAG is starting to run

I would question whether hooking into DAG.run is "more graceful" than having a 
root task node that does the pipeline environment setup.
IMO it'd be easier and cleaner to catch setup errors when the setup is done in 
a separate task.

Best regards,
Jiening

-----Original Message-----
From: Song Liu [mailto:song...@outlook.com]
Sent: Saturday 12 May 2018 9:06 AM
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run [External]

Yes, the event I want to know about is the creation of a DagRun.

From: crisp...@gmail.com <crisp...@gmail.com> on behalf of Chris Palmer 
<ch...@crpalmer.com>
Sent: May 11, 2018 15:46
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run

It's not even clear to me what it means for a DAG to start running. The 
creation of a DagRun for a specific execution date is completely independent of 
the scheduling of any TaskInstances for that DagRun. There could be a 
significant delay between those two events, either deliberately encoded into 
the DAG or due to resource constraints.

What event are you actually interested in knowing about? The creation of a 
DagRun? The starting of any task for a DagRun? Something else?

Maybe if you provided more details on what exactly the "pipeline environment 
setup" you are trying to do, it would help others understand the problem you 
are trying to solve.

Chris

On Fri, May 11, 2018 at 10:59 AM, Song Liu <song...@outlook.com> wrote:

> Overriding the "DAG.run" sounds like a workaround, so that if it's
> running a first operation of DAG then do some setup etc.
>
> 
> 发件人: Victor Noagbodji <vnoagbo...@amplify-nation.com>
> 发送时间: 2018年5月11日 12:50
> 收件人: dev@airflow.incubator.apache.org
> 主题: Re: How to know the DAG is starting to run
>
> Hey,
>
> I don't know if airflow has a concept of DAG-level events or callbacks.
> (Operators do have callbacks though.). You might get away with
> subclassing the DAG class or having a class decorator.
>
> The source suggests that ".run()" is the method you want to override.
> You may want to call the original "super().run()" then do what you
> need to do afterwards.
>
> Let's see if that works for you.
>
> > On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote:
> >
> > Yes, I have thought of this approach, but a more elegant way is to do it
> > in the DAG, since we don't want to add this "pipeline environment setup"
> > as a single operator; it should be done in the DAG more gracefully.
> > 
> > From: James Meickle <jmeic...@quantopian.com>
> > Sent: May 11, 2018 12:09
> > To: dev@airflow.incubator.apache.org
> > Subject: Re: How to know the DAG is starting to run
> >
> > Song:
> >
> > You can put an operator as the very first node in the DAG, and have
> > everything else in the DAG depend on it. For example, this is the
> approach
> > we use to only execute DAG tasks on stock market trading days.
> >
> > -James M.
> >
> > On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:
> >
> >> Hi,
> >>
> >> I have something that I want done only once when the DAG is
> >> constructed, but it seems that the DAG is instantiated every time
> >> each operator runs.
> >>
> >> So is there a function in the DAG that tells us it is starting to run
> >> now?
> >>
> >> Thanks,
> >> Song
> >>
>
>


About how to let the scheduler discover a new DAG right after it is added

2018-05-13 Thread Song Liu
Hi,


Since the scheduler's discovery is timer based, if I have added a new DAG into 
the dags_folder, is it possible to call a scheduler API to discover the 
dags_folder immediately (in my plugin)?


In summary, about the scheduler:


- When to discover? Besides the time interval, it might support discovery on 
demand (see the sketch below).

- How to discover? It might need an incremental discovery mechanism that only 
discovers updated or newly added dags.
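For example, a sketch of the on-demand case using DagBag.process_file from 
Airflow 1.x to parse a single file (the path is a placeholder):

```
from airflow import models, settings

# Parse one file on demand instead of waiting for the scheduler's
# next scan of the whole folder.
dagbag = models.DagBag(settings.DAGS_FOLDER)
found = dagbag.process_file('/path/to/new_dag.py')
for dag in found:
    print("discovered:", dag.dag_id)
```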


Thanks,

Song


Re: About the DAG discovery not being synced between scheduler and webserver

2018-05-13 Thread Song Liu
For example, in the two APIs below from the webserver, the dag is fetched from 
one global dagbag object, which is instantiated when the app instance is 
created. So the app/webserver can't be aware of any new DAGs until the app is 
launched again? What is this design for?


```

dagbag = models.DagBag(settings.DAGS_FOLDER)


@expose('/run')
def run(self):
    dag_id = request.args.get('dag_id')
    dag = dagbag.get_dag(dag_id)


@expose('/trigger')
def trigger(self):
    dag_id = request.args.get('dag_id')
    dag = dagbag.get_dag(dag_id)
```




From: 刘松 (Brain++ team) <liuson...@megvii.com>
Sent: May 13, 2018 5:39:40
To: dev@airflow.incubator.apache.org
Subject: Re: About the DAG discovery not being synced between scheduler and webserver

Hi,


It seems that Airflow currently handles the situations below:


- DAGs discovered by the scheduler, but not yet discovered by the webserver

- DAGs discovered by the webserver, but not yet discovered by the scheduler


I still don't quite understand why the discovery logic exists separately in 
the scheduler and the webserver. Based on my understanding, the webserver only 
needs to display the orm_dags from the metadata db. Is there any requirement 
or design consideration besides this?


Many thanks for any information.


Thanks,

Song


From: Song Liu <song...@outlook.com>
Sent: Saturday, May 12, 2018 7:58:43 PM
To: dev@airflow.incubator.apache.org
Subject: About the DAG discovery not being synced between scheduler and webserver

Hi,

When adding a new dag, sometimes we can see:

```
This DAG isn't available in the web server's DagBag object. It shows up in this 
list because the scheduler marked it as active in the metadata database.
```

In views.py, the DAGs under "DAGS_FOLDER" are collected by instantiating a 
DagBag object as below:

```
dagbag = models.DagBag(settings.DAGS_FOLDER)
```

So the webserver depends on its own timing to collect DAGs. But why not simply 
query the metadata db? If a DAG is active in the db, it could be visible in 
the web UI at that moment.

Could someone share the thinking behind this design?

Thanks,
Song


About the DAG discovery not being synced between scheduler and webserver

2018-05-12 Thread Song Liu
Hi,

When adding a new dag, sometimes we can see:

```
This DAG isn't available in the web server's DagBag object. It shows up in this 
list because the scheduler marked it as active in the metadata database.
```

In views.py, the DAGs under "DAGS_FOLDER" are collected by instantiating a 
DagBag object as below:

```
dagbag = models.DagBag(settings.DAGS_FOLDER)
```

So the webserver depends on its own timing to collect DAGs. But why not simply 
query the metadata db? If a DAG is active in the db, it could be visible in 
the web UI at that moment.

Could someone share the thinking behind this design?

Thanks,
Song


Re: Re: How to know the DAG is starting to run

2018-05-12 Thread Song Liu
Yes, the event I want to know about is the creation of a DagRun.

From: crisp...@gmail.com <crisp...@gmail.com> on behalf of Chris Palmer 
<ch...@crpalmer.com>
Sent: May 11, 2018 15:46
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run

It's not even clear to me what it means for a DAG to start running. The
creation of a DagRun for a specific execution date is completely
independent of the scheduling of any TaskInstances for that DagRun. There
could be a significant delay between those two events, either deliberately
encoded into the DAG or due to resource constraints.

What event are you actually interested in knowing about? The creation of a
DagRun? The starting of any task for a DagRun? Something else?

Maybe if you provided more details on what exactly the "pipeline
environment setup" you are trying to do, it would help others understand
the problem you are trying to solve.

Chris

On Fri, May 11, 2018 at 10:59 AM, Song Liu <song...@outlook.com> wrote:

> Overriding the "DAG.run" sounds like a workaround, so that if it's running
> a first operation of DAG then do some setup etc.
>
> 
> 发件人: Victor Noagbodji <vnoagbo...@amplify-nation.com>
> 发送时间: 2018年5月11日 12:50
> 收件人: dev@airflow.incubator.apache.org
> 主题: Re: How to know the DAG is starting to run
>
> Hey,
>
> I don't know if airflow has a concept of DAG-level events or callbacks.
> (Operators do have callbacks though.). You might get away with subclassing
> the DAG class or having a class decorator.
>
> The source suggests that ".run()" is the method you want to override. You
> may want to call the original "super().run()" then do what you need to do
> afterwards.
>
> Let's see if that works for you.
>
> > On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote:
> >
> > Yes, I have thought of this approach, but a more elegant way is to do it
> > in the DAG, since we don't want to add this "pipeline environment setup"
> > as a single operator; it should be done in the DAG more gracefully.
> > 
> > From: James Meickle <jmeic...@quantopian.com>
> > Sent: May 11, 2018 12:09
> > To: dev@airflow.incubator.apache.org
> > Subject: Re: How to know the DAG is starting to run
> >
> > Song:
> >
> > You can put an operator as the very first node in the DAG, and have
> > everything else in the DAG depend on it. For example, this is the
> approach
> > we use to only execute DAG tasks on stock market trading days.
> >
> > -James M.
> >
> > On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:
> >
> >> Hi,
> >>
> >> I have something that I want done only once when the DAG is constructed,
> >> but it seems that the DAG is instantiated every time each operator runs.
> >>
> >> So is there a function in the DAG that tells us it is starting to run
> >> now?
> >>
> >> Thanks,
> >> Song
> >>
>
>


Re: How to know the DAG is starting to run

2018-05-11 Thread Song Liu
Overriding the "DAG.run" sounds like a workaround, so that if it's running a 
first operation of DAG then do some setup etc.


From: Victor Noagbodji <vnoagbo...@amplify-nation.com>
Sent: May 11, 2018 12:50
To: dev@airflow.incubator.apache.org
Subject: Re: How to know the DAG is starting to run

Hey,

I don't know if airflow has a concept of DAG-level events or callbacks. 
(Operators do have callbacks though.). You might get away with subclassing the 
DAG class or having a class decorator.

The source suggests that ".run()" is the method you want to override. You may 
want to call the original "super().run()" then do what you need to do 
afterwards.

Let's see if that works for you.

> On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote:
>
> Yes, I have thought of this approach, but a more elegant way is to do it in
> the DAG, since we don't want to add this "pipeline environment setup" as a
> single operator; it should be done in the DAG more gracefully.
> 
> From: James Meickle <jmeic...@quantopian.com>
> Sent: May 11, 2018 12:09
> To: dev@airflow.incubator.apache.org
> Subject: Re: How to know the DAG is starting to run
>
> Song:
>
> You can put an operator as the very first node in the DAG, and have
> everything else in the DAG depend on it. For example, this is the approach
> we use to only execute DAG tasks on stock market trading days.
>
> -James M.
>
> On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:
>
>> Hi,
>>
>> I have something that I want done only once when the DAG is constructed,
>> but it seems that the DAG is instantiated every time each operator runs.
>>
>> So is there a function in the DAG that tells us it is starting to run now?
>>
>> Thanks,
>> Song
>>



Re: Airflow REST API proof of concept.

2018-05-11 Thread Song Liu
So this Java REST API server talks to the metadata db directly?

From: Luke Diment 
Sent: May 11, 2018 12:22
To: dev@airflow.incubator.apache.org
Subject: Fwd: Airflow REST API proof of concept.

FYI.

Sent from my iPhone

Begin forwarded message:

From: Luke Diment
Date: 11 May 2018 at 1:02:43 PM NZST
To: "dev-ow...@airflow.incubator.apache.org"
Subject: Fw: Airflow REST API proof of concept.


FYI.


From: Luke Diment
Sent: Thursday, May 10, 2018 4:33 PM
To: dev-subscr...@airflow.incubator.apache.org
Subject: Airflow REST API proof of concept.


Hi Airflow contributors,


I am a Java developer/full stack and lots of other stuff at Westpac Bank New 
Zealand.


We currently use Airflow for task scheduling for a rather large integration 
project for financial risk assessment.


During our development phase we started to understand that a REST API in front 
of Airflow would be a great idea.


We realise that you have indicated there will be a REST API at some stage.


We have already built a proof of concept REST API implementation in Java (of 
course...;-))...


We were wondering if your contributor group would find this helpful or if there 
would be any reason to continue such an API in Java.


We look forward to your response.  We can share the code if needed...


Thanks,


Luke Diment.







Re: How to know the DAG is starting to run

2018-05-11 Thread Song Liu
Yes, I have thought of this approach, but a more elegant way is to do it in 
the DAG, since we don't want to add this "pipeline environment setup" as a 
single operator; it should be done in the DAG more gracefully.

From: James Meickle <jmeic...@quantopian.com>
Sent: May 11, 2018 12:09
To: dev@airflow.incubator.apache.org
Subject: Re: How to know the DAG is starting to run

Song:

You can put an operator as the very first node in the DAG, and have
everything else in the DAG depend on it. For example, this is the approach
we use to only execute DAG tasks on stock market trading days.

-James M.

On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:

> Hi,
>
> I have something that I want done only once when the DAG is constructed,
> but it seems that the DAG is instantiated every time each operator runs.
>
> So is there a function in the DAG that tells us it is starting to run now?
>
> Thanks,
> Song
>


Re: Define folder for task of dag

2018-05-11 Thread Song Liu
It seems that this temporary folder name can't be obtained in advance.

The folder you saw is a temporary file created by the BashOperator to hold 
your bash command and run it. I think you could have your own working space 
and run your logic there by simply doing "cd {your_work_space}; do something", 
as in the sketch below.
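A minimal sketch of that suggestion, assuming /data/workspace exists on the 
worker and a `dag` object is already defined (the path is a placeholder):

```
from airflow.operators.bash_operator import BashOperator

# Pin the working directory yourself instead of relying on the
# temporary directory the BashOperator creates for the command.
pwd_fixed = BashOperator(
    task_id='pwd_fixed',
    bash_command='cd /data/workspace && pwd',
    dag=dag)
```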

From: Anton Mushin 
Sent: May 11, 2018 11:20
To: dev@airflow.incubator.apache.org
Subject: Define folder for task of dag

Hi everyone,
I need to know the folder for a task of a dag.
For example, I have two tasks in a dag:

pwd1 = BashOperator(
    task_id='pwd1',
    bash_command='pwd',
    dag=dag)

pwd2 = BashOperator(
    task_id='pwd2',
    bash_command='pwd',
    dag=dag)

As a result I get, for pwd1:
{bash_operator.py:97} INFO - Output:
{bash_operator.py:101} INFO - /tmp/airflowtmp3u5tdpt_

and for pwd2:
{bash_operator.py:97} INFO - Output:
{bash_operator.py:101} INFO - /tmp/airflowtmphiyryxno

Can I get the folder name where a dag task will be executed? In my case, that 
means getting /tmp/airflowtmp3u5tdpt_ and /tmp/airflowtmphiyryxno before the 
tasks run.

Best Regards,
Anton


How to know the DAG is starting to run

2018-05-11 Thread Song Liu
Hi,

I have something that I want done only once when the DAG is constructed, but 
it seems that the DAG is instantiated every time each operator runs.

So is there a function in the DAG that tells us it is starting to run now?

Thanks,
Song


Re: About the difference between xcom and variable

2018-05-10 Thread Song Liu
It seems that variable doesn't remove the duplication, if there is any.

Anyway, XCom.get / XCom.set look more flexible and they meet the requirement.
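For reference, a minimal sketch of the class-level XCom calls I mean (Airflow 
1.x, where the getter is actually named XCom.get_one; the ids, key, value, and 
execution_date are placeholders):

```
from datetime import datetime

from airflow.models import XCom

execution_date = datetime(2018, 5, 9)  # placeholder

# Store a value without going through a task instance's xcom_push.
XCom.set(key='workspace',
         value='/data/run-001',
         execution_date=execution_date,
         task_id='pipeline_setup',
         dag_id='my_dag')

value = XCom.get_one(execution_date=execution_date,
                     key='workspace',
                     task_id='pipeline_setup',
                     dag_id='my_dag')
```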

Thanks,
Song

From: Song Liu <song...@outlook.com>
Sent: May 9, 2018 7:22
To: dev@airflow.incubator.apache.org
Subject: About the difference between xcom and variable

Hi,

For cross-task communication, there are two options:
- xcom
- variable

If I make sure the key of the variable is globally unique (such as a uuid), 
are there any other limitations or cons to using a variable instead of xcom? 
XCom requires specifying the task_id, which is not very convenient compared 
with a variable.

Any information is appreciated.

Thanks,
Song


Interesting things about how to know it's a DAG file

2018-05-10 Thread Song Liu
Hi,

I just created a custom Dag class named "MyPipeline" by extending the "DAG" 
class, but Airflow fails to identify my file as a DAG file.

After digging into the Airflow implementation, around the dag_processing.py file:

```
# Heuristic that guesses whether a Python file contains an
# Airflow DAG definition.
might_contain_dag = True
if safe_mode and not zipfile.is_zipfile(file_path):
    with open(file_path, 'rb') as f:
        content = f.read()
    might_contain_dag = all(
        [s in content for s in (b'DAG', b'airflow')])
```

So if the keywords "DAG" and "airflow" are both contained in the file, it is 
treated as a DAG file.
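So a workaround is just to make sure both byte strings appear somewhere in the 
file; a minimal sketch, assuming MyPipeline lives in a my_pipeline module (the 
module and arguments are placeholders):

```
# This file mentions airflow and DAG in this comment so the safe-mode
# heuristic picks it up even though only the subclass is used below.
from datetime import datetime

from my_pipeline import MyPipeline  # hypothetical subclass of airflow's DAG

dag = MyPipeline(dag_id='my_pipeline',
                 start_date=datetime(2018, 5, 1),
                 schedule_interval='@daily')
```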

Is there any other, more scientific way to do this?

Thanks,
Song


About the difference between xcom and variable

2018-05-09 Thread Song Liu
Hi,

For cross-task communication, there are two options:
- xcom
- variable

If I make sure the key of the variable is globally unique (such as a uuid), 
are there any other limitations or cons to using a variable instead of xcom? 
XCom requires specifying the task_id, which is not very convenient compared 
with a variable.

Any information is appreciated.

Thanks,
Song


About how to pause the running task

2018-04-26 Thread Song Liu
Hi,

A DAG is composed of many tasks. When the DAG is started, how can I pause the 
currently running task?

Thanks,
Song


About the project support in Airflow

2018-04-24 Thread Song Liu
Hi,

Basically, DAGs are created for a project purpose, so if I have many different 
projects, will Airflow support a Project concept and organize them separately?

Is this a known requirement, or is there already a plan for it?

Thanks,
Song


Re: Re: About multi user support in airflow

2018-02-27 Thread Song Liu
Hi Chris & Joy,

About the access control feature in the new webserver: is it a kind of filter 
mechanism based on user rights? Will the dag storage be separated per user as 
well?

Also, for multi-user support, how is the execution space isolated between 
users? For example, what if one user executes a bash command that deletes 
another user's data?

Or, from the design perspective, is airflow aiming to serve multiple users?

Thanks,
Song

From: Chris Riccomini <criccom...@apache.org>
Sent: February 2, 2018 15:28
To: dev@airflow.incubator.apache.org
Cc: Trent Robbins
Subject: Re: Re: About multi user support in airflow

@song, you might also be interested in the work that Joy Gao is doing to
add more access controls to the UI:

https://github.com/wepay/airflow-webserver

On Thu, Feb 1, 2018 at 8:12 PM, Song Liu <song...@outlook.com> wrote:

> Hi Trent,
>
> One way is to deploy standalone airflow components (scheduler, worker,
> broker, mysql, etc.) in docker for every single user, but there needs to
> be only one web service for user access; that would mean the web service
> should interact with multiple workers/schedulers for individual users. Do
> I understand correctly?
>
> For Flask, how could the multi-user feature be supported? Could you help
> clarify?
>
> Also, could you help share the reason/design behind why airflow doesn't
> support a multi-user feature?
>
> Thanks,
> Song
> 
> From: Trent Robbins <robbi...@gmail.com>
> Sent: February 2, 2018 3:50
> To: dev@airflow.incubator.apache.org
> Subject: Re: About multi user support in airflow
>
> Hi Song,
>
> I would recommend using airflow in docker with individual databases for
> individual contributors. Does this make sense for your needs? There are
> other similar concepts that could also be used.
>
> I don't think there is another way to split DAGs by user permission.
> However, this feature doesn't seem particularly challenging to implement
> since it uses Flask. But who knows!
>
> Trent
>
> On Thu, Feb 1, 2018 at 19:24 Song Liu <song...@outlook.com> wrote:
>
> > Hi,
> >
> > In production, multi-user support is needed, which means that every user
> > could log in to the airflow platform and manage their own dags
> > separately.
> >
> > But currently it seems that all the dags are managed in a single dags
> > folder, so what do I need to do for multi-user support?
> >
> > Many thanks for helping out.
> >
> > Thanks,
> > Song
> >
> --
> (Sent from cellphone)
>


Re: About multi user support in airflow

2018-02-01 Thread Song Liu
Hi Trent,

Thanks for your sharing.

Multi-tenancy

You can filter the list of dags in webserver by owner name when authentication 
is turned on by setting webserver:filter_by_owner in your config. With this, a 
user will see only the dags which it is owner of, unless it is a superuser.

[webserver]
filter_by_owner = True


It seems this could be achieved with filter_by_owner; I will try whether it is 
already usable.

But I still have more questions about dag management:

1. For a cluster deployment, what is the better way to sync dags between the 
scheduler and the workers, and why isn't this sync mechanism built in?
2. One real case: when I add a new dag into the dags folder, I need to refresh 
many times via the web UI to see it, and a notice appears that the scheduler 
doesn't seem to see this dag locally. I want to know what the scheduler does 
when a new dag is added; will it parse only the newly added dag?

Thanks again.

Thanks,
Song

From: Trent Robbins <robbi...@gmail.com>
Sent: February 2, 2018 4:19
To: Song Liu
Cc: dev@airflow.incubator.apache.org
Subject: Re: About multi user support in airflow

Hi Song,

It looks like there is a basic "owner" DAG filter feature already available but 
I do not know how detailed the implementation is: 
https://airflow.apache.org/security.html

I can't speak for the developers but I assume that a fully built out feature 
hasn't been an extremely high priority for anyone or perhaps there is more code 
support than I know about and it isn't fully documented.

From a quick scan of the project I see that flask-login is installed during 
setup although that doesn't necessarily mean that it plays a role in these 
features.

Best,
Trent

On Thu, Feb 1, 2018 at 20:12 Song Liu <song...@outlook.com> wrote:
Hi Trent,

One way is to deploy standalone airflow components (scheduler, worker, broker, 
mysql, etc.) in docker for every single user, but there needs to be only one 
web service for user access; that would mean the web service should interact 
with multiple workers/schedulers for individual users. Do I understand 
correctly?

For Flask, how could the multi-user feature be supported? Could you help 
clarify?

Also, could you help share the reason/design behind why airflow doesn't 
support a multi-user feature?

Thanks,
Song

From: Trent Robbins <robbi...@gmail.com>
Sent: February 2, 2018 3:50
To: dev@airflow.incubator.apache.org
Subject: Re: About multi user support in airflow

Hi Song,

I would recommend using airflow in docker with individual databases for
individual contributors. Does this make sense for your needs? There are
other similar concepts that could also be used.

I don't think there is another way to split DAGs by user permission.
However, this feature doesn't seem particularly challenging to implement
since it uses Flask. But who knows!

Trent

On Thu, Feb 1, 2018 at 19:24 Song Liu <song...@outlook.com> wrote:

> Hi,
>
> In production, multi-user support is needed, which means that every user
> could log in to the airflow platform and manage their own dags
> separately.
>
> But currently it seems that all the dags are managed in a single dags
> folder, so what do I need to do for multi-user support?
>
> Many thanks for helping out.
>
> Thanks,
> Song
>
--
(Sent from cellphone)


About multi user support in airflow

2018-02-01 Thread Song Liu
Hi,

In production, multi-user support is needed, which means that every user could 
log in to the airflow platform and manage their own dags separately.

But currently it seems that all the dags are managed in a single dags folder, 
so what do I need to do for multi-user support?

Many thanks for helping out.

Thanks,
Song


How to trigger the dag to run immediately

2018-02-01 Thread Song Liu
Hi,

It seems that the airflow scheduler is based on the start_date and interval, 
but is it possible to trigger a dag on demand, by user request, immediately? 
For example, when a user clicks a "Run" button, the dag starts right away and 
stops when finished.
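For reference, a minimal sketch of on-demand triggering with the Airflow 1.x 
CLI, where my_dag_id is a placeholder:

```
airflow trigger_dag my_dag_id
```

This creates an externally triggered DagRun that the scheduler picks up on its 
next loop; a dag defined with schedule_interval=None runs only when triggered 
this way.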

Thanks for any information!

Thanks,
Song