Hey Paul,

I believe what you're asking for is subdag-worker affinity. As far as I
know, this hasn't been implemented.

You might be able to use pools and assign only a single worker to the
pool, though I haven't tried this. It also risks limiting your throughput,
since the pool-to-worker mapping would always be to just that one worker
(which is what keeps the subdag's tasks from being load-balanced onto
other workers).
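For illustration, here's a minimal, untested sketch. Note I'm swapping in
Airflow's queue argument (a routing mechanism related to the pool idea
above): every task in the subdag is pinned to one queue, and a single
worker is started as the only consumer of that queue. The factory
function, task names, and the dedicated_q queue name are all made up:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    def make_subdag(parent_dag_id, child_id, default_args):
        # Build the copy-in -> calculate -> copy-out shape you describe.
        subdag = DAG(
            dag_id='%s.%s' % (parent_dag_id, child_id),
            default_args=default_args,
            schedule_interval=None,
        )
        # All three tasks carry the same queue, so only a worker consuming
        # that queue will pick them up. With exactly one worker on the
        # queue, they all run on the same instance.
        copy_in = BashOperator(task_id='copy_from_s3',
                               bash_command='echo copy-in',
                               queue='dedicated_q', dag=subdag)
        calc = BashOperator(task_id='run_calculation',
                            bash_command='echo calc',
                            queue='dedicated_q', dag=subdag)
        copy_out = BashOperator(task_id='copy_to_s3',
                                bash_command='echo copy-out',
                                queue='dedicated_q', dag=subdag)
        copy_in.set_downstream(calc)
        calc.set_downstream(copy_out)
        return subdag

You would then start exactly one worker that consumes only that queue
(airflow worker -q dedicated_q). The caveat is the same as with pools:
everything routed to the queue funnels through one box, so with 100+
subdags you would probably want to parameterize the queue name and run one
such worker per instance.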

Cheers,
Chris

On Thu, May 19, 2016 at 8:07 AM, Ryabchuk, Pavlo <
[email protected]> wrote:

> Hello,
>
> I am trying to make full use of Airflow's analytics and to make my tasks
> as granular as possible while still keeping the benefits of the
> CeleryExecutor. In general, I want my DAG to consist of 100+ SubDAGs that
> are distributed by Celery, but I want each SubDAG's tasks to all execute
> on the same worker instance.
> Each SubDAG is, in general: copy data from S3 to the instance -> run
> calculation -> copy the result to S3. The reason I want to split it into
> 3 tasks is to be able to measure pure calculation time and apply an SLA
> to it, and also to get better statistics on the copy operations.
> So the main question is: how do I execute the tasks of a SubDAG one
> after another on the same instance?
>
> I've come across this issue/workaround here:
> https://issues.apache.org/jira/browse/AIRFLOW-74, but I believe it won't
> solve my issue.
> If it is not supported and I am not missing some magic configuration :),
> but it could still be implemented with relatively small effort - I am in :)
>
> Best,
> Paul
>
