Hey Paul, I believe what you're asking for is subdag-worker affinity. As far as I know, this hasn't been implemented.
You might be able to use pools and assign only a single worker to the pool, but I haven't tried this. It also risks limiting your throughput, since the pool maps to just one worker (to prevent subdag tasks from being load-balanced onto other workers).

Cheers,
Chris

On Thu, May 19, 2016 at 8:07 AM, Ryabchuk, Pavlo < [email protected]> wrote:
> Hello,
>
> I am trying to make full use of Airflow's analytics and to make my tasks
> as granular as possible while still keeping the benefits of the
> CeleryExecutor. In general I want my DAG to consist of 100+ SubDAGs which
> are actually distributed by Celery, but I want all of a SubDAG's tasks to
> be executed on the same worker instance.
> A SubDAG is in general this: copy data from S3 to the instance -> run the
> calculation -> copy the result to S3. The reason I want to split it into
> three tasks is to be able to measure pure calculation time and apply an
> SLA to it, and also to get better statistics on the copy operations.
> So the main question is: how do I execute the tasks of a SubDAG one after
> another on the same instance?
>
> I've come across this issue/workaround here,
> https://issues.apache.org/jira/browse/AIRFLOW-74, but I believe it won't
> solve my issue.
> If this is not supported and I am not missing some magic configuration :)
> but it could still be implemented with relatively small effort - I am in :)
>
> Best,
> Paul
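
To make the pool idea above concrete, here is a rough, untested sketch against the 1.7-era API (exact import paths vary by Airflow version). The names "pinned_queue" and "pinned_pool" are made up for illustration: each subdag task is routed to a Celery queue that only one worker consumes, which is what pins the three steps to the same instance.

    # Sketch only -- untested. Assumed prerequisites:
    #   * a 1-slot pool named "pinned_pool" created in the Admin UI
    #   * exactly one worker started with: airflow worker -q pinned_queue
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator
    from airflow.operators.subdag_operator import SubDagOperator

    default_args = {'owner': 'airflow', 'start_date': datetime(2016, 5, 1)}


    def build_subdag(parent_dag_id, child_id, args):
        """copy-in -> calculate -> copy-out, all pinned to one worker."""
        subdag = DAG(dag_id='%s.%s' % (parent_dag_id, child_id),
                     default_args=args, schedule_interval='@daily')
        # queue=... routes each task to the single worker listening on that
        # queue; pool=... is the 1-slot pool suggested above (optional here,
        # since the dependencies already serialize the three steps).
        pin = dict(queue='pinned_queue', pool='pinned_pool', dag=subdag)
        copy_in = BashOperator(task_id='copy_from_s3',
                               bash_command='echo pull data from S3', **pin)
        calc = BashOperator(task_id='run_calculation',
                            bash_command='echo crunch numbers', **pin)
        copy_out = BashOperator(task_id='copy_to_s3',
                                bash_command='echo push result to S3', **pin)
        copy_in.set_downstream(calc)
        calc.set_downstream(copy_out)
        return subdag


    main_dag = DAG(dag_id='parent', default_args=default_args,
                   schedule_interval='@daily')

    SubDagOperator(task_id='job_0',
                   subdag=build_subdag('parent', 'job_0', default_args),
                   dag=main_dag)

Note that with a single shared queue and pool, all 100+ subdags would funnel through one worker; in practice you would need one queue (and worker) per group of subdags, which is exactly the throughput limitation mentioned above.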
