[ 
https://issues.apache.org/jira/browse/AIRFLOW-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hamid Mahmood reassigned AIRFLOW-6551:
--------------------------------------

    Assignee: Alex Lumpov

> KubernetesExecutor does not create dynamic pods for tasks inside subdag
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-6551
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6551
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor-kubernetes
>    Affects Versions: 1.10.7
>            Reporter: Hamid Mahmood
>            Assignee: Alex Lumpov
>            Priority: Major
>
> KubernetesExecutor does not create dynamic pods for tasks inside subdag.
> I am running airflow 1.10.7 on eks 1.14. I have multiple subdags operators 
> inside a main_dag. Following is the hierarchy
>  * main_dag:
>  ** subdagA
>  *** taskA1
>  *** taskA2
>  *** taskA3
>  ** subdagB
>  *** subdagB_1
>  **** taskB1
>  **** taskB2
>  ** task_main1
>  ** task_main2
>  ** task_main3
> I have tested the following three scenarios.
> *Scenario:1*
> I have set the following parameter
> {code:java}
> AIRFLOW__CORE__EXECUTOR = KubernetesExecutor
> {code}
> When I run the workflow main_dag, only a single pod is created with name 
> subdagA-
> 309c4c564b9841529236a31dfaf135c5 and all the tasks (tasksA1,taskA2,taskA3) 
> run inside that single pod.
> In theory there should be 3 separate pods for these 3 tasks inside subdagA. 
> Same is the case for subdagB, only a single pod is created to run subdagB, 
> subdagB_1 and its tasks runs inside that pod. 
> But the tasks (task_main1, task_main2, task_main3) that are not inside 
> further subdag runs in dynamic pods.
>  
> *Scenario: 2*
> I have set the following
> {code:java}
> AIRFLOW__CORE__EXECUTOR = KubernetesExecutor  
> {code}
> and passed executor=KubernetesExecutor() when creating the subdag. 
> {code:java}
> SubDagOperator( task_id=task_id, subdag=subdag, dag=parent_dag, 
> executor=KubernetesExecutor(), **kwargs )
> {code}
> Still one pod is created for SubtaskA but now the taskA1 inside subdagA got 
> stuck in queue state, when I checked the logs of this pod I got the following
> {code:java}
> Running %s on host %s <TaskInstance: main_dag.subdagA 
> 2020-01-12T02:15:00+00:00 [queued]> 
> maindagsubdagA-1a31ef3e2aef489daf1b329160646{code}
> *Scenario: 3*
> I have set the following
> {code:java}
> AIRFLOW__CORE__EXECUTOR = LocalExecutor  
> {code}
> and passed executor=KubernetesExecutor() when creating the subdag.  
> {code:java}
> SubDagOperator( task_id=task_id, subdag=subdag, dag=parent_dag, 
> executor=KubernetesExecutor(), **kwargs )
> {code}
> Now this time dynamic pods are created for single task inside subdagA and 
> subdagB and I get the parallelism. In this case KubernetesExecutor shows the 
> required behavior. But the downside of this is that these three tasks 
> task_main1, task_main2 and task_main3 will use LocalExecutor and will run in 
> scheduler pod.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to