[ 
https://issues.apache.org/jira/browse/AIRFLOW-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Schmid updated AIRFLOW-1011:
--------------------------------
    Attachment: test_subdag.py

> Task Instance Results not stored for SubDAG Tasks
> -------------------------------------------------
>
>                 Key: AIRFLOW-1011
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1011
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: backfill, subdag
>    Affects Versions: Airflow 1.8
>            Reporter: Joe Schmid
>            Priority: Critical
>         Attachments: 1-TopLevelDAGTaskInstancesShownCorrectly.png, 
> 2-ZoomedSubDAG-NoTaskInstances-v1.8.png, 
> 3-ZoomedSubDAG-TaskInstances-v1.7.1.3.png, test_subdag.py
>
>
> In previous Airflow versions, results for tasks executed inside a subdag were 
> written as rows to the task_instance table. In Airflow 1.8, only rows for tasks 
> in the top-level DAG (non-subdag tasks) seem to be written to the database.
> As a result, the UI cannot show the status of task instances inside the 
> subdag, the logs for those task instances, etc.
> Here is a simple test DAG that exhibits the issue:
> ------------------------------------------------------------------------
> from airflow.operators.dummy_operator import DummyOperator
> from airflow.operators.subdag_operator import SubDagOperator
> from airflow.models import DAG
> from datetime import datetime, timedelta
> args = {
>     'owner': 'airflow',
>     'start_date': datetime(2016, 3, 1),
> }
> DAG_NAME = 'Test_SubDAG'
> SUBDAG_OP = 'SubDagOp'
> def get_test_subdag():
>     subdag = DAG(
>         dag_id='{}.{}'.format(DAG_NAME, SUBDAG_OP), default_args=args,
>         schedule_interval="@daily")  # This is ignored, but it can't be None or @once
>     first = DummyOperator(
>         task_id='SubDAG_Task1',
>         dag=subdag
>     )
>     last = DummyOperator(
>         task_id='SubDAG_Task2',
>         dag=subdag
>     )
>     first >> last
>     return subdag
> dag = DAG(
>     dag_id=DAG_NAME, default_args=args,
>     schedule_interval=None,
>     dagrun_timeout=timedelta(hours=1))
> run_first = DummyOperator(
>     task_id='DAG_Task1',
>     dag=dag
> )
> run_subdag = SubDagOperator(
>     subdag=get_test_subdag(),
>     task_id=SUBDAG_OP,
>     dag=dag
> )
> run_last = DummyOperator(
>     task_id='DAG_Task2',
>     dag=dag
> )
> run_first >> run_subdag
> run_subdag >> run_last
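
One way to confirm the symptom described above without going through the UI is to query the metadata database directly. The sketch below is only an illustration under several assumptions: the table is named task_instance with task_id, dag_id, execution_date and state columns, the subdag's dag_id is 'Test_SubDAG.SubDagOp' as built in the test DAG, and the backend is reachable through Python's sqlite3 module. The helper names and the in-memory rows are hypothetical stand-ins that mimic the reported behaviour; they are not output from a real Airflow 1.8 run.

```python
import sqlite3

PARENT_DAG_ID = "Test_SubDAG"
SUBDAG_DAG_ID = "Test_SubDAG.SubDagOp"  # parent dag_id + "." + SubDagOperator task_id


def task_instance_rows(conn, dag_id):
    """Return (task_id, execution_date, state) for each row stored under dag_id."""
    cursor = conn.execute(
        "SELECT task_id, execution_date, state FROM task_instance "
        "WHERE dag_id = ? ORDER BY execution_date, task_id",
        (dag_id,),
    )
    return cursor.fetchall()


def make_demo_db():
    """Build an in-memory stand-in table holding only top-level rows.

    The rows are made up for illustration; they mimic the reported symptom.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("DAG_Task1", PARENT_DAG_ID, "2016-03-01 00:00:00", "success"),
            ("SubDagOp", PARENT_DAG_ID, "2016-03-01 00:00:00", "success"),
            ("DAG_Task2", PARENT_DAG_ID, "2016-03-01 00:00:00", "success"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(len(task_instance_rows(conn, PARENT_DAG_ID)), "top-level rows")
    print(len(task_instance_rows(conn, SUBDAG_DAG_ID)), "subdag rows")
```

Against a real installation, the same SELECT could be run on the actual metadata database; if the subdag query comes back empty after a run while the top-level query does not, that would match the behaviour reported here.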



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
