[
https://issues.apache.org/jira/browse/AIRFLOW-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe Schmid updated AIRFLOW-1011:
--------------------------------
Attachment: test_subdag.py
> Task Instance Results not stored for SubDAG Tasks
> -------------------------------------------------
>
> Key: AIRFLOW-1011
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1011
> Project: Apache Airflow
> Issue Type: Bug
> Components: backfill, subdag
> Affects Versions: Airflow 1.8
> Reporter: Joe Schmid
> Priority: Critical
> Attachments: 1-TopLevelDAGTaskInstancesShownCorrectly.png,
> 2-ZoomedSubDAG-NoTaskInstances-v1.8.png,
> 3-ZoomedSubDAG-TaskInstances-v1.7.1.3.png, test_subdag.py
>
>
> In previous Airflow versions, results for tasks executed inside a subdag were
> written as rows to task_instances. In Airflow 1.8, only rows for tasks in
> the top-level DAG (non-subdag tasks) seem to get written to the database.
> As a result, the UI cannot show the status or the logs of task instances
> inside the subdag.
> Here is a simple test DAG that exhibits the issue:
> ------------------------------------------------------------------------
> from airflow.operators.dummy_operator import DummyOperator
> from airflow.operators.subdag_operator import SubDagOperator
> from airflow.models import DAG
> from datetime import datetime, timedelta
>
> args = {
>     'owner': 'airflow',
>     'start_date': datetime(2016, 3, 1),
> }
>
> DAG_NAME = 'Test_SubDAG'
> SUBDAG_OP = 'SubDagOp'
>
> def get_test_subdag():
>     subdag = DAG(
>         dag_id='{}.{}'.format(DAG_NAME, SUBDAG_OP), default_args=args,
>         # This is ignored, but it can't be None or @once
>         schedule_interval="@daily")
>     first = DummyOperator(
>         task_id='SubDAG_Task1',
>         dag=subdag
>     )
>     last = DummyOperator(
>         task_id='SubDAG_Task2',
>         dag=subdag
>     )
>     first >> last
>     return subdag
>
> dag = DAG(
>     dag_id=DAG_NAME, default_args=args,
>     schedule_interval=None,
>     dagrun_timeout=timedelta(hours=1))
>
> run_first = DummyOperator(
>     task_id='DAG_Task1',
>     dag=dag
> )
> run_subdag = SubDagOperator(
>     subdag=get_test_subdag(),
>     task_id=SUBDAG_OP,
>     dag=dag
> )
> run_last = DummyOperator(
>     task_id='DAG_Task2',
>     dag=dag
> )
>
> run_first >> run_subdag
> run_subdag >> run_last
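
One way to confirm the symptom outside the UI is to look for rows stored under the subdag's dag_id, which the test DAG builds as '<parent dag_id>.<SubDagOperator task_id>'. The following is only a rough sketch and is not part of the original report: it uses an in-memory SQLite database with a simplified, assumed task_instance table (columns task_id, dag_id, execution_date, state) in place of the real Airflow metadata database, and the helper names subdag_task_rows and make_demo_db are hypothetical.

```python
import sqlite3


def subdag_task_rows(conn, parent_dag_id, subdag_task_id):
    """Return (task_id, state) rows recorded for tasks inside the subdag.

    The subdag's dag_id is assumed to follow the '<parent>.<task_id>'
    convention used by the test DAG in the report.
    """
    subdag_id = "{}.{}".format(parent_dag_id, subdag_task_id)
    cur = conn.execute(
        "SELECT task_id, state FROM task_instance "
        "WHERE dag_id = ? ORDER BY task_id",
        (subdag_id,),
    )
    return cur.fetchall()


def make_demo_db():
    """Build a stand-in database that mimics the reported 1.8 behaviour:
    only rows for the top-level DAG's tasks are present."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema; the real metadata table has more columns.
    conn.execute(
        "CREATE TABLE task_instance ("
        "task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("DAG_Task1", "Test_SubDAG", "2016-03-01", "success"),
            ("SubDagOp", "Test_SubDAG", "2016-03-01", "success"),
            ("DAG_Task2", "Test_SubDAG", "2016-03-01", "success"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    rows = subdag_task_rows(conn, "Test_SubDAG", "SubDagOp")
    # An empty list here corresponds to the symptom described in the report.
    print(rows)
```

Against a real deployment, the same SELECT could be run on the metadata database; if SubDAG_Task1 and SubDAG_Task2 are missing for a completed run, that matches the behaviour described above.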
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)