Ash Berlin-Taylor created AIRFLOW-1837:
------------------------------------------

             Summary: Differing start_dates on tasks not respected by scheduler.
                 Key: AIRFLOW-1837
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1837
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Ash Berlin-Taylor


It is possible to specify start_date directly on tasks in a DAG, as well as on 
the DAG itself. This is handled correctly when creating DAG runs, but it seems 
to be ignored when scheduling tasks.

Given this example:

{code}
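from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

dag_args = {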
    "start_date": datetime(2017, 9, 4),
}
dag = DAG(
    "my-dag",
    default_args=dag_args,
    schedule_interval="0 0 * * Mon",
)

# ...
with dag:
    # Explicit start_date: this task should be scheduled from 2014-09-01.
    op = PythonOperator(
        python_callable=fetcher.run,
        task_id="fetch_all_respondents",
        provide_context=True,
        # The "unfiltered" API calls are a lot quicker, so let's put them
        # ahead of any other filtered job in the queue.
        priority_weight=10,
        start_date=datetime(2014, 9, 1),
    )

    # No start_date here, so this task inherits 2017-09-04 from default_args.
    op = PythonOperator(
        python_callable=fetcher.run,
        task_id="fetch_by_demographics",
        op_kwargs={
            'demo_names': demo_names,
        },
        provide_context=True,
        priority_weight=5,
    )
{code}

I want only the fetch_all_respondents task to run from 2014 to 2017, and then 
from September 2017 onwards I also want the fetch_by_demographics task to run. 
However, right now both tasks are being scheduled from 2014-09-01.
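
To make the expected behaviour concrete, here is a minimal standalone sketch of 
the per-task filtering I would expect. It does not use Airflow and is not the 
scheduler's actual code; the TASK_START_DATES mapping and the tasks_due helper 
are hypothetical names made up for illustration, with the dates taken from the 
example above.

```python
from datetime import datetime

# Hypothetical stand-in for the effective start_date of each task in the
# example DAG. This is an illustration only, not Airflow's scheduler logic.
TASK_START_DATES = {
    "fetch_all_respondents": datetime(2014, 9, 1),   # set on the task
    "fetch_by_demographics": datetime(2017, 9, 4),   # inherited from default_args
}


def tasks_due(execution_date, start_dates=TASK_START_DATES):
    """Return the ids of tasks whose start_date is on or before execution_date."""
    return sorted(
        task_id
        for task_id, start in start_dates.items()
        if start <= execution_date
    )


print(tasks_due(datetime(2015, 1, 5)))   # ['fetch_all_respondents']
print(tasks_due(datetime(2017, 9, 4)))   # ['fetch_all_respondents', 'fetch_by_demographics']
```

In other words, a DAG run for a date in 2015 should contain only the 
fetch_all_respondents task instance, and fetch_by_demographics should first 
appear in the run for 2017-09-04.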



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
