Pawel Bartoszek created AIRFLOW-4404:
----------------------------------------

             Summary: Improve support of cron-style jobs
                 Key: AIRFLOW-4404
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4404
             Project: Apache Airflow
          Issue Type: Improvement
          Components: scheduler
    Affects Versions: 1.10.3
            Reporter: Pawel Bartoszek


Airflow supports cron-like jobs with one downside: on the very first 
deployment of a job (a completely new DAG), an extra DAG run is created for 
the latest elapsed period. 
When the DAG is redeployed (the DAG name stays the same), the DB already 
contains the latest run and the scheduler works like a genuine cron scheduler. 
 
To better describe what I mean I prepared an example:
 
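{code:python}
from datetime import datetime

from airflow import DAG

# Daily job at 02:00; catchup=False so that missed periods are not backfilled.
with DAG(
    dag_id="dag",
    start_date=datetime(2019, 4, 1),
    schedule_interval="0 2 * * *",
    default_view="graph",
    orientation="TB",
    concurrency=1,
    max_active_runs=1,
    catchup=False,
) as dag:
    pass  # task definitions omitted
{code}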
 
I deploy 'dag' for the first time while the system time is *2019-04-03 3 PM*.
Airflow creates a DAG run with an execution date of 2019-04-02 2 AM straight 
after the deployment. However, when a new version of 'dag' is redeployed, the 
next run is triggered according to the cron expression, i.e. with the 
redeployment done at 2019-04-03 6 PM the next DAG run will be at 
2019-04-04 2 AM.
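 
The dates in this example can be reproduced with a small standalone sketch. It 
does not use Airflow's scheduler code; the helper names are hypothetical and the 
daily "0 2 * * *" schedule is hard-coded as a one-day interval, so treat it as 
an illustration of the arithmetic only.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(days=1)  # assumption: "0 2 * * *" fires once per day


def last_fire_at_or_before(moment, hour=2):
    """Most recent 02:00 that is not later than `moment` (hypothetical helper)."""
    fire = moment.replace(hour=hour, minute=0, second=0, microsecond=0)
    return fire if fire <= moment else fire - INTERVAL


def reported_first_execution_date(deployed_at):
    """Execution date of the extra run described above for a brand-new DAG:
    the start of the latest period that has already ended."""
    return last_fire_at_or_before(deployed_at) - INTERVAL


def next_cron_fire(deployed_at):
    """When a plain cron scheduler would first fire after the deployment."""
    return last_fire_at_or_before(deployed_at) + INTERVAL


first_deploy = datetime(2019, 4, 3, 15, 0)   # 2019-04-03 3 PM
redeploy = datetime(2019, 4, 3, 18, 0)       # 2019-04-03 6 PM

print(reported_first_execution_date(first_deploy))  # 2019-04-02 02:00:00
print(next_cron_fire(redeploy))                     # 2019-04-04 02:00:00
```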
 
*The requested change*
For cron-style jobs, Airflow should behave like a Unix cron scheduler, i.e. it 
should start the very first DAG run only once the current system time has 
passed the next date matching the cron expression, so that no extra run is 
created.
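 
One possible reading of this request is a simple gate on the first run. The 
sketch below is not a patch against the Airflow scheduler; 
`should_start_first_run` and `next_fire_after` are hypothetical names, and the 
daily 02:00 schedule from the example is assumed.

```python
from datetime import datetime, timedelta


def next_fire_after(moment, hour=2):
    """First 02:00 strictly after `moment` (assumes the daily schedule above)."""
    fire = moment.replace(hour=hour, minute=0, second=0, microsecond=0)
    return fire if fire > moment else fire + timedelta(days=1)


def should_start_first_run(deployed_at, now):
    """True only once the first cron date after the deployment has been reached."""
    return now >= next_fire_after(deployed_at)


deployed = datetime(2019, 4, 3, 15, 0)  # first deployment at 3 PM

# Straight after the deployment: no run yet, unlike the current behaviour.
print(should_start_first_run(deployed, datetime(2019, 4, 3, 15, 1)))  # False
# At the next cron date: the first run starts.
print(should_start_first_run(deployed, datetime(2019, 4, 4, 2, 0)))   # True
```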
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
