Brian King created AIRFLOW-4958:
-----------------------------------

             Summary: Documentation issue with cron schedule_interval
                 Key: AIRFLOW-4958
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4958
             Project: Apache Airflow
          Issue Type: Bug
          Components: documentation, scheduler
    Affects Versions: 1.10.3
            Reporter: Brian King


The docs on scheduling with a cron expression ( 
[https://airflow.apache.org/scheduler.html#dag-runs] ) link to a Wikipedia 
article on cron ( [https://en.wikipedia.org/wiki/Cron#CRON_expression] ), which 
says that the expression consists of 5 or 6 fields, with the optional 6th field 
being the year.

However, croniter, which is used by Airflow, treats the 6th field as seconds ( 
[https://github.com/taichino/croniter/issues/76#issuecomment-332508039] ).
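
To make the ambiguity concrete, here is a minimal sketch in plain Python (no croniter or Airflow imports) that labels the fields of one expression under both readings. The helper name label_fields and its sixth parameter are hypothetical, chosen only for illustration; this is not croniter's API.

```python
# Hypothetical helper for illustration; not part of croniter or Airflow.
FIVE_FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def label_fields(expression, sixth):
    """Pair each cron field with a name; `sixth` names an optional 6th field."""
    parts = expression.split()
    if len(parts) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got %d" % len(parts))
    names = FIVE_FIELDS + [sixth]
    # zip stops at the shorter sequence, so a 5-field expression gets no 6th label.
    return dict(zip(names, parts))


# The same string under the two readings described above.
as_year = label_fields("10 2 * * * *", sixth="year")      # Wikipedia article
as_second = label_fields("10 2 * * * *", sixth="second")  # croniter, per the linked comment
print(as_year)
print(as_second)
```

Under the first reading the trailing * means "every year"; under the second it means "every second", which is a very different schedule.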

Perhaps the docs should link to the croniter documentation instead of the 
Wikipedia article, or the Airflow documentation should make clear that the 6th 
field is optional and, if used, represents seconds.

I had a quick-running job (one that finished in less than a minute) that 
executed twice a day instead of once.  The DAG was configured with 
schedule_interval = "10 2 * * * *" and max_active_runs = 1.  The first run 
started around 2:10:00 and the second around 2:10:45.  Removing the sixth field 
resulted in the job running only once per day, as I wanted.
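
A minimal sketch of why the sixth field could produce more than one run, assuming it is read as seconds. This is a deliberately simplified stand-in written with the standard library only, not croniter itself: the helpers matches and fire_times are hypothetical, handle only "*" or a single integer per field, and ignore the day and month fields. The date is an arbitrary example.

```python
from datetime import datetime, timedelta


def matches(field, value):
    """Support only '*' or a single integer; real cron syntax is richer."""
    return field == "*" or int(field) == value


def fire_times(expression, start, end):
    """List datetimes in [start, end) that match minute, hour, and an optional
    6th field read as seconds. Day and month fields are assumed to be '*'."""
    parts = expression.split()
    minute, hour = parts[0], parts[1]
    # With only 5 fields, assume the schedule fires at second 0 of the minute.
    second = parts[5] if len(parts) == 6 else "0"
    hits = []
    current = start
    while current < end:
        if (matches(minute, current.minute)
                and matches(hour, current.hour)
                and matches(second, current.second)):
            hits.append(current)
        current += timedelta(seconds=1)
    return hits


start = datetime(2019, 7, 15, 2, 0, 0)  # arbitrary example date
end = datetime(2019, 7, 15, 3, 0, 0)
six = fire_times("10 2 * * * *", start, end)
five = fire_times("10 2 * * *", start, end)
print(len(six), six[0], six[-1])
print(len(five), five[0])
```

In this simplified model the six-field expression has a schedule point at every second from 2:10:00 through 2:10:59, while the five-field one has a single point at 2:10:00. That would be consistent with a second run starting at about 2:10:45, once the first run had finished and max_active_runs = 1 allowed another to start, though I have not confirmed this against the scheduler code.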



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
