I have been experimenting with Airflow (v2.6.1) settings such as
catchup, start_date, and end_date, but I can't achieve the effect I am
after. So here is my question.

Scenario
I want to process a period of time in the past, but have the runs
execute only at specific times in the future. For instance, I want to
backfill the data between start date 2023-01-01 00:00:00 and end date
2023-01-05 22:00:00. However, I also need the dag to run only within a
specific time frame: every day from 22 to 23 and the next day from 0
to 2. All dates and timestamps are in UTC.

My attempt (the code may not be correct because I do not have the
source at hand)
from datetime import datetime

from airflow import DAG

args = {
    "start_date": datetime(2023, 1, 1, 0, 0, 0),
    "end_date": datetime(2023, 1, 5, 22, 0, 0),
    # ...
}

dag = DAG(
    "my_dag",
    default_args=args,
    catchup=True,  # I also tested with False
    schedule="*/10 22-23,0-2 * * *",
    # ...
)

my_task(dag) >> another_task(dag)
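
To make the schedule concrete, here is a small standard-library sketch
(no Airflow needed) that lists the times the cron expression
"*/10 22-23,0-2 * * *" fires on a single UTC day. The helper name
cron_ticks_for_day is made up for illustration, and the minute and hour
fields are written out by hand rather than parsed, so treat it as an
illustration of the cron semantics only, not of how Airflow schedules runs.

```python
from datetime import date, datetime, timezone

# Fields of "*/10 22-23,0-2 * * *", written out by hand (not a cron parser).
MINUTES = range(0, 60, 10)   # */10      -> 0, 10, 20, 30, 40, 50
HOURS = [0, 1, 2, 22, 23]    # 22-23,0-2 -> hours 22, 23, 0, 1, 2


def cron_ticks_for_day(day: date) -> list[datetime]:
    """Return every UTC time the schedule fires on the given calendar day."""
    return [
        datetime(day.year, day.month, day.day, hour, minute, tzinfo=timezone.utc)
        for hour in HOURS
        for minute in MINUTES
    ]


ticks = cron_ticks_for_day(date(2023, 1, 1))
print(len(ticks))            # 30 (5 hours x 6 ticks per hour)
print(ticks[0].isoformat())  # 2023-01-01T00:00:00+00:00
print(ticks[-1].isoformat()) # 2023-01-01T23:50:00+00:00
```

Note that the hour field 0-2 includes hour 2, so the last tick of the
night window is 02:50, and that within one calendar day the 0-2 block
comes before the 22-23 block.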

The problem I encountered
With catchup=False, the dag does not run at all. With catchup=True,
the dag runs immediately, which is what I want to avoid. What I
actually want is for the dag to run only during the specific time
frame (every 10 minutes from 22 to 23 and from 0 to 2 the next day),
every day after it is deployed to the Airflow server.

In that case, how should I configure the dag to achieve the effect I
am looking for? Please let me know if my explanation is not clear. I
appreciate any suggestions and advice.

Many thanks
