bobjo-daangn opened a new issue, #30195:
URL: https://github.com/apache/airflow/issues/30195

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   Airflow version : v2.4.2   
   
   We run identical DAGs for the UK, Canada, South Korea, and Japan. They all 
do the same work, and each DAG is scheduled in the timezone of its respective 
country.  
   
   The DAGs use a bi-hourly schedule interval, so each one runs once every 
two hours.
   - Each DAG runs on the even hours of its own timezone.
   
   #### example
   ```py
   import datetime

   import pendulum
   from airflow import DAG
   from airflow.operators.latest_only import LatestOnlyOperator

   BST = pendulum.timezone('Europe/London')  # +0/1
   KST = pendulum.timezone('Asia/Seoul')  # +9
   JST = pendulum.timezone('Asia/Tokyo')  # +9
   EDT = pendulum.timezone('America/Toronto')  # -5/-4
   
   with DAG(
       'test_dag_ca',
       start_date=datetime.datetime(2023, 3, 11, tzinfo=EDT),
       schedule_interval='0 */2 * * *',  # every two hours, on the even hours
   ) as dag:
       # First task in the DAG
       latest_only = LatestOnlyOperator(
           task_id='latest_only',
       )
       ....
   ```
   When North American Daylight Saving Time started on 3/12, we experienced an 
issue with our Canada DAG.
   It was scheduled normally up to 2023-03-12 07:00 UTC (2023-03-12 03:00 
EDT).
   
   Then, at 2023-03-12 08:00 UTC, two runs started at the same time, and 
the `latest_only` task, which is the first task in the DAG, skipped its run.
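
The one-hour spacing between the 07:00 UTC and 08:00 UTC runs can be reproduced outside Airflow. The following is a minimal sketch using only the Python standard library (`zoneinfo`, so it assumes Python 3.9+ and an available tz database). It is not Airflow's scheduling code, and the helper name `even_hour_ticks` is made up for illustration. It lists the even wall-clock hours of 2023-03-12 in `America/Toronto`, converts each one to UTC, and flags wall-clock times that do not exist because of the spring-forward gap.

```python
import datetime
from zoneinfo import ZoneInfo

TORONTO = ZoneInfo("America/Toronto")
UTC = datetime.timezone.utc


def even_hour_ticks(day, tz):
    """Return (wall-clock label, UTC instant, exists) for each even local hour of `day`."""
    ticks = []
    for hour in range(0, 24, 2):
        local = datetime.datetime(day.year, day.month, day.day, hour, tzinfo=tz)
        as_utc = local.astimezone(UTC)
        # A wall-clock time inside a DST gap does not survive a round trip through UTC.
        back = as_utc.astimezone(tz)
        exists = back.replace(tzinfo=None) == local.replace(tzinfo=None)
        ticks.append((local.strftime("%H:%M"), as_utc, exists))
    return ticks


ticks = even_hour_ticks(datetime.date(2023, 3, 12), TORONTO)
for label, as_utc, exists in ticks:
    note = "" if exists else "  (nonexistent local time)"
    print(f"{label} local -> {as_utc:%Y-%m-%d %H:%M} UTC{note}")
```

In this sketch the 02:00 local tick falls in the gap and maps to 07:00 UTC, while the 04:00 local tick maps to 08:00 UTC, which matches the two timestamps in this report. Whether this is what confuses the scheduler is an assumption, not something the report confirms.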
   
   #### 2023-03-12 07:00 UTC
   <img width="699" alt="image" src="https://user-images.githubusercontent.com/111758684/226301253-179ad526-f076-4cc7-8439-222f57e327be.png">
   
   #### 2023-03-12 08:00 UTC (1)
   <img width="698" alt="image" src="https://user-images.githubusercontent.com/111758684/226301287-0e88d890-e1c8-4191-ae7f-7c0172c47421.png">
   
   #### 2023-03-12 08:00 UTC (2)
   <img width="697" alt="image" src="https://user-images.githubusercontent.com/111758684/226301306-974c4fff-a673-4cda-b272-8b7df85cd8b5.png">
   
   
   After that, no further runs were scheduled, and we only noticed the problem 
when the DAGs that sense this DAG timed out.
   
   In the Web UI, we saw that Last Run and Next Run had the same value.
   <img width="697" alt="image" src="https://user-images.githubusercontent.com/111758684/226301424-148edbc2-3a2b-4972-a730-86fa62fc737a.png">
  
   
   - I thought the problem was that the Next Run value was earlier than the 
Last Run value, so I adjusted the following columns in the `dag` table of the 
metadata DB, and scheduling resumed.
        - next_dagrun 
        - next_dagrun_create_after 
        - next_dagrun_data_interval_start
        - next_dagrun_data_interval_end
   
   
   
   
   
   
   ### What you think should happen instead
   
   With hourly and daily DAGs, scheduling worked fine.
   I assumed that the bi-hourly DAG would behave the same way.
   
   In fact, at the end of summer time, the bi-hourly DAG did not have this 
problem.
   
   I expected it to adjust by either being scheduled an hour earlier or 
being skipped once.
   
   ### How to reproduce
   
   Run the code below and compare the two DAGs (CA, KR).   
   The Canada DAG will not be scheduled after 2023-03-12 07:00 UTC.
   ```py
   import datetime
   
   import pendulum
   
   from airflow import DAG
   from airflow.operators.empty import EmptyOperator
   
   EDT = pendulum.timezone('America/Toronto')  # -5/-4
   KST = pendulum.timezone('Asia/Seoul')  # +9
   
   
   class DagSpec:
       def __init__(
           self,
           dag_id: str,
           start_date: datetime.datetime,
           schedule_interval: str = '0 */2 * * *',
       ):
           self.dag_id = dag_id
           self.start_date = start_date
           self.schedule_interval = schedule_interval
   
   
   DAG_SPECS = [
       DagSpec(
           'test_summer_time_ca',
           datetime.datetime(2023, 3, 11, tzinfo=EDT),
       ),
       DagSpec(
           'test_summer_time_test_kr',
           datetime.datetime(2023, 3, 11, tzinfo=KST),
       ),
   ]
   
   for dag_spec in DAG_SPECS:
       with DAG(
           dag_spec.dag_id,
           start_date=dag_spec.start_date,
           schedule_interval=dag_spec.schedule_interval,
       ) as dag:
   
           start = EmptyOperator(task_id='start')
           end = EmptyOperator(task_id='end')
   
           start >> end
   
   ```
   
   ### Operating System
   
   Amazon Linux 2 5.4.228-131.415.amzn2.x86_64
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   This happened once at the beginning of daylight saving time.
   
   We did not experience the same issue at the end of daylight saving time.
   
   We expect to see the same issue when UK Daylight Saving Time starts in the 
near future.
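
To judge which of the other timezones might be exposed, one option is to check whether any even-hour wall-clock time falls inside a DST gap for each zone. This is a standard-library sketch only. The function name `nonexistent_even_hours` is hypothetical, and it assumes (without confirmation from this report) that a cron tick landing on a nonexistent local time is the relevant condition.

```python
import datetime
from zoneinfo import ZoneInfo

UTC = datetime.timezone.utc


def nonexistent_even_hours(zone_name, year):
    """List even-hour wall-clock times in `year` that fall inside a DST gap of the zone."""
    tz = ZoneInfo(zone_name)
    missing = []
    day = datetime.date(year, 1, 1)
    while day.year == year:
        for hour in range(0, 24, 2):
            naive = datetime.datetime(day.year, day.month, day.day, hour)
            # Times inside a gap change when converted to UTC and back.
            round_trip = naive.replace(tzinfo=tz).astimezone(UTC).astimezone(tz)
            if round_trip.replace(tzinfo=None) != naive:
                missing.append(naive)
        day += datetime.timedelta(days=1)
    return missing


# The four zones used by the DAGs in this report.
for zone in ("America/Toronto", "Europe/London", "Asia/Seoul", "Asia/Tokyo"):
    print(zone, nonexistent_even_hours(zone, 2023))
```

An empty list for a zone only means that no even-hour tick lands in a gap there. It does not by itself show that the scheduler is unaffected in that zone.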
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.