[ 
https://issues.apache.org/jira/browse/AIRFLOW-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7039:
-------------------------------------

    Assignee:     (was: Jarek Potiuk)

> Specific DAG Schedule & DST Results in Skipped DAG Run
> ------------------------------------------------------
>
>                 Key: AIRFLOW-7039
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-7039
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.7
>         Environment: Amazon Linux 2 AMI
>            Reporter: Peter Kim
>            Priority: Critical
>              Labels: timezone
>
> *Scenario:* 
> EC2 running airflow is in Eastern Time (America/New_York), 
> airflow.cfg>[core]>default_timezone=America/New_York (automatically changes 
> correctly)
> Monday morning after Daylight Savings Time applied a handful of DAG runs were 
> not executed as expected.  The strange part is that these DAGs were the only 
> jobs that did not behave as expected, all other DAGs ran normally.  
> Additionally, only the first expected run after DST was skipped, subsequent 
> runs later that day were scheduled successfully.
> Here is the pattern observed:
> DAG Schedule which skipped first run:  (0 <hour_a>,<hour_b> * * 1,2,3,4,5)
> e.g. Schedules M-F, with two distinct runs per day.
> DAGs that run at one time, M-F & DAGs that run at two times, not M-F did not 
> experience this issue.  
>  
> Based on the logs, it appears as if the expected run that was missed was not 
> seen by the scheduler whatsoever (see below):
>  
>  
> 2020 03 06 6:30 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 06:31:01,220] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,220] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=697
> [2020-03-06 06:31:01,222] \{scheduler_job.py:153} INFO - Started process 
> (PID=697) to work on /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 06:31:01,228] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py for tasks to queue
> [2020-03-06 06:31:01,228] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,228] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 06:31:01,238] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['<DAG_NAME>']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 06:31:01,305] \{scheduler_job.py:1262} INFO - Processing 
> <DAG_NAME>
> [2020-03-06 06:31:01,348] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'") result = 
> self._query(query)
> [2020-03-06 06:31:01,362] \{scheduler_job.py:1272} INFO - Created <DagRun 
> <DAG_NAME> @ 2020-03-05T15:30:00+00:00: scheduled__2020-03-05T15:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 06:31:01,366] \{scheduler_job.py:740} INFO - Examining DAG run 
> <DagRun <DAG_NAME> @ 2020-03-05 15:30:00+00:00: 
> scheduled__2020-03-05T15:30:00+00:00, externally triggered: False>
> [2020-03-06 06:31:01,389] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for <DAG: <DAG_NAME>> because no tasks in DAG have SLAs
> [2020-03-06 06:31:01,395] \{scheduler_job.py:1613} INFO - Creating / updating 
> <TaskInstance: <DAG_NAME>.<TASK_NAME> 2020-03-05 15:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-06 06:31:01,414] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py took 0.192 seconds
> 20200306 10 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 10:30:00,083] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,082] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=16194
> [2020-03-06 10:30:00,085] \{scheduler_job.py:153} INFO - Started process 
> (PID=16194) to work on /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 10:30:00,090] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py for tasks to queue
> [2020-03-06 10:30:00,090] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,090] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 10:30:00,099] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['<DAG_NAME>']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-06 10:30:00,159] \{scheduler_job.py:1262} INFO - Processing 
> <DAG_NAME>
> [2020-03-06 10:30:00,193] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'")
>   result = self._query(query)
> [2020-03-06 10:30:00,207] \{scheduler_job.py:1272} INFO - Created <DagRun 
> <DAG_NAME> @ 2020-03-06T11:30:00+00:00: scheduled__2020-03-06T11:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 10:30:00,212] \{scheduler_job.py:740} INFO - Examining DAG run 
> <DagRun <DAG_NAME> @ 2020-03-06 11:30:00+00:00: 
> scheduled__2020-03-06T11:30:00+00:00, externally triggered: False>
> [2020-03-06 10:30:00,232] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for <DAG: <DAG_NAME>> because no tasks in DAG have SLAs
> [2020-03-06 10:30:00,236] \{scheduler_job.py:1613} INFO - Creating / updating 
> <TaskInstance: <DAG_NAME>.<TASK_NAME> 2020-03-06 11:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-06 10:30:00,251] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py took 0.166 seconds
>  
> 2020 03 09 6:30 ET (AFTER DST, *UNEXPECTED BEHAVIOR*):
> [2020-03-09 06:31:55,240] \{logging_mixin.py:112} INFO - [2020-03-09 
> 06:31:55,240] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=23336
> [2020-03-09 06:31:55,242] \{scheduler_job.py:153} INFO - Started process 
> (PID=23336) to work on /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 06:31:55,247] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py for tasks to queue
> [2020-03-09 06:31:55,247] \{logging_mixin.py:112} INFO - [2020-03-09 
> 06:31:55,247] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 06:31:55,257] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['<DAG_NAME>']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 06:31:55,312] \{scheduler_job.py:1262} INFO - Processing 
> <DAG_NAME>
> [2020-03-09 06:31:55,338] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for <DAG: <DAG_NAME>> because no tasks in DAG have SLAs
> [2020-03-09 06:31:55,343] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py took 0.101 seconds
> 20200309 10 AM ET (AFTER DST, EXPECTED BEHAVIOR):
> [2020-03-09 10:30:54,487] \{logging_mixin.py:112} INFO - [2020-03-09 
> 10:30:54,487] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=6421
> [2020-03-09 10:30:54,490] \{scheduler_job.py:153} INFO - Started process 
> (PID=6421) to work on /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 10:30:54,498] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py for tasks to queue
> [2020-03-09 10:30:54,499] \{logging_mixin.py:112} INFO - [2020-03-09 
> 10:30:54,499] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 10:30:54,508] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['<DAG_NAME>']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py
> [2020-03-09 10:30:54,585] \{scheduler_job.py:1262} INFO - Processing 
> <DAG_NAME>
> [2020-03-09 10:30:54,621] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'")
>   result = self._query(query)
> [2020-03-09 10:30:54,634] \{scheduler_job.py:1272} INFO - Created <DagRun 
> <DAG_NAME> @ 2020-03-09T10:30:00+00:00: scheduled__2020-03-09T10:30:00+00:00, 
> externally triggered: False>
> [2020-03-09 10:30:54,638] \{scheduler_job.py:740} INFO - Examining DAG run 
> <DagRun <DAG_NAME> @ 2020-03-09 10:30:00+00:00: 
> scheduled__2020-03-09T10:30:00+00:00, externally triggered: False>
> [2020-03-09 10:30:54,660] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for <DAG: <DAG_NAME>> because no tasks in DAG have SLAs
> [2020-03-09 10:30:54,665] \{scheduler_job.py:1613} INFO - Creating / updating 
> <TaskInstance: <DAG_NAME>.<TASK_NAME> 2020-03-09 10:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-09 10:30:54,683] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/<DAG_NAME>.py took 0.193 seconds



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to