mattinbits opened a new issue #21171: URL: https://github.com/apache/airflow/issues/21171
### Apache Airflow version 2.2.2 ### What happened Airflow schedules a task an hour earlier than expected, when using an MSSQL metadata database where the DB server is set to the CET timezone. The screenshot below shows the DAG starting an hour before the end of the data interval. <img width="220" alt="image" src="https://user-images.githubusercontent.com/3765307/151439921-43b2ac5b-3e93-4337-a311-82e694ad3084.png"> ### What you expected to happen Airflow schedules the task at the correct time in UTC. ### How to reproduce It's hard to describe a complete reproducible method since it relies on having an MSSQL Server with particular settings. A relevant DAG would be a simple as: ``` with DAG( dag_id="example_dag", start_date=datetime(2021, 1, 1), schedule_interval="0 9 * * 1-5", ) as dag: task = DummyOperator(task_id="dummy") ``` And Airflow config of: ``` default_timezone = utc ``` This DAG would then be scheduled an hour earlier than expected. ### Operating System Redhat UBI 8 ### Versions of Apache Airflow Providers _No response_ ### Deployment Other Docker-based deployment ### Deployment details Airflow scheduler and webserver each running in a docker container based on Redhat UBI 8. Metadata DB is MSSQL Server running on a Windows Server where the server timezone is CET. ### Anything else In our installation, the problem is happening for any DAG with a UTC based schedule. I believe the root cause is this line of code: https://github.com/apache/airflow/blob/6405d8f804e7cbd1748aa7eed65f2bbf0fcf022e/airflow/models/dag.py#L2872 On MSSQL, `func.now()` appears to correspond to `GETDATE()`, which returns the current time in the timezone of the DB server. But `next_dagrun_create_after` is stored in the database as UTC (in a `datetime2` column, which doesn't include timezone information). So this line of code is equivalent to "Is the current time in CET before the next creation time in UTC?", meaning that a DAG that should start at 09:00 UTC starts at 09:00 CET instead, one hour early. I can verify that `func.now()` returns CET with the SQLAlchemy code `engine.execute(sa.select([sa.func.now()])).fetchall()`. I think the correct way to get the current time in UTC on MSSQL is `GETUTCDATE()`. We ran Airflow 1.10 previously without seeing this problem. From what I can tell, in that version the date comparison is done on the application side rather than in the DB. ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
