millin opened a new pull request, #38139:
URL: https://github.com/apache/airflow/pull/38139
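When the `start_date` of a DAG uses a fixed-offset timezone, the timezone is
serialised as the object's string representation rather than a value that can
be deserialised. The scheduler then crashes with `InvalidTimezone` when it
loads the DAG from the `serialized_dag` table.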
**Example of an entry in the `data` column of the table `serialized_dag`**
before fix:
```json
{
...
"start_date": 1684684800.0,
"timezone": "FixedTimezone(28800, name=\"+08:00\")",
...
}
```
after fix:
```json
{
...
"start_date": 1684684800.0,
"timezone": 28800,
...
}
```
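The difference between the two entries can be sketched with the standard library alone. This is a minimal illustration of storing a fixed offset as integer seconds and rebuilding it; it does not use pendulum or Airflow's serializer, and `encode_fixed_timezone` and `decode_fixed_timezone` are hypothetical names, not the functions changed in this PR.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def encode_fixed_timezone(tz: tzinfo) -> int:
    """Hypothetical helper: reduce a fixed-offset timezone to its UTC offset in seconds."""
    offset = tz.utcoffset(None)
    if offset is None:
        # Zones whose offset depends on the date cannot be reduced to one integer.
        raise ValueError("timezone has no fixed UTC offset")
    return int(offset.total_seconds())


def decode_fixed_timezone(value: int) -> tzinfo:
    """Hypothetical helper: rebuild a fixed-offset timezone from integer seconds."""
    return timezone(timedelta(seconds=value))


# +08:00, the offset from the example entry above.
plus_eight = timezone(timedelta(hours=8))

encoded = encode_fixed_timezone(plus_eight)  # a plain integer, safe to store as JSON
restored = decode_fixed_timezone(encoded)

# The round trip preserves the offset, and midnight on 2023-05-22 at +08:00
# matches the `start_date` timestamp shown in the entry.
assert encoded == 28800
assert restored.utcoffset(None) == timedelta(hours=8)
assert datetime(2023, 5, 22, tzinfo=restored).timestamp() == 1684684800.0
```

An integer offset can be decoded without any timezone database lookup, whereas a string such as `FixedTimezone(28800, name="+08:00")` is treated as a timezone name and fails to resolve, as the crash log shows.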
<details>
<summary>Crash log</summary>
```python
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/scheduler_command.py", line 52, in _run_scheduler_job
    run_job(job=job_runner.job, execute_callable=job_runner._execute)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 79, in wrapper
    return func(*args, session=session, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/job.py", line 393, in run_job
    return execute_job(job, execute_callable=execute_callable)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/job.py", line 422, in execute_job
    ret = execute_callable()
          ^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/scheduler_job_runner.py", line 855, in _execute
    self._run_scheduler_loop()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/scheduler_job_runner.py", line 987, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/scheduler_job_runner.py", line 1061, in _do_scheduling
    self._create_dagruns_for_dags(guard, session)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/retries.py", line 91, in wrapped_function
    for attempt in run_with_db_retries(max_retries=retries, logger=logger, **retry_kwargs):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 347, in __iter__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/retries.py", line 100, in wrapped_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/scheduler_job_runner.py", line 1133, in _create_dagruns_for_dags
    self._create_dag_runs(non_dataset_dags, session)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/jobs/scheduler_job_runner.py", line 1167, in _create_dag_runs
    dag = self.dagbag.get_dag(dag_model.dag_id, session=session)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 76, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/dagbag.py", line 191, in get_dag
    self._add_dag_from_db(dag_id=dag_id, session=session)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/dagbag.py", line 273, in _add_dag_from_db
    dag = row.dag
          ^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/serialized_dag.py", line 231, in dag
    return SerializedDAG.from_dict(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/serialization/serialized_objects.py", line 1443, in from_dict
    return cls.deserialize_dag(serialized_obj["dag"])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/serialization/serialized_objects.py", line 1362, in deserialize_dag
    v = cls._deserialize_timezone(v)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/timezone.py", line 294, in parse_timezone
    return pendulum.timezone(name) # type: ignore[operator]
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/pendulum/__init__.py", line 86, in timezone
    return Timezone(name)
           ^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/pendulum/tz/timezone.py", line 67, in __new__
    raise InvalidTimezone(key)
pendulum.tz.exceptions.InvalidTimezone: FixedTimezone(28800, name="+08:00")
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/zoneinfo/_common.py", line 12, in load_tzdata
    return resources.files(package_name).joinpath(resource_name).open("rb")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/pathlib.py", line 1044, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/airflow/.local/lib/python3.11/site-packages/tzdata/zoneinfo/FixedTimezone(28800, name="+08:00")'
```
</details>
This issue was introduced in Airflow 2.8.