This discussion is about a known problem with pendulum, how we could deal
with it, and how we (as a community) might help its author.

The library is maintained mostly by a single author, Sébastien Eustace (
https://github.com/sdispater), and it seems we have run into the situation
described in xkcd #2347 (
https://imgs.xkcd.com/comics/dependency.png). To be honest, this is nothing
new: when a library is maintained mainly by one author, there is always a
risk that it will no longer be supported or will be abandoned. And
considering that pendulum provides core functionality in Airflow, this
could have a dramatic impact in the future.

Pendulum is a really nice library that helps a lot of developers work
with dates/datetimes. However, there is one major problem: the last release
of this library happened more than 3 years ago (
https://pypi.org/project/pendulum/#history), around the time when Airflow
1.10.11 was released.

Fortunately, the project is not abandoned, and commits are added to the
master branch on a regular basis. However, these commits are not included
in any final release, and that's why some things related to datetime don't
work as expected in Airflow. Here is a list of the issues known to me that
affect Airflow:

*Memory Leak on parse*:
- https://github.com/sdispater/pendulum/issues/720, this one was fixed 2
years ago but the fix is not available in a release yet
(https://github.com/sdispater/pendulum/pull/563).
Since we parse dates in the Airflow codebase (datetime parameters and
datetimes in logs), this could be a reason for memory leaks in Airflow:
- https://github.com/apache/airflow/discussions/24694
- https://github.com/apache/airflow/discussions/28597

*Incorrect time zones*: known issues that should already be fixed in the
master branch
- https://github.com/sdispater/pendulum/issues/700, Mexico does not use DST
anymore
- https://github.com/sdispater/pendulum/issues/706, Egypt reinstated DST

We added a clarification in https://github.com/apache/airflow/pull/30467,
however it seems there is no other way than patching Pendulum
right now.

All these issues should be solved as soon as pendulum 3 is released. The
currently announced estimate is end of September / beginning of October:
https://github.com/sdispater/pendulum/issues/600#issuecomment-1711299677

So in theory we will have a fixed version of pendulum soon. It might
break something in Airflow, but from my point of view that is better than
the current status.

However, the release of pendulum might be postponed, so it may be better
to have a backup plan. What could we do in this case?

Maybe we should start to use zoneinfo.ZoneInfo instead of pendulum
datetime? https://github.com/apache/airflow/issues/19450
Pros:
- stdlib (Python 3.9+)
- In pendulum 3.0, Timezone is based on zoneinfo.ZoneInfo

Cons:
- The current serialization model can't deal with backport packages. E.g.
time zones that are serialized with backports.zoneinfo can't be
deserialized with zoneinfo
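
To illustrate the idea, here is a minimal sketch of serializing a time zone
by its IANA key instead of by its Python class, which is one possible way
around the backport problem. The helper names encode_timezone and
decode_timezone are hypothetical and are not part of Airflow or pendulum.
The sketch assumes Python 3.9+ and that IANA time zone data is available
(from the system or from the tzdata package).

```python
from datetime import datetime, timedelta, tzinfo
from zoneinfo import ZoneInfo  # stdlib since Python 3.9


def encode_timezone(tz: tzinfo) -> str:
    """Hypothetical helper: serialize a time zone by its IANA key."""
    # ZoneInfo exposes the IANA name as .key; other tzinfo classes may not.
    key = getattr(tz, "key", None)
    if key is None:
        raise ValueError(f"time zone has no IANA key: {tz!r}")
    return key


def decode_timezone(key: str) -> tzinfo:
    """Hypothetical helper: rebuild the time zone from its IANA key."""
    return ZoneInfo(key)


paris = decode_timezone("Europe/Paris")
start = datetime(2023, 7, 1, 12, 0, tzinfo=paris)

# Round trip: only the string "Europe/Paris" would need to be stored.
restored = decode_timezone(encode_timezone(start.tzinfo))
summer_offset = start.replace(tzinfo=restored).utcoffset()
```

Because only the key string is stored, the reading side does not need to
know which implementation produced it.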

Maybe we should replace datetime parsing with another solution. Does anyone
know a good replacement?
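
One candidate to evaluate, for ISO 8601 input only, is the stdlib
datetime.fromisoformat. The sketch below is an illustration and not a
drop-in replacement for pendulum's parser: parse_iso is a hypothetical
helper, and treating naive values as UTC is an assumption made for this
example. Note that before Python 3.11 fromisoformat accepts only the
formats produced by datetime.isoformat(), so for example a trailing "Z" is
rejected.

```python
from datetime import datetime, timedelta, timezone


def parse_iso(value: str) -> datetime:
    """Hypothetical helper: parse an ISO 8601 string into an aware datetime."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption for this sketch: naive input is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


with_offset = parse_iso("2023-09-15T10:30:00+02:00")
naive_input = parse_iso("2023-09-15T10:30:00")
```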

Maybe someone from the Airflow community could offer their help with
maintaining the library:
- https://github.com/sdispater/pendulum/issues/590

Maybe we should get rid of pendulum altogether, as a last resort.
I can't imagine how we could do that, because a lot of stuff depends on
pendulum and removing it would be a breaking change.

----
Best Wishes
*Andrey Anshin*
