I certainly like the future-annotations and have idly thought about how
to do this myself.
The only downside I can think of is that it might makes cherry picks
back to the 2.3 release branch more complicated in some cases (mostly
because we generally try to avoid backporting things that aren't bug
fixes.)
So maybe we merge this in the prep stage for 2.4 instead of right now?
-ash
On Mon, Jul 4 2022 at 00:12:41 +0200, Jarek Potiuk <ja...@potiuk.com>
wrote:
Hey everyone,
*TL;DR;* I wanted to discuss that maybe (not really sure- but I
wanted to check with others) we could migrate the whole codebase of
airflow to support PEP 563 - postponed evaluation of annotations. PEP
563 is nicely supported in our case (Python 3.7+) by `from __future__
import annotations` and pyupgrade automated pre-commit does the bulk
of the work of converting our code automatically - both current and
future.
*Is there a POC?*
Yeah. I have a PR that implements it. I actually implemented it today
- literally in a few hours, in-between of other things.
* On one hand it is huge= 3625 files (+16,459 −11,989 lines)
* On the other hand it really only **really** changes type
annotations (so no real code impact) and those changes are
semi-automated. I literally got it "generated" in a couple of hours -
and I can easily either rebase or reapply it on top of new changes.
There are some spelling mistakes to fix and few tests to fix, but
this should be easy.
PR here: <https://github.com/apache/airflow/pull/24816>
*Why ?*
Our code gets more and more type-annotated and it's cool. But it does
not come for free (for now). The types for functions/classes are
evaluated at import time and it takes some computational effort. Also
we have a number of cases where we have "string" forward references.
This is ugly and does not get as good support for auto-completions
and IDEs in general.
PEP 563 solves both problems. There is an alternative PEP 649 that
could be chosen in the future, but I think we can get some benefits
now without waiting for a decision (which even if it comes, will be
only implemented in 3.11 or maybe in 3.12 and in the meantime we
could get some benefits of PEP 563).
1) Runtime performance is improved in case we do not use type-hints
at runtime (we don't) - static type checkers work as usual, but at
runtime those annotations are placed in _annotations_ dictionary and
evaluated only when runtime type-hint checking is used.
2) String forward references are not needed.
No longer `-> 'Type' when you define the type a few lines above,
simple `-> Type` will do.
Those two on its own would be worth it IMHO, but there is more.
3) the new syntax of types from Python 3.10 is available and it's
quite a bit nicer when it comes to Optional usage. Example:
CLIENT_AUTH: Optional[Union[Tuple[str, str], Any]] = None
turns into:
CLIENT_AUTH: tuple[str, str] | Any | None = None
4) Interestingly enough - we do not need to do anything to use the
new syntax. The pyupgrade which we have in our pre-commit will
automatically convert the "old way" of using Optionals to the new one
- as soon as you add `from __future__ import annotations`. This also
means that if anyone uses the "old way" in the future, pre-commit
will catch it and migrate - this is actually a cool learning
experience for all contributors I think as they can learn and get usd
to Python 3.10 syntax today. This conversion is done in my PR, you
can take a look.
5) It is overwhelmingly mostly backwards-compatible for Python 3.7
except if you used runtime type annotation checking and analysis
(which we did not) except inferring multiple outputs in decorators
(literally single usage in the entire codebase). And the way we use
it seems to be able to continue to work after we add some small fixes
- I think just our tests need to be fixed (I think 9 of the failing
tests to fix are falling into this camp).
*Caveats*
Almost none since we are already Python 3.7+. Basically each of our
python files must start with:
from __future__ import annotations
Performance gets improved for imports. The PEP
<https://peps.python.org/pep-0563/> claims that the overhead for the
runtime typing information needed is very small (memory wise) and
since we are almost not using runtime type-hint evaluation memory
usage will be lower because the type hints will not be evaluated at
all during runtime.
And similarly, like we do not care about licenses any more (because
they are added automatically) by pre-commit, it could be the same
with the __future__ import.
PEP 649:
The caveat is that <https://peps.python.org/pep-0649/> - alternative
approach to deferring annotation is still being discussed and PEP 563
is not confirmed.. PEP 649 is the reason why PEP 563 wasr removed
from Python 3.10 - because there is no decision which one will go
forward in the future. Discussion here:
<https://discuss.python.org/t/type-annotations-pep-649-and-pep-563/11363>
However, one could think we are not really affected too much and even
if we have to change for 3.11 it will be as simple. We do not use
some heavy runtime type-hinting checks, and PEP 0649 `from __future__
import co_annotations' seems to be an easy switch.
*How complex is it to migrate*
It is simple. We Just merge
<https://github.com/apache/airflow/pull/24816> and maybe add a
pre-commit to force the __future__ import in all our files. That's it.
I did it in a couple of hours and I think it is all that we need -
the PR I did could likely be merged today and it will be complete.
Adding the "__future__" import is very easily automate-able and
`pyupgrade` does all the rest. There is one exclusion that we have to
make with pyupgrade (gcs.py has "list()" method that interferes with
list Type).
There were also a few type fixes that had to be done - but nothing
major.
That's about it. No drama. Some tests/docs need fixing but looking at
the changes - very few. We can add pre-commits guarding it.
*Options*
In the PR I added the 'from __future__ import annotations` to all
files (semi-automatically). But it does not have to be like that. One
can think that this introduces some "noise". There are plenty of
files that do not need it currently.
On the other hand there is a nice "property" of having it in all the
files:
1) We can even add it in the pre-commit to be injected automatically
(should be simple).
2) Pyupgrade will automatically upgrade any future "forward" refs and
optionals for us
3) we can keep consistent style of typing everywhere and remove
__futures__ when we get to Python 3.10
But we can choose to do half-the way - for example do not add
__future__ in empty __init__.py files. Or something similar.
I am not 100% convinced myself, but maybe
WDYT? Anyone has experience with it? bad/good one?
J.