[
https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=316344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316344
]
ASF GitHub Bot logged work on BEAM-7746:
----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/19 23:17
Start Date: 22/Sep/19 23:17
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9056: [BEAM-7746] Add
python type hints
URL: https://github.com/apache/beam/pull/9056#issuecomment-533752642
This PR is getting close. It represents an enormous amount of effort, and
spans a substantial part of the code base, so I'd like to start progressing
towards getting this merged. Rebasing against master has been pretty painless
so far, but I'm afraid that could change soon.
I've broken the PR up into bite-sized commits to help tell a story about
each set of changes. I've also dropped the more complex changes -- the mypy
plugin and generics -- and I'll deal with those in a future PR.
Other than questions and comments you might have about the content of the
PR, the main issues that I need input on are primarily about style.
## Line length
It's very difficult to keep type comments to 80 characters. Not only is
there more information to describe, the type comments themselves cannot span
more than one line. This will improve when we get to python 3.6 and
annotations are an official part of the syntax, since they can be defined over
multiple lines just like normal python objects (which they are).
There are a handful of practices that will minimize our line length, but
even in concert they won't work 100% of the time. Here are the main ones:
- Use `from typing import Foo` vs `import typing`. This is the single most
impactful thing we can do. It also greatly improves legibility. Compare these
two options:
- `typing.Optional[typing.Tuple[typing.Dict[str, str], float]]`
- `Optional[Tuple[Dict[str, str], float]]`
- Use type aliases. e.g. `AwesomeType = Optional[Tuple[Dict[str, str],
float]]`. I prefer to use this sparingly, only when there's a complex type
shared in many places. Here's why:
- quite often we can use duck-typing to reduce the requirement for certain
functions. e.g. there might be a function where `Optional[Tuple[Mapping[str,
str], typing.SupportsFloat]]` would do instead of `AwesomeType`.
- I don't like to have to constantly refer to another location to see the
type
- Change the way that we style functions so that they provide more room for
annotations. e.g. consider these options:
```python
def really_long_function_name(arg1, # type: Optional[Tuple[Dict[str,
str], float]]
arg2, # type: int
):
# type: (...) -> Tuple[str, float]
code
```
```python
def really_long_function_name(
arg1, # type: Optional[Tuple[Dict[str, str], float]]
arg2, # type: int
):
# type: (...) -> Tuple[str, float]
code
```
The beam code seems to favor the former over the latter, though I see
both present. We should decide what our policy will be. In this PR, I've
determined the style on a case-by-case basis, mostly favoring the former.
- Stop using nested classes. This is not a common practice in python anyway.
In addition to these coding practices, we can also add `# type:` to the
regex that excludes long-lines in pylint. This is a bit of an easy out, since
we'll have to deal with these issues once we get to python 3.6-only code, but
at that point we'll have more options at our disposal for managing line length,
as discussed above. I favor this solution, however, if you prefer we can make
it our practice to use `# pylint: disable=line-too-long` *after* the type
comment.
It's a larger conversation, but it might be worth discussing increasing the
line length. Many type-annotated projects have increased their line length to
99 or more characters. This is a big change, that would involve a lot of
debate.
## Unused module imports
Pylint is not able to properly track type annotations used within type
comments (which is the majority), and so generates spurious errors about unused
imports for most of the typing classes. Newer versions of pylint can track
annotations within comments, but only for variable annotations and not for
function annotations, so it's not a complete solution. Even if we think there
is a benefit to a partial solution, it will take some work to get to the newer
version of pylint because it's python3-only.
The solution I'm proposing for now is to simply ignore the problem, by using
the following pattern when importing types:
```python
# pylint: disable=unused-import
from typing import TYPE_CHECKING
from typing import Any
from typing import Callable
from typing import Container
from typing import DefaultDict
from typing import Dict
from typing import Iterable
from typing import Iterator
from typing import List
# pylint: enable=unused-import
```
That will leave it up to developers to get right for now, and when we get to
a pure python3 code-base we can fix anything that's gone astray.
---
Let me know what you think. I'm excited to get this into the code base. If
you open this branch up in an IDE that is properly setup for type annotations,
you'll see that the annotations are inspectable and greatly improve your
ability to comprehend and traverse the code. They take a bit to get used to,
but even the detractors on my team are now converts. Overall, adding typing
to the Beam code base was actually very straight-forward, relatively speaking,
because the code is already well structured and implicitly precise about the
types that are expected (in other words, I discovered there are very few
`Union` types required). That said, I found a number of docstrings that were
wrong about the types expected (I still need to go back and correct a bunch),
so that's the beauty of type annotations: if you get it wrong you'll get an
error!
edit 1: added a note about using a filter to exclude line-too-long error
for lines with `# type: `
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 316344)
Time Spent: 7h 50m (was: 7h 40m)
> Add type hints to python code
> -----------------------------
>
> Key: BEAM-7746
> URL: https://issues.apache.org/jira/browse/BEAM-7746
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Chad Dombrova
> Assignee: Chad Dombrova
> Priority: Major
> Time Spent: 7h 50m
> Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484
> type hints so that I can clearly see what types are required, get completion
> in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here: [https://github.com/apache/beam/pull/9056]
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)