[
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=276176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276176
]
ASF GitHub Bot logged work on BEAM-7060:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Jul/19 22:31
Start Date: 12/Jul/19 22:31
Worklog Time Spent: 10m
Work Description: chadrik commented on pull request #9056: [BEAM-7060]
Add python type hints
URL: https://github.com/apache/beam/pull/9056
This is an early look at type annotations for the beam source. It is _not_
attempting to address the issue of porting beam from its own internal typehints
module to the official typing module, so perhaps I should open a new ticket just
for type hinting the source.
There's a lot left to do, including getting this hooked up to tests, but I
wanted to get it out there so that A) I can get feedback before I get too far
along, and B) in case anyone wants to join forces with me.
Some points for discussion / debate:
## type annotation style
In python 2.7, we're limited to type comments, and it can be quite difficult
to keep them under the 80 character limit, since they're not very flexible. I'm
following the lead of code-bases like mypy and using the following style for
functions with more than one or two arguments:
```python
def __init__(self,
             evaluation_context,  # type: EvaluationContext
             applied_ptransform,  # type: AppliedPTransform
             input_committed_bundle,
             side_inputs
            ):
```
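As a fuller sketch of this style, the snippet below pairs per-argument type comments with the `# type: (...) -> None` return comment described in PEP 484 for Python 2.7-compatible code. `EvaluationContext` and `AppliedPTransform` are stand-in stubs rather than the real Beam classes, `Evaluator` is a hypothetical name, and the types given to `input_committed_bundle` and `side_inputs` are assumptions made only for illustration.

```python
import typing


class EvaluationContext(object):
    """Stand-in stub; not the real Beam class."""


class AppliedPTransform(object):
    """Stand-in stub; not the real Beam class."""


class Evaluator(object):
    """Hypothetical class used only to demonstrate the comment style."""

    def __init__(self,
                 evaluation_context,  # type: EvaluationContext
                 applied_ptransform,  # type: AppliedPTransform
                 input_committed_bundle,  # type: typing.Optional[object]
                 side_inputs  # type: typing.Sequence[object]
                ):
        # type: (...) -> None
        # The "(...)" form tells the checker to read the argument types
        # from the per-argument comments above.
        self.evaluation_context = evaluation_context
        self.applied_ptransform = applied_ptransform
        self.input_committed_bundle = input_committed_bundle
        self.side_inputs = list(side_inputs)


evaluator = Evaluator(EvaluationContext(), AppliedPTransform(), None, [])
```

Putting one argument per line keeps each comment short, at the cost of longer signatures.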
## typing module import
Some projects import only the typing objects needed:
```python
from typing import Iterable, Tuple
```
Since beam has its own internal `typehints` module, I'm importing the
`typing` module itself and using qualified names to avoid conflicts. However,
this has the downside of making some type comments a lot longer, e.g.
`typing.Union[typing.Iterable[str], str]`, which may make it impossible to stay
below 80 characters in some cases.
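One possible way to keep qualified names while staying under the line limit is a module-level alias. This is only a sketch of that idea, not something the PR does; `Paths` and `as_list` are hypothetical names introduced for the example.

```python
import typing

# Hypothetical module-level alias so the type comment below stays short.
Paths = typing.Union[typing.Iterable[str], str]


def as_list(paths):
    # type: (Paths) -> typing.List[str]
    """Return `paths` as a list, wrapping a lone string in a list."""
    if isinstance(paths, str):
        return [paths]
    return list(paths)
```

The alias trades a little indirection for shorter comments; whether that is worth it is a style call.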
## type checking pipeline construction
I'm writing a mypy plugin to let us properly type check the creation of
pipelines, but due to a
[bug/weakness](https://github.com/python/mypy/issues/6933) it's harder than
expected. I'm making progress but it's not done yet.
## str vs unicode
Currently I'm glossing over this problem. The correct way to do this is to
use `typing.Text` and `typing.AnyStr`, but it's a huge pain in the ass. In my
own projects, I just annotate everything as `str` and handle the few cases
where mypy complains, since the problem will get a lot simpler when we switch
to python3, and `bytes` becomes the exception.
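For reference, here is a minimal sketch of what the `typing.Text` / `typing.AnyStr` approach looks like with type comments. The functions `to_text` and `join_parts` are hypothetical and exist only to illustrate the two annotations.

```python
import typing


def to_text(value, encoding="utf-8"):
    # type: (typing.Union[bytes, typing.Text], str) -> typing.Text
    """Decode bytes to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_parts(sep, parts):
    # type: (typing.AnyStr, typing.Iterable[typing.AnyStr]) -> typing.AnyStr
    """Join parts; AnyStr requires sep, parts, and result to share one type."""
    return sep.join(parts)
```

`typing.Text` is `unicode` on Python 2 and `str` on Python 3, while `typing.AnyStr` is a type variable that lets a checker reject calls mixing bytes and text.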
---
I'll probably think of more later, but I'm out of time and want to get this
out the door!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 276176)
Time Spent: 10m
Remaining Estimate: 0h
> Design Py3-compatible typehints annotation support in Beam 3.
> -------------------------------------------------------------
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Udi Meiri
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Existing [Typehints implementation in
> Beam|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/]
> heavily relies on internal details of the CPython implementation, and some of
> the assumptions of this implementation broke as of Python 3.6, see for
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245&view=detail]
> lists several specific typehints-related breakages, prefixed with "TypeHints
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support the in-house implementation, which at this point is
> stale code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting
> type-annotations, like Pytype, Mypy, PyAnnotate.
> WRT this decision we also need to plan on immediate next steps to unblock
> adoption of Beam for Python 3.6+ users. One potential option may be to have
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)