[
https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=333061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333061
]
ASF GitHub Bot logged work on BEAM-7746:
----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Oct/19 04:14
Start Date: 24/Oct/19 04:14
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9056: [BEAM-7746] Add
python type hints
URL: https://github.com/apache/beam/pull/9056#issuecomment-545735730
OK, I think I have satisfactory answers to all of the problems that I
encountered, except for the four issues below, which I need input on.
@robertwb, can you help me find the answers to these, please?
---
```
apache_beam/io/iobase.py:924: error: "SourceBase" has no attribute "coder" [attr-defined]
```
I can't find any subclasses of `SourceBase` that have a `coder` attribute.
Is this safe to remove, or is this a Dataflow thing?
---
```
apache_beam/runners/worker/statesampler_slow.py:77: error: "StateSampler" has no attribute "_states_by_name" [attr-defined]
```
`statesampler_slow.StateSampler` does not have a `_states_by_name` attribute,
but `statesampler.StateSampler` does. I could add this attribute to
`statesampler_slow.StateSampler`, but I don't think it would be used. The more
straightforward solution may be to edit
`statesampler_slow.StateSampler.reset()` to do nothing. Right now, I think it
would raise an error if it ran.
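A minimal sketch of the two options, using hypothetical stand-in classes rather than the real `StateSampler` (the class names and the integer counter values are assumptions made only for illustration):

```python
from typing import Dict


class SamplerWithStates(object):
  """Option 1 (hypothetical): declare the attribute so reset() can use it."""

  def __init__(self):
    # type: () -> None
    # Placeholder mapping; the real attribute's value type is not assumed here.
    self._states_by_name = {}  # type: Dict[str, int]

  def reset(self):
    # type: () -> None
    # Zero every counter; with the attribute declared, mypy accepts this.
    for name in self._states_by_name:
      self._states_by_name[name] = 0


class SamplerNoOpReset(object):
  """Option 2 (hypothetical): reset() does nothing, so no attribute is needed."""

  def reset(self):
    # type: () -> None
    pass
```

Either form would silence the `attr-defined` error; the second avoids carrying state that is never read.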
---
```
apache_beam/runners/portability/fn_api_runner_transforms.py:280: error: Invalid index type "Optional[str]" for "MutableMapping[str, Environment]"; expected type "str" [index]
```
It's unclear to me whether `Stage.environment` is meant to be `str` or
`Optional[str]`. Given the way it's initialized, it _could_ be `Optional[str]`:
```python
class Stage(object):
  """A set of Transforms that can be sent to the worker for processing."""
  def __init__(self,
               name,  # type: str
               transforms,  # type: List[beam_runner_api_pb2.PTransform]
               downstream_side_inputs=None,  # type: Optional[FrozenSet[str]]
               must_follow=frozenset(),  # type: FrozenSet[Stage]
               parent=None,  # type: Optional[Stage]
               environment=None,  # type: Optional[str]
               forced_root=False
              ):
    ...
    if environment is None:
      environment = functools.reduce(
          self._merge_environments,
          (self._extract_environment(t) for t in transforms))
    self.environment = environment
    ...
  @staticmethod
  def _extract_environment(transform):
    # type: (beam_runner_api_pb2.PTransform) -> Optional[str]
    if transform.spec.urn in PAR_DO_URNS:
      pardo_payload = proto_utils.parse_Bytes(
          transform.spec.payload, beam_runner_api_pb2.ParDoPayload)
      return pardo_payload.do_fn.environment_id
    elif transform.spec.urn in COMBINE_URNS:
      combine_payload = proto_utils.parse_Bytes(
          transform.spec.payload, beam_runner_api_pb2.CombinePayload)
      return combine_payload.combine_fn.environment_id
    else:
      return None
```
In practice, will there always be at least one ParDo or Combine per
stage? If so, we should assert that `self.environment is not None` in
`Stage.__init__`.
Alternatively, we could assert that `self.environment is not None` just before
this call in `executable_stage_transform`.
The bottom line is that the code currently does not guarantee that
`self.environment` is not None at this point, and if it is None, it will be an
error.
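To illustrate the assertion approach, here is a minimal standalone sketch, not the actual Beam code: `Stage` is reduced to the one relevant attribute, and `lookup_environment` and the `environments` dict are hypothetical names standing in for the indexing done in `executable_stage_transform`. The `assert` narrows `Optional[str]` to `str` for mypy and fails with a clear message at runtime.

```python
from typing import Dict, Optional


class Stage(object):
  """Simplified stand-in for the real Stage; only `environment` is kept."""

  def __init__(self, name, environment=None):
    # type: (str, Optional[str]) -> None
    self.name = name
    self.environment = environment


def lookup_environment(stage, environments):
  # type: (Stage, Dict[str, str]) -> str
  """Hypothetical helper mirroring the indexing that mypy complains about."""
  # Without this assert, mypy reports an invalid index type "Optional[str]".
  assert stage.environment is not None, (
      'Stage %r has no environment' % stage.name)
  return environments[stage.environment]
```

Placing the same assertion in `Stage.__init__` instead would catch the problem earlier, but only if every stage is expected to have an environment.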
---
```
apache_beam/runners/portability/fn_api_runner.py:933: error: "ParallelBundleManager" has no attribute "_skip_registration" [attr-defined]
```
I can't find anywhere in the code that refers to `_skip_registration`. Is
this safe to remove?
Issue Time Tracking
-------------------
Worklog Id: (was: 333061)
Time Spent: 9.5h (was: 9h 20m)
> Add type hints to python code
> -----------------------------
>
> Key: BEAM-7746
> URL: https://issues.apache.org/jira/browse/BEAM-7746
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Chad Dombrova
> Assignee: Chad Dombrova
> Priority: Major
> Time Spent: 9.5h
> Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484
> type hints so that I can clearly see what types are required, get completion
> in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here: [https://github.com/apache/beam/pull/9056]
>
>