[
https://issues.apache.org/jira/browse/BEAM-7984?focusedWorklogId=306760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306760
]
ASF GitHub Bot logged work on BEAM-7984:
----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Sep/19 22:33
Start Date: 04/Sep/19 22:33
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9344: [BEAM-7984] The coder
returned for typehints.List should be IterableCoder
URL: https://github.com/apache/beam/pull/9344#issuecomment-528118514
What's happened here is that this change has exposed a difference in the
inference code between python2 and python3.
Here's the function that's being analyzed differently:
```python
def rotate_key(element):
  """Returns a new key-value pair of the same size but with a different key."""
  (key, value) = element
  return key[-1:] + key[:-1], value
```
python3:
```python
In [1]: from apache_beam.typehints.trivial_inference import infer_return_type
In [2]: from apache_beam.testing.synthetic_pipeline import rotate_key
In [3]: from apache_beam.typehints import Any
In [4]: infer_return_type(rotate_key, [Any])
Out[4]: Tuple[List[Any], Any]
```
python2:
```python
In [1]: from apache_beam.typehints.trivial_inference import infer_return_type
In [2]: from apache_beam.testing.synthetic_pipeline import rotate_key
In [3]: from apache_beam.typehints import Any
In [4]: infer_return_type(rotate_key, [Any])
Out[4]: Any
```
Previously, `List[Any]` resulted in the `FastPrimitiveCoder`, so even though
the hint was wrong, the data was properly round-tripped. After this change,
the coder became `IterableCoder[FastPrimitiveCoder]` and so it failed, because
the data was in fact `bytes` (the _actual_ return type of `rotate_key` is
`Tuple[bytes, bytes]`).
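To make the round-trip problem concrete, here is a toy model in plain Python. Both functions are hypothetical stand-ins, not Beam's coders; the only behaviour assumed is the one stated in the issue description, namely that an iterable coder decodes to a `list`.

```python
def toy_primitive_roundtrip(value):
  """Hypothetical stand-in for a coder that hands back the value unchanged."""
  return value


def toy_iterable_roundtrip(value):
  """Hypothetical stand-in for a coder that walks the elements and decodes to a list."""
  return list(value)


key = b"abc"
# A type-preserving coder round-trips the bytes key unchanged.
print(toy_primitive_roundtrip(key) == key)
# A list-decoding coder returns a list of elements, which never equals the bytes key.
print(toy_iterable_roundtrip(key) == key)
```

Under this model a wrong `List` hint is harmless with the first kind of coder and corrupts `bytes` data with the second.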
Personally, I think it is an overreach to infer that `key[-1:] + key[:-1]`
is a `List`.
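A quick check in plain Python (no Beam required) shows why: slicing and concatenation preserve the type of the sequence they are applied to, so the expression alone does not pin the key down to a list. The sample keys and the `b"v"` value below are made up for illustration.

```python
def rotate_key(element):
  """Same body as the function under analysis."""
  (key, value) = element
  return key[-1:] + key[:-1], value


# Each sequence type comes back as the same type after rotation.
for key in (b"abc", "abc", [1, 2, 3], (1, 2, 3)):
  rotated, _ = rotate_key((key, b"v"))
  print(type(key).__name__, "->", type(rotated).__name__)
```

Only one of the four inputs yields a list, and the synthetic pipeline in question passes `bytes`.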
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 306760)
Time Spent: 2h 50m (was: 2h 40m)
> [python] The coder returned for typehints.List should be IterableCoder
> ----------------------------------------------------------------------
>
> Key: BEAM-7984
> URL: https://issues.apache.org/jira/browse/BEAM-7984
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Chad Dombrova
> Assignee: Chad Dombrova
> Priority: Major
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> IterableCoder encodes a list and decodes to list, but
> typecoders.registry.get_coder(typehints.List[bytes]) returns a
> FastPrimitiveCoder. I don't see any reason why this would be advantageous.