[ https://issues.apache.org/jira/browse/BEAM-7984?focusedWorklogId=306764&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306764 ]
ASF GitHub Bot logged work on BEAM-7984:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Sep/19 22:37
            Start Date: 04/Sep/19 22:37
    Worklog Time Spent: 10m
      Work Description: chadrik commented on issue #9344: [BEAM-7984] The coder returned for typehints.List should be IterableCoder
URL: https://github.com/apache/beam/pull/9344#issuecomment-528118514

What's happened here is that this change has exposed a difference in the inference code between python2 and python3.

Here's the function that's being analyzed differently:

```python
def rotate_key(element):
  """Returns a new key-value pair of the same size but with a different key."""
  (key, value) = element
  return key[-1:] + key[:-1], value
```

python3:

```python
In [1]: from apache_beam.typehints.trivial_inference import infer_return_type
In [2]: from apache_beam.testing.synthetic_pipeline import rotate_key
In [3]: from apache_beam.typehints import Any
In [4]: infer_return_type(rotate_key, [Any])
Out[4]: Tuple[List[Any], Any]
```

python2:

```python
In [1]: from apache_beam.typehints.trivial_inference import infer_return_type
In [2]: from apache_beam.testing.synthetic_pipeline import rotate_key
In [3]: from apache_beam.typehints import Any
In [4]: infer_return_type(rotate_key, [Any])
Out[4]: Any
```

The _actual_ return type of `rotate_key` is `Tuple[bytes, bytes]`.

Previously, `List[Any]` resulted in the `FastPrimitiveCoder`, so even though the hint was wrong in python3, the data was properly round-tripped. After this change, the coder became `IterableCoder[FastPrimitiveCoder]` and so it failed, because the data was in fact `bytes`.

Personally, it seems like an overreach to infer that `key[-1:] + key[:-1]` is a `List`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 306764)
    Time Spent: 3h  (was: 2h 50m)

> [python] The coder returned for typehints.List should be IterableCoder
> ----------------------------------------------------------------------
>
>                 Key: BEAM-7984
>                 URL: https://issues.apache.org/jira/browse/BEAM-7984
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: Major
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> IterableCoder encodes a list and decodes to list, but
> typecoders.registry.get_coder(typehints.List[bytes]) returns a
> FastPrimitiveCoder. I don't see any reason why this would be advantageous.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
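
The point in the comment above about `key[-1:] + key[:-1]` can be checked with a small, Beam-free sketch under Python 3. `rotate_key` is copied from the comment. `toy_iterable_round_trip` is a hypothetical stand-in that rebuilds a value element by element. It is not Beam's `IterableCoder` and only illustrates why treating `bytes` as a `List` may not round-trip.

```python
def rotate_key(element):
    """Returns a new key-value pair of the same size but with a different key."""
    (key, value) = element
    return key[-1:] + key[:-1], value


def toy_iterable_round_trip(value):
    """Hypothetical element-wise round trip (NOT Beam's IterableCoder).

    Rebuilds the value as a list of its elements, which is roughly what a
    coder that assumes "this is a list" would hand back after decoding.
    """
    return [item for item in value]


# Slicing and concatenation preserve the sequence type of the key, so the
# result is a list only when the input key is itself a list.
bytes_result = rotate_key((b"abc", b"v"))
str_result = rotate_key(("abc", "v"))
list_result = rotate_key(([1, 2, 3], "v"))

for result in (bytes_result, str_result, list_result):
    print(type(result[0]).__name__, result)

# A real list survives the element-wise round trip unchanged...
print(toy_iterable_round_trip(list_result[0]) == list_result[0])

# ...but bytes come back as a list of ints under Python 3, not as bytes.
print(toy_iterable_round_trip(bytes_result[0]) == bytes_result[0])
```

The first loop shows that the key keeps its own type (`bytes`, `str`, or `list`), which is consistent with the comment's view that inferring `List` from this expression is an overreach.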