Zach Moshe created BEAM-5805:
--------------------------------

             Summary: _MaterializedDoOutputsTuple doesn't support __getitem__ 
by integer values
                 Key: BEAM-5805
                 URL: https://issues.apache.org/jira/browse/BEAM-5805
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
            Reporter: Zach Moshe
            Assignee: Ahmet Altay


Consider the following pipeline:

{{with beam.Pipeline(..) as p:}}
{{  res = p | ... | beam.Partition(..)}}

When res is an `_apache_beam.pvalue.DoOutputsTuple_`, it can be indexed with 
either `res[0]` or `res["0"]`. However, when res is an 
`_apache_beam.transforms.ptransform._MaterializedDoOutputsTuple_`, integer 
access isn't supported and the tags must be passed as strings, which is not 
very intuitive considering that `_partition_fn_` returns integers.

 

I'm not familiar with Beam internals, but I briefly looked into the code and saw 
that _MaterializedDoOutputsTuple overrides the __getitem__() of DoOutputsTuple 
and doesn't include the explicit cast that the base class performs 
([https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L225]).

It also looks like [~gildea] already left a related comment there.
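
The suspected mechanism can be sketched without Beam itself. The classes below are hypothetical stand-ins (TaggedOutputs, MaterializedOutputs and NormalizingMaterializedOutputs are invented names, not Beam's actual classes or code), and the int-to-string cast in the base class is an assumption based on the behavior described above. The sketch shows a base __getitem__ that normalizes integer tags, a subclass override that drops that normalization, and one possible way an override could restore it.

```python
class TaggedOutputs:
    """Hypothetical stand-in for a tuple of tagged outputs (not Beam code)."""

    def __init__(self, outputs):
        # Outputs are stored under string tags only.
        self._outputs = dict(outputs)

    def __getitem__(self, tag):
        # Assumed behavior: integer tags are cast to strings before lookup.
        if isinstance(tag, int):
            tag = str(tag)
        return self._outputs[tag]


class MaterializedOutputs(TaggedOutputs):
    """Hypothetical subclass whose override omits the cast."""

    def __getitem__(self, tag):
        # The tag is used as-is, so an int never matches a string key.
        return list(self._outputs[tag])


class NormalizingMaterializedOutputs(MaterializedOutputs):
    """One possible fix: normalize the tag, then delegate to the parent."""

    def __getitem__(self, tag):
        if isinstance(tag, int):
            tag = str(tag)
        return super().__getitem__(tag)


partitions = {"0": range(0, 3), "1": range(3, 5)}
deferred = TaggedOutputs(partitions)
materialized = MaterializedOutputs(partitions)
patched = NormalizingMaterializedOutputs(partitions)

try:
    materialized[0]
    int_lookup_failed = False
except KeyError:
    # Integer access fails because only the string tag "0" exists.
    int_lookup_failed = True
```

With these stand-ins, deferred[0] and deferred["0"] resolve to the same output, materialized[0] raises a KeyError, and patched[1] behaves like patched["1"]. Whether Beam should normalize tags this way or require string tags everywhere is exactly the open question of this issue.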

 

Is this on purpose? Can I expect an access-by-int API for Partition() results 
regardless of whether they were materialized?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
