[ 
https://issues.apache.org/jira/browse/BEAM-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-7026:
------------------------------------

    Assignee:     (was: niklas Hansson)

> Python SDK: Unable to obtain the PCollection for output tags which are not 
> consumed by a downstream step.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-7026
>                 URL: https://issues.apache.org/jira/browse/BEAM-7026
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-harness
>            Reporter: Alex Amato
>            Priority: Major
>
> I noticed that we are not able to convert the output tag + transform to the 
> PCollection name for metrics (element count/mean byte count) if the 
> PCollections for the output tags are not consumed by a downstream step.
> This isn't critical because (1) arguably there is no PCollection at all, and 
> (2) PCollections that are output but not consumed are not critical to count 
> metrics on, as those can be optimized away entirely (no need to do any work, 
> collect metrics, etc. for an unconsumed PCollection).
> However, we are able to count this, but we are unable to assign a PCollection 
> name to it, because in this case the bundle descriptor contains no 
> information about that output tag. The alternative fix is to make sure that 
> it's always available, even if not consumed.
> Pablo and I looked into this a bit, and he believed it would be possible in 
> pvalue.py's DoOutputsTuple class. This fix would require calling __getitem__ 
> on all tags to initialize them properly. However, I had some trouble doing 
> this, as this class is a bit strange since it overrides __getattr__. I found 
> weird behaviors when adding functionality to this code. I don't really get 
> how the code functions today, as its own instance variable usage should 
> trigger the custom __getattr__ code, yet we seem to be using these attrs 
> normally with self.X usages.
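
On the __getattr__ question in the description: Python calls __getattr__ only after normal attribute lookup has failed, so attributes assigned in __init__ are found in the instance dictionary and never reach the custom hook. The sketch below illustrates that, together with the "call __getitem__ on all tags" idea. LazyOutputs, its attribute names, and materialize_all are hypothetical stand-ins written for this note; this is not the DoOutputsTuple code from pvalue.py, and strings take the place of real PCollections.

```python
class LazyOutputs:
    """Hypothetical stand-in for a tagged-output container (not Beam code)."""

    def __init__(self, tags):
        # These are stored in the instance __dict__, so reading them later
        # succeeds through normal lookup and never calls __getattr__.
        self._tags = tuple(tags)
        self._pcolls = {}
        self._fallback_calls = []  # names that actually reached __getattr__

    def __getattr__(self, name):
        # Python invokes this only when normal attribute lookup has failed.
        if name.startswith('_'):
            # Avoid recursion if a private attribute is not set yet,
            # for example while the object is being unpickled.
            raise AttributeError(name)
        self._fallback_calls.append(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, tag):
        if tag not in self._tags:
            raise KeyError(tag)
        if tag not in self._pcolls:
            # Placeholder for creating the real output on first access.
            self._pcolls[tag] = 'pcoll:' + tag
        return self._pcolls[tag]

    def materialize_all(self):
        # Touch every declared tag so each has an entry even when no
        # downstream step ever asks for it.
        for tag in self._tags:
            self[tag]
        return dict(self._pcolls)


outs = LazyOutputs(['main', 'errors'])
outs._pcolls          # normal lookup, __getattr__ is not involved
outs.errors           # not an instance attribute, falls back to __getattr__
outs.materialize_all()
```

If the real class follows the same pattern, a new attribute would only behave oddly when it is read before it has been assigned, since that is the one case where the lookup falls through to __getattr__.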



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
