kennknowles opened a new issue, #19436:
URL: https://github.com/apache/beam/issues/19436

   I noticed that we are not able to convert the output tag and transform to the 
PCollection name for metrics (element count/mean byte count) if the 
PCollections for the output tags are not consumed by a downstream step.
   
   This isn't critical because (1) arguably there is no PCollection at all, and 
(2) PCollections that are output but not consumed are not critical to count 
metrics on, as they can be optimized away entirely (no need to do any work, 
collect metrics, etc. for an unconsumed PCollection).
   
   However, we are able to count these elements. We just cannot assign a 
PCollection name to them, because in this case the bundle descriptor contains 
no information about that output tag. The alternative fix is to make sure that 
this information is always available, even if the output is not consumed.
   
   Pablo and I looked into this a bit, and he believed the fix would be possible 
in pvalue.py's DoOutputsTuple class. It would require calling __getitem__ on 
all tags to initialize them properly. However, I had some trouble doing this, 
because the class is a bit strange: it overrides __getattr__, and I saw weird 
behavior when adding functionality to it. I don't really understand how the 
code works today, since I would expect its own instance variable accesses to 
trigger the custom __getattr__ code, yet these attributes seem to be used 
normally via self.X.
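
   A possible explanation for the self.X behavior, based on general Python 
semantics rather than on a reading of pvalue.py: Python calls __getattr__ only 
after normal attribute lookup has failed, so attributes assigned in __init__ 
never reach it (that would be __getattribute__). The sketch below illustrates 
this and the "call __getitem__ on all tags" idea. LazyOutputs, materialize_all 
and the string placeholders are hypothetical and are not Beam's 
DoOutputsTuple API.

```python
class LazyOutputs:
    """Hypothetical stand-in for a multi-output container (not Beam code)."""

    def __init__(self, tags):
        # These land in self.__dict__, so later reads use normal lookup
        # and never fall through to __getattr__.
        self._tags = tuple(tags)
        self._materialized = {}
        self._getattr_calls = []

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup raised AttributeError.
        if name.startswith('_'):
            # Avoid treating private/dunder names as output tags.
            raise AttributeError(name)
        self._getattr_calls.append(name)
        return self[name]

    def __getitem__(self, tag):
        if tag not in self._tags:
            raise KeyError(tag)
        if tag not in self._materialized:
            # Placeholder for creating the per-tag output lazily.
            self._materialized[tag] = 'pcoll_for_%s' % tag
        return self._materialized[tag]

    def materialize_all(self):
        # Eagerly touch every declared tag so each output exists even if
        # nothing downstream ever asks for it.
        for tag in self._tags:
            self[tag]
        return dict(self._materialized)


outputs = LazyOutputs(['main', 'errors'])
outputs._tags              # normal lookup, __getattr__ is not called
outputs.errors             # not in __dict__, so __getattr__ handles it
outputs.materialize_all()  # initializes 'main' as well
```

   If the real class behaves the same way, eagerly initializing every tag would 
amount to a loop like materialize_all, though where to call it is a separate 
question.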
   
   Imported from Jira 
[BEAM-7026](https://issues.apache.org/jira/browse/BEAM-7026). Original Jira may 
contain additional context.
   Reported by: [email protected].


