[ 
https://issues.apache.org/jira/browse/BEAM-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-11644:
-----------------------------------
    Fix Version/s: 2.29.0

> translations.pack_combiners optimizer causes breaking change to metrics API
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-11644
>                 URL: https://issues.apache.org/jira/browse/BEAM-11644
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.27.0
>            Reporter: Yifan Mai
>            Assignee: Robert Bradshaw
>            Priority: P1
>             Fix For: 2.29.0
>
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> The translations.pack_combiners optimizer causes a breaking change in the 
> public metrics API. Metrics are keyed and queryable by step name, and the 
> step name can change after combiner packing. Suppose we have a pipeline that 
> looks like `pipeline | CombinePerKey(combinefn_1); pipeline | 
> CombinePerKey(combinefn_2)`, where both combinefn_1 and combinefn_2 
> increment the same counter once per input element. Previously, the result 
> had two counters, one each for steps combinefn_1 and combinefn_2, both with 
> value num_input_elements. After combiner packing, the result has a single 
> counter for Packed[combinefn_1, combinefn_2] with value 2 * 
> num_input_elements.
> Unfortunately, there is no easy fix for this: the runner would have to be 
> aware that a step is a packed step and use the appropriate metrics 
> container for each sub-step.
> The short-term workaround is to (1) add a note for 2.27 under known issues 
> and (2) make this phase opt-in in 2.28.
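
The counter behaviour described above can be illustrated with a toy model in plain Python. This is only a sketch and does not use Beam: the `run` helper, the counter name "elements", and the (step name, counter name) key layout are hypothetical stand-ins, assumed only to mimic metrics being attributed to the enclosing step name.

```python
from collections import Counter


def run(steps, num_input_elements):
    """Toy model of step-keyed counters (not Beam's implementation).

    `steps` is a list of (step_name, combine_fn_names). Every combine fn
    increments the counter "elements" once per input element, and the
    increment is attributed to the name of the step it runs in.
    """
    counters = Counter()
    for step_name, combine_fns in steps:
        for _ in combine_fns:
            counters[(step_name, "elements")] += num_input_elements
    return dict(counters)


n = 5

# Without packing: one step per CombinePerKey, so two separate counters.
unpacked = run(
    [("combinefn_1", ["combinefn_1"]), ("combinefn_2", ["combinefn_2"])], n
)

# With packing: both combine fns run inside a single packed step, so their
# increments land on one counter under the packed step's name.
packed = run(
    [("Packed[combinefn_1, combinefn_2]", ["combinefn_1", "combinefn_2"])], n
)

print(unpacked)
print(packed)
```

In this model, a query by the original step name combinefn_1 finds a counter only in the unpacked case. After packing, the only key is the packed step name, and its value is the sum of both combine fns' increments.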



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
