[
https://issues.apache.org/jira/browse/BEAM-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-11644:
-----------------------------------
Fix Version/s: 2.29.0
> translations.pack_combiners optimizer causes breaking change to metrics API
> ---------------------------------------------------------------------------
>
> Key: BEAM-11644
> URL: https://issues.apache.org/jira/browse/BEAM-11644
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.27.0
> Reporter: Yifan Mai
> Assignee: Robert Bradshaw
> Priority: P1
> Fix For: 2.29.0
>
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> The translations.pack_combiners optimizer causes a breaking change in the
> public metrics API. The issue arises because metrics are keyed and queryable
> by step name, and the step name can change after combiner packing. Suppose we
> have a pipeline that looks like `pipeline | CombinePerKey(combinefn_1);
> pipeline | CombinePerKey(combinefn_2)` and both combinefn_1 and combinefn_2
> increment the same counter once per input element. Previously, the result
> had two counters, one each for steps combinefn_1 and combinefn_2, each with
> value num_input_elements. After combiner packing, the result has a single
> counter for Packed[combinefn_1, combinefn_2] with value 2 *
> num_input_elements.
> Unfortunately there is no easy fix, because the runner would have to know
> that a step is a packed step and use the appropriate metrics container for
> each sub-step.
> The short-term workaround is to (1) add a note for 2.27 under known issues
> and (2) make this phase opt-in in 2.28.
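
The effect described in the issue can be illustrated with a small stand-alone simulation. This is only a sketch and does not use the Beam API: the `run` helper, the counter name `elements_seen`, and the exact packed step label are assumptions made for illustration. It models metrics as a mapping keyed by (step name, counter name), which is the property that makes the packed pipeline report one merged counter.

```python
from collections import Counter


def run(steps, elements):
    """Simulate a run; return counters keyed by (step name, counter name).

    `steps` maps a step name to the list of combine functions executed
    inside that step (hypothetical model, not Beam's real data structures).
    """
    metrics = Counter()
    for step_name, combine_fns in steps.items():
        for _element in elements:
            for _fn in combine_fns:
                # Every combine function bumps the same counter once per
                # element, attributed to the step it is running in.
                metrics[(step_name, "elements_seen")] += 1
    return metrics


elements = [("a", 1), ("a", 2), ("b", 3)]

# Before packing: each CombinePerKey is its own step.
unpacked = run(
    {"combinefn_1": ["combinefn_1"], "combinefn_2": ["combinefn_2"]},
    elements,
)

# After packing: both combine functions run inside one packed step.
packed = run(
    {"Packed[combinefn_1, combinefn_2]": ["combinefn_1", "combinefn_2"]},
    elements,
)

for name, result in (("unpacked", unpacked), ("packed", packed)):
    print(name, dict(result))
```

In this model a query by the original step name combinefn_1 finds a counter in the unpacked run but nothing in the packed run, where the only counter sits under the packed step name and holds twice the number of input elements.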