[
https://issues.apache.org/jira/browse/BEAM-6740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777066#comment-16777066
]
Etienne Chauchot commented on BEAM-6740:
----------------------------------------
[~kenn] OK, so it is indeed a bug that needs to be fixed. It might have worked
in the past; otherwise there would not be any special Combine.globally
translation. Nevertheless, I cannot spot the point in history where it changed.
IMHO, a given composite transform should be translated to its inner
primitives only if no direct translation for it is registered in the runner.
Do you agree?
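To make the proposed rule concrete, here is a minimal sketch in plain Java. It is a toy model, not Beam's runner API: TranslationSketch, Node, translate and the transform names used as registry keys are all hypothetical, and Combine.PerKey is treated as a leaf for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of a transform hierarchy; none of this is Beam's actual API.
class TranslationSketch {

    /** A node in the hierarchy; a node without children stands for a primitive. */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    /**
     * Walks the hierarchy depth-first and returns the names of the transforms that
     * were translated. A node with a registered direct translation is translated as
     * a whole and its children are skipped; any other composite is expanded.
     */
    static List<String> translate(Node root, Set<String> registered) {
        List<String> translated = new ArrayList<>();
        visit(root, registered, translated);
        return translated;
    }

    private static void visit(Node node, Set<String> registered, List<String> out) {
        if (registered.contains(node.name)) {
            out.add(node.name); // direct translation wins, do not enter the node
            return;
        }
        if (node.children.isEmpty()) {
            throw new IllegalStateException("No translation registered for " + node.name);
        }
        for (Node child : node.children) {
            visit(child, registered, out); // fall back to the inner transforms
        }
    }

    public static void main(String[] args) {
        Node pipeline = new Node("Pipeline",
            new Node("Create"),
            new Node("Combine.Globally",
                new Node("AssignVoidKeys"),
                new Node("Combine.PerKey")));

        // Runner registers a direct Combine.Globally translation: the composite is not entered.
        System.out.println(translate(pipeline,
            Set.of("Create", "Combine.Globally", "AssignVoidKeys", "Combine.PerKey")));
        // prints [Create, Combine.Globally]

        // No direct translation registered: the composite is expanded to its inner transforms.
        System.out.println(translate(pipeline,
            Set.of("Create", "AssignVoidKeys", "Combine.PerKey")));
        // prints [Create, AssignVoidKeys, Combine.PerKey]
    }
}
```

The bug described in this issue corresponds to the second output being produced even though the runner has registered a Combine.Globally translation.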
> Combine.globally translation is never called
> --------------------------------------------
>
> Key: BEAM-6740
> URL: https://issues.apache.org/jira/browse/BEAM-6740
> Project: Beam
> Issue Type: Bug
> Components: runner-core
> Reporter: Etienne Chauchot
> Priority: Major
>
> SDK translates Combine.Globally as a composite transform composed of:
> * Map that assigns Void keys
> * Combine.PerKey
> On Spark: as Combine.PerKey uses a Spark GBK internally, the runner adds its
> own translation of Combine.Globally to avoid the less performant GBK. This
> translation should be called instead of entering the composite transform's
> translation. A pipeline like this:
> {code}
> PCollection<Integer> input = pipeline.apply(Create.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
> input.apply(Combine.globally(new IntegerCombineFn()));
> {code}
> {code}
> private static class IntegerCombineFn extends Combine.CombineFn<Integer, Integer, Integer> {
>   @Override
>   public Integer createAccumulator() {
>     return 0;
>   }
>
>   @Override
>   public Integer addInput(Integer accumulator, Integer input) {
>     return accumulator + input;
>   }
>
>   @Override
>   public Integer mergeAccumulators(Iterable<Integer> accumulators) {
>     Integer result = 0;
>     for (Integer value : accumulators) {
>       result += value;
>     }
>     return result;
>   }
>
>   @Override
>   public Integer extractOutput(Integer accumulator) {
>     return accumulator;
>   }
> }
> {code}
> is translated as the above composite.