Nahuel Lofeudo created BEAM-6511:
------------------------------------
Summary: AbstractGlobalCombineFn hierarchy is inconsistent
Key: BEAM-6511
URL: https://issues.apache.org/jira/browse/BEAM-6511
Project: Beam
Issue Type: Bug
Components: runner-dataflow
Affects Versions: 2.9.0
Reporter: Nahuel Lofeudo
Assignee: Tyler Akidau
Subclasses of AbstractGlobalCombineFn appear to be arranged in a way that
prevents some of them from being used with the DataflowRunner.
Every subclass of AbstractGlobalCombineFn extends either CombineFn or
CombineFnWithContext. The latter behaves like a CombineFn that also has access
to PipelineOptions and side inputs, but it is not itself a subclass of
CombineFn.
However, the DataflowRunner casts every combiner passed from user code to
CombineFn (see [1]), so combiners that extend CombineFnWithContext cannot be
used there.
For example:
    public class CustomCombinerFn extends
        CombineWithContext.CombineFnWithContext<...> {...}

    final PCollectionView<SomeObject<String>> newCollection = oldCollection
        .apply("Custom Combiner", Combine.globally(new CustomCombinerFn(filter))
            .withSideInputs(filter)
            .withoutDefaults()
            .asSingletonView());
IMHO either CombineFnWithContext should be a subclass of CombineFn or
DataflowRunner should cast the combiner to AbstractGlobalCombineFn.
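
To illustrate the failure mode and the second option, here is a minimal,
self-contained sketch. The nested classes are simplified stand-ins that share
names with the Beam types mentioned above. They are not the real Beam classes
and only model the sibling relationship described in this report. The helper
methods castToCombineFnSucceeds and castToBaseSucceeds are hypothetical and do
not exist in Beam or in DataflowRunner.

```java
// Simplified model of the hierarchy described in this report.
// These are NOT the real Beam classes, only stand-ins with the same names.
public class CombineHierarchySketch {

    // Stand-in for the common base class.
    abstract static class AbstractGlobalCombineFn {}

    // Stand-in for CombineFn: one branch under the base class.
    abstract static class CombineFn extends AbstractGlobalCombineFn {}

    // Stand-in for CombineFnWithContext: a sibling branch, not a CombineFn.
    abstract static class CombineFnWithContext extends AbstractGlobalCombineFn {}

    // Hypothetical user combiner, analogous to CustomCombinerFn above.
    static class CustomCombinerFn extends CombineFnWithContext {}

    // Mimics a runner that assumes every combiner is a CombineFn.
    static boolean castToCombineFnSucceeds(AbstractGlobalCombineFn fn) {
        try {
            CombineFn unused = (CombineFn) fn;
            return true;
        } catch (ClassCastException e) {
            // Sibling types cannot be cast to each other at runtime.
            return false;
        }
    }

    // Mimics the second proposal: rely only on the common base type.
    static boolean castToBaseSucceeds(Object fn) {
        try {
            AbstractGlobalCombineFn unused = (AbstractGlobalCombineFn) fn;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        AbstractGlobalCombineFn fn = new CustomCombinerFn();
        System.out.println("cast to CombineFn succeeds: "
            + castToCombineFnSucceeds(fn));
        System.out.println("cast to AbstractGlobalCombineFn succeeds: "
            + castToBaseSucceeds(fn));
    }
}
```

In this model the cast to CombineFn fails for a combiner that extends
CombineFnWithContext, while the cast to the common base type succeeds. Whether
the real runner can work with only the base type depends on which CombineFn
methods it calls afterwards, which this sketch does not cover.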
[1] https://github.com/apache/beam/blob/b83b302ef97767e4ca245ea24e8bd40a6692e72c/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L514