Robert Burke created BEAM-7709:
----------------------------------
Summary: Flattening multiple outputs of a ParDoN fails
Key: BEAM-7709
URL: https://issues.apache.org/jira/browse/BEAM-7709
Project: Beam
Issue Type: Bug
Components: sdk-go
Affects Versions: Not applicable
Reporter: Robert Burke
Assignee: Robert Burke
Fix For: Not applicable
If a user calls beam.ParDoN with N > 2 and then passes one or more of the
outputs to a Flatten, and the Flatten occurs SDK side, plan construction
currently creates multiple flatten nodes. This causes the downstream ParDo (the
DoFn that consumes the Flatten's output) to be initialized multiple times for a
single bundle.
The fix is to pre-emptively populate the input links with the first created
flatten, so that subsequent tracings of the plan reuse that same flatten node,
the same way the Go direct runner does[1]. That change would go in the exec
translate code[2].
[1] https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/direct/direct.go#L299
[2] https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/translate.go#L493