Robert Burke created BEAM-7709:
----------------------------------

             Summary: Flattening multiple outputs of a ParDoN fails
                 Key: BEAM-7709
                 URL: https://issues.apache.org/jira/browse/BEAM-7709
             Project: Beam
          Issue Type: Bug
          Components: sdk-go
    Affects Versions: Not applicable
            Reporter: Robert Burke
            Assignee: Robert Burke
             Fix For: Not applicable


If a user calls beam.ParDoN with N > 2 and then passes one or more of the 
outputs to a flatten, and the flatten is executed SDK side, the SDK currently 
creates multiple flatten nodes. This causes the downstream pardo (the DoFn that 
consumes the Flatten's output) to be initialized multiple times for a single 
bundle.

The fix is to pre-emptively populate the input links with the first created 
flatten, so that subsequent tracings of the plan reuse the same flatten node, 
the same way the Go direct runner does [1]. That would happen in the exec 
translate code [2].
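
The idea behind the fix can be sketched as follows. This is only a toy model, 
not the actual Beam code: the node, flattenEdge and builder types and the 
makeFlatten method are hypothetical names made up for illustration, and the 
real exec translate code is structured differently. It shows the first created 
flatten being registered under every one of its inputs, so that reaching the 
flatten again through a sibling input returns the existing node.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an execution-plan unit.
type node struct {
	id   int
	kind string
}

// flattenEdge is a hypothetical graph edge: several input collections
// feeding a single flatten.
type flattenEdge struct {
	inputs []string // IDs of the input collections
}

// builder traces a plan from its inputs toward downstream consumers.
type builder struct {
	nextID  int
	links   map[string]*node // input collection ID -> flatten node consuming it
	created []*node          // every flatten node built so far
}

func newBuilder() *builder {
	return &builder{links: make(map[string]*node)}
}

// makeFlatten returns the flatten node that consumes the input `from`.
// The first time an edge is seen, the node is created and pre-emptively
// registered under every input of that edge, so later tracings that
// arrive via a sibling input reuse it instead of building a duplicate.
func (b *builder) makeFlatten(e *flattenEdge, from string) *node {
	if n, ok := b.links[from]; ok {
		return n
	}
	b.nextID++
	n := &node{id: b.nextID, kind: "flatten"}
	b.created = append(b.created, n)
	for _, in := range e.inputs {
		b.links[in] = n
	}
	return n
}

func main() {
	// Three outputs of one multi-output ParDo feed the same flatten.
	e := &flattenEdge{inputs: []string{"out0", "out1", "out2"}}
	b := newBuilder()
	for _, in := range e.inputs {
		n := b.makeFlatten(e, in)
		fmt.Printf("%s -> flatten #%d\n", in, n.id)
	}
	fmt.Println("flatten nodes created:", len(b.created))
}
```

With a single shared flatten node, the downstream consumer hangs off one node 
and would therefore be set up once per bundle rather than once per input.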


[1] https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/direct/direct.go#L299

[2] https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/translate.go#L493



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
