[
https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105750#comment-17105750
]
Robert Burke commented on BEAM-9959:
------------------------------------
The right overall fix is to check for cycles among the composites after the
topological sort, and to report that there is a cycle involving the
*composite* node represented by the scope. Anything short of the full cycle is
much harder to debug. Further, the individual PTransforms involved should be
fully qualified with their composite parent hierarchies, to make it easier to
find where they come from. The error should recommend either merging the two
scopes or similar, or moving the new scope objects into their own functions,
with one scope per function. The latter makes the bad construction impossible.
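
A minimal sketch of the composite-level cycle check described above, using the
A/B/C example from the issue description. Everything here is hypothetical: the
`transform` struct, `compositeEdges`, `findCompositeCycle`, and `issueExample`
are illustrative names and not the Go SDK's internal graph types, and the
sketch uses a plain depth-first search rather than running after a
topological sort.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// transform is a hypothetical, simplified stand-in for a leaf PTransform:
// its name, the composite (scope) that owns it, and the PCollections it
// reads and writes. It is not the Go SDK's internal representation.
type transform struct {
	name      string
	composite string
	inputs    []string
	outputs   []string
}

// compositeEdges lifts leaf-level data flow to the composite level: an edge
// x -> y exists when a PCollection produced inside x is consumed inside y.
func compositeEdges(ts []transform) map[string][]string {
	producer := map[string]string{} // PCollection -> producing composite
	for _, t := range ts {
		for _, out := range t.outputs {
			producer[out] = t.composite
		}
	}
	seen := map[[2]string]bool{}
	edges := map[string][]string{}
	for _, t := range ts {
		if _, ok := edges[t.composite]; !ok {
			edges[t.composite] = nil // register composites with no outgoing edges
		}
		for _, in := range t.inputs {
			from, ok := producer[in]
			if !ok || from == t.composite {
				continue // unknown producer, or data stays inside one composite
			}
			key := [2]string{from, t.composite}
			if !seen[key] {
				seen[key] = true
				edges[from] = append(edges[from], t.composite)
			}
		}
	}
	return edges
}

// findCompositeCycle returns the full cycle as a path such as [a b a],
// or nil if the composite graph is acyclic.
func findCompositeCycle(ts []transform) []string {
	edges := compositeEdges(ts)
	nodes := make([]string, 0, len(edges))
	for n := range edges {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes) // deterministic starting order

	const (
		white = iota // not visited
		grey         // on the current DFS path
		black        // fully explored
	)
	state := map[string]int{}
	var stack []string
	var visit func(n string) []string
	visit = func(n string) []string {
		state[n] = grey
		stack = append(stack, n)
		for _, m := range edges[n] {
			switch state[m] {
			case grey: // back edge: the cycle is the stack from m onward
				for i, s := range stack {
					if s == m {
						return append(append([]string{}, stack[i:]...), m)
					}
				}
			case white:
				if c := visit(m); c != nil {
					return c
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[n] = black
		return nil
	}
	for _, n := range nodes {
		if state[n] == white {
			if c := visit(n); c != nil {
				return c
			}
		}
	}
	return nil
}

// issueExample encodes the A, B, C construction from the issue description.
func issueExample() []transform {
	return []transform{
		{name: "A", composite: "a", outputs: []string{"colA"}},
		{name: "B", composite: "b", inputs: []string{"colA"}, outputs: []string{"colB"}},
		{name: "C", composite: "a", inputs: []string{"colA", "colB"}},
	}
}

func main() {
	if c := findCompositeCycle(issueExample()); c != nil {
		fmt.Println("cycle between composites:", strings.Join(c, " -> "))
	}
}
```

For the example in the issue, the leaf transforms A, B, C form a DAG, but the
sketch reports the composite path a -> b -> a, which is the full cycle an
error message would want to show.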
> Mistakes Computing Composite Inputs and Outputs
> -----------------------------------------------
>
> Key: BEAM-9959
> URL: https://issues.apache.org/jira/browse/BEAM-9959
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Robert Burke
> Assignee: Robert Burke
> Priority: Major
>
> The Go SDK uses a Scope object to manage Beam composites.
> A bug was discovered when consuming a PCollection both in the composite that
> created it and in a separate composite.
> Further, the Go SDK should verify that the root hypergraph structure is a DAG
> and provide a reasonable error if it is not. In particular, the leaf nodes of
> the graph could form a DAG while, because of how the beam.Scope object is
> used, the hypergraph of composites does not.
> E.g., it's possible to write the following in the Go SDK.
> PTransforms A, B, C; PCollections colA, colB; and composites a, b.
> A and C are in a, and B is in b.
> A generates colA.
> B consumes colA and generates colB.
> C consumes colA and colB.
> ```
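> // a and b are sibling composites created from the same parent scope s.
> a := s.Scope("a")
> b := s.Scope("b")
> // A (in a) generates colA.
> colA := beam.Impulse(a)
> // B (in b) consumes colA and generates colB.
> colB := beam.ParDo(b, <doFn>, colA)
> // C (in a) consumes colA and colB, so a depends on b and b depends on a.
> beam.ParDo0(a, <doFn>, colA, beam.SideInput{Input: colB})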
> ```
> If it doesn't already, the Go SDK must emit a clear error, and fail pipeline
> construction.
> If the affected composites are roots in the graph, the cycle makes it
> impossible to topologically sort the root PTransforms of the pipeline graph,
> which can adversely affect runners.
> The recommendation is always to wrap uses of scope in functions or other
> scopes to prevent such incorrect constructions.