[
https://issues.apache.org/jira/browse/BEAM-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850987#comment-16850987
]
Robert Burke commented on BEAM-5354:
------------------------------------
I don't believe there's a Jira for that, largely because AFAICT it only
affects Dataflow, and Dataflow predominantly supports use via
the Legacy SDK implementations rather than pure portable ones. The issue comes
from Dataflow's v1b3 representation of the graph, which is a topologically
ordered set of steps (transforms as nodes, with PCollections being implicit),
rather than the normalized representation of the Portable Pipeline Proto (PPP).
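To make the difference between the two representations concrete, here is a minimal sketch in plain Go. The `Transform` and `Step` types and the `ToSteps` function are hypothetical simplifications invented for illustration; they are not the Beam proto types or the Dataflow v1b3 API. The sketch assumes a normalized graph stores transforms by ID and refers to collections only by ID, while a step list names the producing step directly in each input.

```go
package main

import (
	"fmt"
	"sort"
)

// Transform is a hypothetical, simplified stand-in for a transform in a
// normalized pipeline: it refers to collections by ID only.
type Transform struct {
	Inputs  []string // IDs of consumed collections
	Outputs []string // IDs of produced collections
}

// Step is a hypothetical, simplified stand-in for a step in an ordered
// graph: each input names its producing step, so collections are implicit.
type Step struct {
	Name   string
	Inputs []string // formatted as "producerStep.collectionID"
}

// ToSteps flattens a normalized transform map into topologically ordered
// steps. It fails if an input has no producer or the graph has a cycle.
func ToSteps(transforms map[string]Transform) ([]Step, error) {
	producer := map[string]string{} // collection ID -> producing transform ID
	for id, t := range transforms {
		for _, out := range t.Outputs {
			producer[out] = id
		}
	}
	ids := make([]string, 0, len(transforms))
	for id := range transforms {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic traversal order

	var steps []Step
	state := map[string]int{} // 0 = unvisited, 1 = in progress, 2 = done
	var visit func(id string) error
	visit = func(id string) error {
		switch state[id] {
		case 1:
			return fmt.Errorf("cycle at transform %q", id)
		case 2:
			return nil
		}
		state[id] = 1
		s := Step{Name: id}
		for _, in := range transforms[id].Inputs {
			p, ok := producer[in]
			if !ok {
				return fmt.Errorf("no producer for collection %q", in)
			}
			// Emit the producer first so that steps stay ordered.
			if err := visit(p); err != nil {
				return err
			}
			s.Inputs = append(s.Inputs, p+"."+in)
		}
		state[id] = 2
		steps = append(steps, s)
		return nil
	}
	for _, id := range ids {
		if err := visit(id); err != nil {
			return nil, err
		}
	}
	return steps, nil
}

func main() {
	steps, err := ToSteps(map[string]Transform{
		"read":  {Outputs: []string{"c0"}},
		"head":  {Inputs: []string{"c0"}, Outputs: []string{"c1"}},
		"print": {Inputs: []string{"c1"}},
	})
	fmt.Println(steps, err)
}
```

In the step form, a consumer can only be expressed once its producer is known, which is why this kind of translation is sensitive to how transforms are arranged.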
The behavior did reveal a subtle but mostly benign bug in the Go SDK, though.
The Go SDK takes its own internal representation and converts it to the PPP.
In some cases (notably CoGBK and side inputs), it needs to add additional
PTransforms and PCollections. The additional PTransforms were not being correctly
added as sibling transforms of the side input consumer, and were instead added
at the root transform level. I don't believe any runner based purely on the PPP
would complain, though, as having inputs from outside the current scope is
permitted by the model, and the relations between transforms are
indirected through the collections.
The overall fix would be for Dataflow to stop requiring its own v1b3
representation and to require only the PPP for portable SDKs
(modulo service+SDK version support horizons etc.). The Go SDK, for example,
generates the v1b3 directly from the PPP, so the only thing stopping the
service from doing so is adding its own translation to spur the migration.
> Side Inputs seems to be non-working in the sdk-go
> -------------------------------------------------
>
> Key: BEAM-5354
> URL: https://issues.apache.org/jira/browse/BEAM-5354
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Tomas Roos
> Assignee: Robert Burke
> Priority: Major
>
> Running the contains example fails with
>
> {code:java}
> Output i0 for step was not found.
> {code}
> This is because of the call to debug.Head (which internally uses SideInput).
> After removing the following line
> [https://github.com/apache/beam/blob/master/sdks/go/examples/contains/contains.go#L50]
>
> the pipeline executes well.
>
> Executed on IDs
>
> go-job-1-1536664417610678545
> vs
> go-job-1-1536664934354466938
>