[
https://issues.apache.org/jira/browse/BEAM-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Burke updated BEAM-5378:
-------------------------------
Fix Version/s: 2.35.0
Resolution: Fixed
Status: Resolved (was: Open)
> Ensure all Go SDK examples run successfully
> -------------------------------------------
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Affects Versions: Not applicable
> Reporter: Tomas Roos
> Assignee: Robert Burke
> Priority: P3
> Fix For: 2.35.0
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> I've been spending a day or so running through the example available for the
> Go SDK in order to see what works and on what runner (direct, dataflow), and
> what doesn't and here's the results.
> All available examples for the go sdk. For me as a new developer on apache
> beam and dataflow it would be a tremendous value to have all examples running
> because many of them have legitimate use-cases behind them.
> {code:java}
> ├── complete
> │ └── autocomplete
> │ └── autocomplete.go
> ├── contains
> │ └── contains.go
> ├── cookbook
> │ ├── combine
> │ │ └── combine.go
> │ ├── filter
> │ │ └── filter.go
> │ ├── join
> │ │ └── join.go
> │ ├── max
> │ │ └── max.go
> │ └── tornadoes
> │ └── tornadoes.go
> ├── debugging_wordcount
> │ └── debugging_wordcount.go
> ├── forest
> │ └── forest.go
> ├── grades
> │ └── grades.go
> ├── minimal_wordcount
> │ └── minimal_wordcount.go
> ├── multiout
> │ └── multiout.go
> ├── pingpong
> │ └── pingpong.go
> ├── streaming_wordcap
> │ └── wordcap.go
> ├── windowed_wordcount
> │ └── windowed_wordcount.go
> ├── wordcap
> │ └── wordcap.go
> ├── wordcount
> │ └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct driver (not
> depending on gcp platform services) are runnable.
> On the otherhand these are the tests that needs to be updated because its not
> runnable on the dataflow platform for various reasons.
> I tried to figure them out and all I can do is to pin point at least where it
> fails since my knowledge so far in the beam / dataflow internals is limited.
> .
> ├── complete
> │ └── autocomplete
> │ └── autocomplete.go
> Runs successfully if swapping the input to one of the shakespear data files
> from gs://
> But when running this it yields a error from the top.Largest func (discussed
> in another issue that top.Largest needs to have a serializeable combinator /
> accumulator)
> ➜ autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad
> underlying type: bad field type: bad element: unencodable type: interface
> {}2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │ └── contains.go
> Fails when running debug.Head for some mysterious reason, might have to do
> with the param passing into the x,y iterator. Frankly I dont know and could
> not figure.
> But removing the debug.Head call everything works as expected and succeeds.
> ├── cookbook
> │ ├── combine
> │ │ └── combine.go
> https://github.com/apache/beam/pull/6474
> │ ├── filter
> │ │ └── filter.go
> Fails go-job-1-1536673624017210012
> 2018-09-11 (15:47:13) Output i0 for step was not found.
> │ ├── join
> │ │ └── join.go
> Working as expected! Whey!
> │ ├── max
> │ │ └── max.go
> Working!
> │ └── tornadoes
> │ └── tornadoes.go
> Working!
> ├── debugging_wordcount
> │ └── debugging_wordcount.go
> Works fine!
> ├── forest
> │ └── forest.go
> Bazinga, all good!
> ├── grades
> │ └── grades.go
> So great!
>
> ├── minimal_wordcount
> │ └── minimal_wordcount.go
> Runs only on direct, implemented PR https://github.com/apache/beam/pull/6386
> ├── multiout
> │ └── multiout.go
> Runs like a boss!
> ├── pingpong
> │ └── pingpong.go
> Stating it can't run on dataflow
> // NOTE(herohde) 2/23/2017: Dataflow does not allow cyclic composite
> structures.
> ├── streaming_wordcap
> │ └── wordcap.go
> Brilliant!
> ├── windowed_wordcount
> │ └── windowed_wordcount.go
> All good!
> ├── wordcap
> │ └── wordcap.go
> Runs fine on direct runner but not on dataflow because of input is local and
> is Using
> textio.Immediate, hence not able to pass in a gs:// path
> This is a won't fix
> ├── wordcount
> │ └── wordcount.go
> All good!
> └── yatzy
> └── yatzy.go
> All good!
--
This message was sent by Atlassian Jira
(v8.20.1#820001)