This is a status update.

Version 2.33.0 has been released, but the standard Go tools have trouble 
resolving the tagged versions. This is currently under investigation.

Importing the v2 path in a Go program and then running `go mod tidy` populates 
the go.mod file with a pseudo-version rather than the latest tag (v2.33.0). 
For example, the line looks like:
require github.com/apache/beam/sdks/v2 v2.0.0-20211013181004-a9120e083008

While this works, it's not the experience we want for users at this point. 
The current downside is that, for some reason, the release tags are not usable 
as version targets. However, we retain the other benefits of Go Modules (actual 
dependency versioning, and management by the Go tools).

The issue is some combination of the Go tooling [A], the fact that we added a 
go.mod file outside of the repo root [B], and the fact that we did not 
increment the major version (v2 -> v3) when adding the go.mod file [C].

[B] According to the Go documentation, this is legal, even if it's not 
recommended. That's fortunate, because a go.mod file at the root of the repo 
would have interacted poorly with the root vendor directory, which the Go 
tools have opinions on.

[C] The Go Modules documentation recommends incrementing the major version 
when transitioning to Go Modules. However, it never says this is required, nor 
does it describe the current failure mode. If this isn't another bug, it 
should at least be documented there. We would not necessarily want to declare 
a global v3 for Beam at this time just for the Go SDK; that would become 
confusing rather quickly. Notionally, there are some larger breaking changes 
the Java and Python SDKs would want to make in such an event, so it's a larger 
conversation that is out of scope at this time.

This leaves [A], where I may have misunderstood the documented semantics. I 
certainly expected the non-root Go module to inherit the tagged version from 
the parent repo, not to ignore it wholesale. As a result, I'll be filing a bug 
against the Go tools to determine whether this is the case, and to see what 
paths forward exist.
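
For reference, the Go Modules Reference states that when a module is defined 
in a repository subdirectory, each version tag must be prefixed with that 
subdirectory followed by a slash. Whether that rule explains the behavior 
described here is an assumption, not something confirmed in this thread. A 
minimal sketch of the tag name the go command would look for under that rule, 
using a hypothetical helper `expectedTag`:

```go
package main

import (
	"fmt"
	"strings"
)

// expectedTag returns the VCS tag name that corresponds to a module version
// when the module's go.mod file lives in subdir, relative to the repository
// root. An empty subdir means the module is at the root.
// This is a hypothetical helper for illustration, not a Go tooling API.
func expectedTag(subdir, version string) string {
	subdir = strings.Trim(subdir, "/")
	if subdir == "" {
		return version
	}
	return subdir + "/" + version
}

func main() {
	// The Beam go.mod file lives in sdks/, per the earlier message in this thread.
	fmt.Println(expectedTag("sdks", "v2.33.0"))
	// A module at the repository root uses the bare version as its tag.
	fmt.Println(expectedTag("", "v2.33.0"))
}
```

If the assumption holds, a repository tag named only v2.33.0 would not be 
matched to the module in sdks/, which would be consistent with the go command 
falling back to a pseudo-version.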

I hope to resolve this before we write a proper Experimental Exit blog post 
for the Go SDK.

Thank you for your patience and time.
Robert Burke
Beam Go Busybody




On 2021/08/23 18:12:00, Robert Burke <[email protected]> wrote: 
> With 2.32 the LICENSE issue has been fixed [1], and the SDK now uses Go 
> Modules for dependency management, simplifying Go SDK contributions. [2]
> 
> The Module file lives in the sdks/ directory so there's a single Go Module 
> for the whole SDK, tests, examples, and any support code for the container 
> boot builds. This excludes the Go SDK Code katas [3] go modules which can be 
> updated once 2.33.0 has been released.
> 
> PR 15365 [4] adds the SDK containers back to the release builds, and by 
> default uses the release-specific container for Docker execution jobs. For at 
> least the 2.33.0 release this does mean that manual validation will need to 
> explicitly specify RC versions of containers. However, given that the Go SDK 
> container and worker boot process rarely change, this is unlikely to be an 
> issue.
> 
> At present I'm cleaning up some of the references to experimental, and making 
> it clear that 2.33.0 is the first non-experimental release (even though 
> that's 4-6 weeks out from actual release.) CHANGES.md  will be updated to 
> note the event, but a larger blogpost will happen after the release goes 
> public.
> 
> Cheers,
> Robert Burke
> De facto Beam Go TL.
> 
> [1] 
> https://pkg.go.dev/github.com/apache/[email protected]+incompatible/sdks/go/pkg/beam
> [2] https://github.com/apache/beam/pull/15323
> [3] https://github.com/apache/beam/tree/master/learning/katas/go
> [4] https://github.com/apache/beam/pull/15365
> 
> On 2021/06/28 23:12:19, Ahmet Altay <[email protected]> wrote: 
> > +1, congratulations & thank you!
> > 
> > On Tue, Jun 22, 2021 at 3:15 PM Robert Burke <[email protected]> wrote:
> > 
> > > Regarding documentation update: Initial PR is
> > > https://github.com/apache/beam/pull/15057 which goes up to section ~4.3.
> > > JIRA link for Programming Guide changes:
> > > https://issues.apache.org/jira/browse/BEAM-12513
> > >
> > >
> > > On 2021/06/17 14:58:54, Robert Burke <[email protected]> wrote:
> > > > Yup!
> > > >
> > > > My immediate plan is to work on incorporating the Go SDK fully into the
> > > > Beam Programming Guide. I've audited the guide, and
> > > > am beginning to add missing content and filling in the Go specific gaps.
> > > > This will be tied to improving the Go Doc with more Go
> > > > specific user documentation that isn't appropriate for the BPG.
> > > >
> > > > My audit of the guide is here:
> > > >
> > > https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
> > > >
> > > > The other sheets focus on features and tests. The feature page looks
> > > worse
> > > > than it is, as it was more productive to focus on what isn't available
> > > than
> > > > what is. That's a snapshot of my actual working sheet but I'll be
> > > updating
> > > > it as needed.
> > > >
> > > > On Thu, Jun 17, 2021, 6:23 AM Ismaël Mejía <[email protected]> wrote:
> > > >
> > > > > Oups forgot to write one question. Will this come with revamped
> > > > > website instructions/doc for golang too?
> > > > >
> > > > > On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía <[email protected]>
> > > wrote:
> > > > > >
> > > > > > Huge +1
> > > > > >
> > > > > > This is definitely something many people have asked about, so it is
> > > > > > great to see it finally happening.
> > > > > >
> > > > > > On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles <[email protected]>
> > > wrote:
> > > > > > >
> > > > > > > +1 awesome
> > > > > > >
> > > > > > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke <[email protected]
> > > >
> > > > > wrote:
> > > > > > >>
> > > > > > >> Sounds reasonable to me. I agree. We'll aim to get those (Go
> > > modules
> > > > > and LICENSE issue) done before the 2.32 cut, and certainly before the
> > > 2.33
> > > > > cut if release images aren't added to the 2.32 process.
> > > > > > >>
> > > > > > >> Regarding Go Generics: at some point in the future, we may want a
> > > > > > harder break between a newer Generic first API and the current
> > > version,
> > > > > but there's no rush. Generics/TypeParameters in Go aren't identical to
> > > the
> > > > > feature referred to by that term in Java, C++, Rust, etc, so it'll
> > > take a
> > > > > bit of time for that expertise to develop.
> > > > > > >>
> > > > > > >> However, by the current nature of Go, we had to have pretty
> > > > > sophisticated reflective analysis to handle DoFns and map them to 
> > > > > their
> > > > > graph inputs. So, adding new helpers like a KV, emitter, and Iterator
> > > > > types, shouldn't be too difficult. Changing Go SDK internals to use
> > > > > generics (like the implementation of Stats DoFns like Min, Max, etc)
> > > would
> > > > > also be able to be made transparently to most users, and certainly any
> > > of
> > > > > the framework for execution time handling (the "worker's SDK harness")
> > > > > would be able to be cleaned up if need be. Finally, adding more
> > > > > sophisticated DoFn registration and code generation would be able to
> > > > > replace the optional code generator entirely, saving some users a `go
> > > > > generate` step, simplifying getting improved execution performance.
> > > > > > >>
> > > > > > >> Changing things like making a Type Parameterized PCollection,
> > > would
> > > > > be far more involved, as would trying to use some kind of Apply
> > > format. The
> > > > > lack of Method Overrides prevents the apply chaining approach. Or at
> > > least
> > > > > prevents it from working simply.
> > > > > > >>
> > > > > > >> Finally, Go Generics won't be available until Go 1.18, which 
> > > > > > >> isn't
> > > > > until next year. See https://blog.golang.org/generics-proposal for
> > > > > details.
> > > > > > >>
> > > > > > >> Go 1.17 https://tip.golang.org/doc/go1.17 does include a Register
> > > > > calling convention, leading to a modest performance improvement across
> > > the
> > > > > board.
> > > > > > >>
> > > > > > >> Cheers,
> > > > > > >> Robert Burke
> > > > > > >>
> > > > > > >> On 2021/06/15 18:10:46, Robert Bradshaw <[email protected]>
> > > wrote:
> > > > > > >> > +1 to declaring Golang support out of experimental once the Go
> > > > > Modules
> > > > > > >> > issues are solved. I don't think an SDK needs to support every
> > > > > feature
> > > > > > >> > to be accepted, especially now that we can do cross-language
> > > > > > >> > transforms, and Go definitely supports enough to be quite
> > > useful.
> > > > > (WRT
> > > > > > >> > streaming, my understanding is that Go supports the streaming
> > > model
> > > > > > >> > with windows and timestamps, and runs fine on a streaming
> > > runner,
> > > > > even
> > > > > > >> > if more advanced features like state and timers aren't yet
> > > > > available.)
> > > > > > >> >
> > > > > > >> > This is a great milestone.
> > > > > > >> >
> > > > > > >> > On Tue, Jun 15, 2021 at 10:12 AM Tyson Hamilton <
> > > [email protected]>
> > > > > wrote:
> > > > > > >> > >
> > > > > > >> > > WOW! Big news.
> > > > > > >> > >
> > > > > > >> > > I'm supportive of leaving experimental status after Go 
> > > > > > >> > > Modules
> > > > > are completed and the LICENSE issue is resolved. I don't think that
> > > lacking
> > > > > streaming support is a blocker. The other thing I checked to see was 
> > > > > if
> > > > > there were metrics available on metrics.beam.apache.org, specifically
> > > for
> > > > > measuring code health via post-commit over time, which there are and
> > > the
> > > > > passing test rate is high (Huzzah!). The one thing that surprised me
> > > from
> > > > > your summary is that when Go introduces generics it won't result in 
> > > > > any
> > > > > backwards incompatible changes in Apache Beam. That's great news, but
> > > does
> > > > > it mean there will be a need to support both non-generic and generic
> > > APIs
> > > > > moving forward? It seems like generics will be introduced in the Go
> > > 1.17
> > > > > release (optimistically) in August this year.
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Thu, Jun 10, 2021 at 5:04 PM Robert Burke <
> > > [email protected]>
> > > > > wrote:
> > > > > > >> > >>
> > > > > > >> > >> Hello Beam Community!
> > > > > > >> > >>
> > > > > > >> > >> I propose we stop calling the Apache Beam Go SDK
> > > experimental.
> > > > > > >> > >>
> > > > > > >> > >> This thread is to discuss it as a community, and any
> > > conditions
> > > > > that remain that would prevent the exit.
> > > > > > >> > >>
> > > > > > >> > >> tl;dr;
> > > > > > >> > >> Ask Questions for answers and links! I have both.
> > > > > > >> > >> This entails including it officially in the Release process,
> > > > > removing the various "experimental" text throughout the repo etc,
> > > > > > >> > >> and otherwise treating it like Python and Java. Some Go
> > > specific
> > > > > tasks around dep versioning.
> > > > > > >> > >>
> > > > > > >> > >> The Go SDK implements the beam model efficiently for most
> > > batch
> > > > > tasks, including basic windowing.
> > > > > > >> > >> Apache Beam Go jobs can execute, and are tested on all
> > > Portable
> > > > > runners.
> > > > > > >> > >> The core APIs are not going to change in incompatible ways
> > > going
> > > > > forward.
> > > > > > >> > >> Scalable transforms can be written through SplittableDoFns 
> > > > > > >> > >> or
> > > > > via Cross Language transforms.
> > > > > > >> > >>
> > > > > > >> > >> The SDK isn't 100% feature complete, but keeping it
> > > experimental
> > > > > doesn't help with that any further.
> > > > > > >> > >> Communities grow through contributions and use, and
> > > experimental
> > > > > markers dissuade users.
> > > > > > > >> > >> There's plenty to do in order to expand what can be done with
> > > the
> > > > > SDK. (Contributions welcome)
> > > > > > >> > >>
> > > > > > >> > >> Why Exit Experimental now?
> > > > > > >> > >>
> > > > > > >> > >> Typically when we call an SDK or API Experimental, it's
> > > because
> > > > > there's a risk that API or behaviors may change significantly.
> > > > > > >> > >> This in turn, leads to additional work for users of the SDK
> > > on
> > > > > every release which leads to sticking to older versions or forking
> > > > > > >> > >> to preserve behavior. Version updates should be looked
> > > forward
> > > > > to, and viewed as having little risk. Further while there's been
> > > > > > > >> > >> previous discussion about what the "low bar" is for a new
> > > SDK, it
> > > > > hasn't been summarily applied to the Go SDK. I feel this has
> > > > > > >> > >> hurt development and contribution of new SDK languages
> > > (inherent
> > > > > difficulty of SDK development notwithstanding).
> > > > > > >> > >>
> > > > > > >> > >> When the SDK was designed, it wasn't entirely clear what the
> > > > > Beam Model should look like in an opinionated language like Go.
> > > > > > >> > >> Their initial take (see
> > > > > https://s.apache.org/beam-go-sdk-design-rfc [0]) goes into detail
> > > what it
> > > > > means for a language without
> > > > > > >> > >> Generics, or overloading, or inheritance to implement the
> > > beam
> > > > > model. One could largely throw away static types (like Python),
> > > > > > >> > >> but this approach rings hollow for Go. It would not do if 
> > > > > > >> > >> the
> > > > > approach couldn't grow and scale to the Beam Model. It's also hard
> > > > > > >> > >> to tell if an API is any good before there are users.
> > > > > > >> > >>
> > > > > > >> > >> Further, in the early days of Portability, there wasn't a
> > > way to
> > > > > write scalable DoFns, dynamically or otherwise. It's an incredible
> > > > > > >> > >> bottleneck to need to do all initial fanout of work on a
> > > single
> > > > > machine, write everything to a Reshuffle, just in order to scale up.
> > > > > > >> > >> Without being able to scale, Beam is little more than
> > > overhead.
> > > > > > >> > >>
> > > > > > >> > >> At this point, both of these needs are met within the Go SDK
> > > for
> > > > > open source.
> > > > > > >> > >>
> > > > > > >> > >> Background
> > > > > > >> > >>
> > > > > > >> > >> The Go SDK has been a part of the beam repo for a few years
> > > now,
> > > > > since it was accidentally merged into master.
> > > > > > >> > >> Since then it's been called experimental, and not officially
> > > > > part of the releases.
> > > > > > >> > >>
> > > > > > > >> > >> Of the SDKs, it was always designed around Beam 
> > > > > > >> > >> Portability
> > > > > first. It never had any "Legacy" (SDK x Runner specific ) workers.
> > > > > > >> > >> It's always used the Beam Pipeline protos and FnAPI to
> > > execute
> > > > > jobs, first with some very experimental code on Dataflow, but now
> > > > > > >> > >> on all portable supported runners, like Flink, Spark, the
> > > Python
> > > > > Portable runner, and Dataflow.
> > > > > > >> > >>
> > > > > > >> > >> API Stability
> > > > > > >> > >>
> > > > > > > >> > >> The Go SDK hasn't meaningfully changed its user API for 
> > > > > > >> > >> DoFn
> > > > > and pipeline construction since it was first merged in, and there are
> > > no
> > > > > > >> > >> changes to that on the horizon that can't be made in a
> > > backwards
> > > > > compatible manner. Largely these are related to New Features, or
> > > > > > >> > >> usability improvements enabled by the advent of Go Generics
> > > > > (think of "real" KV, emitter, and iterator types).
> > > > > > >> > >>
> > > > > > >> > >> It's an open secret that the Go SDK has largely been under
> > > work
> > > > > > for use within Google. Its use is called FlumeGo, representing
> > > > > > >> > >> the Apache Beam Go SDK, running on top of Flume, Google's
> > > batch
> > > > > pipeline processing engine. Thus most of the focus on improving
> > > > > > >> > >> batch execution. FlumeGo sees ample use today, and there
> > > hasn't
> > > > > been a call for fundamental changes to the API for ergonomic or
> > > > > > >> > >> usability concerns.
> > > > > > >> > >>
> > > > > > >> > >> Scalability
> > > > > > >> > >>
> > > > > > >> > >> Google could get away without the Go SDK having an SDK side
> > > > > > scalability solution as a result of its integration with Flume.
> > > > > > >> > >> However, those days are now past.
> > > > > > >> > >>
> > > > > > >> > >> The Go SDK now supports SplittableDoFns along with Dynamic
> > > > > Splitting, which supports writing scalable batch transforms natively
> > > > > > >> > >> in the Go SDK.
> > > > > > >> > >> The SDK also supports Cross Language Transforms, with Beam
> > > > > Schema encodings. With it, production hardened transforms
> > > > > > >> > >> from Java and Python are a wrapper away.
> > > > > > >> > >>
> > > > > > >> > >> Presently, Daniel Oliveira (who implemented the SDF side
> > > work,
> > > > > and completed the Xlang work,) is adding a wrapper for the
> > > > > > > >> > >> Java Kafka IO using Cross Language Transforms, which has 
> > > > > > >> > >> often
> > > > > been requested. This will also enable use of the Beam SQL
> > > > > > >> > >> transforms that java enables.
> > > > > > >> > >>
> > > > > > >> > >> Features
> > > > > > >> > >>
> > > > > > > >> > >> The Go SDK implements the Beam core. The Go SDK implements
> > > > > standard coders, allows for user DoFns, and CombineFns and access
> > > > > > >> > >> to core transforms like Flatten, GroupByKey, and features
> > > like
> > > > > Side Inputs, Windowing, and User Metrics.
> > > > > > >> > >> Basic windowing will be fully supported for batch even
> > > through
> > > > > lifted combines in the 2.32.0 release.
> > > > > > >> > >>
> > > > > > >> > >> All of the above enables Beam Go to be versatile for batch
> > > > > execution on portable runners, and for simple streaming pipelines.
> > > > > > >> > >>
> > > > > > >> > >> Repo Testing
> > > > > > >> > >>
> > > > > > > >> > >> On precommit the Go SDK runs all its unit tests. On top of
> > > > > > that, it runs all its integration tests against the Python Portable
> > > runner,
> > > > > > >> > >> making it quick and robust to detect breaking changes 
> > > > > > >> > >> without
> > > > > overspending community resources. Those same tests are also
> > > > > > >> > >> run against Dataflow, Flink, and Spark.
> > > > > > >> > >>
> > > > > > >> > >> The tests are executable against all runners via the
> > > appropriate
> > > > > Go commands (if you've stood up your own job management server),
> > > > > > >> > >> or Gradle commands (which will spin up runner instances for
> > > > > you). Documentation for executing tests and adding new ones
> > > > > > >> > >> is on the wiki. [2] They are accessible to Go developers as
> > > > > they're implemented with the standard Go testing tools.
> > > > > > >> > >>
> > > > > > >> > >> Shortcomings
> > > > > > >> > >> That said, there's still much to do. Let me briefly tell you
> > > > > what doesn't work, and it's up to you to weigh whether they block
> > > > > > >> > >> being out of experimental.
> > > > > > >> > >>
> > > > > > >> > >> At present, only a textio has been implemented as Splittable
> > > > > DoFn.
> > > > > > > >> > >> Once the Kafka wrapper is merged in, it will serve as the
> > > > > first example for future contributions for
> > > > > > >> > >> new transform wrappers for the Go SDK.
> > > > > > >> > >> Transforms and IOs are lacking, but at this point users are
> > > > > empowered to write their own DoFns or wrap existing transforms for
> > > Cross
> > > > > Language use.
> > > > > > >> > >>
> > > > > > >> > >> In the core SDK, more streaming focused features have yet to
> > > be
> > > > > implemented, but they're largely additions to what exists already
> > > > > > > >> > >> rather than total rebuilds. Much of the work is defining
> > > how a
> > > > > user specifies their desires, and turning those into the appropriate
> > > > > > >> > >> FnAPI requests at execution time. Back in October I wrote at
> > > > > length on the wiki [1] what's missing for additional streaming
> > > features.
> > > > > > >> > >>
> > > > > > >> > >> While we have bolstered our testing recently, there's likely
> > > > > still more we could test to improve our confidence in the SDK,
> > > > > > >> > >> in particular regarding the included transforms libraries 
> > > > > > >> > >> and
> > > > > examples.
> > > > > > >> > >>
> > > > > > >> > >> Moving Forward
> > > > > > >> > >>
> > > > > > >> > >> My immediate plan is to work on incorporating the Go SDK
> > > fully
> > > > > into the Beam Programming Guide. I've audited the guide [3], and
> > > > > > >> > >> am beginning to add missing content and filling in the Go
> > > > > specific gaps. This will be tied to improving the Go Doc with more Go
> > > > > > >> > >> specific user documentation that isn't appropriate for the
> > > BPG.
> > > > > > >> > >> And resolving the LICENSE issue around the public display of
> > > > > that GoDoc.
> > > > > > >> > >>
> > > > > > >> > >> If this proposal is accepted by a binding vote, I will
> > > > > incorporate the SDK into the release process, and remove the
> > > "experimental"
> > > > > > >> > >> language around the SDK. This largely entails updating the
> > > > > release scripts to also build and publish the Go SDK Docker 
> > > > > containers.
> > > > > > >> > >> As for releasing the code, we're technically already doing 
> > > > > > >> > >> so
> > > > > whenever we tag a release branch [4].
> > > > > > >> > >>
> > > > > > >> > >> The clearest signal to the Go community however will be
> > > > > migrating the SDK to use Go Modules for dependency version control,
> > > > > > >> > >> which Daniel is planning on working on after his Kafka task.
> > > > > This will put our repo infrastructure, SDK contributors, and users
> > > > > > >> > >> on the same footing when it comes to dependency management.
> > > It
> > > > > will remove the "+incompatible" tags one sees on the
> > > > > > >> > >> pkg.go.dev list at [4].
> > > > > > >> > >>
> > > > > > >> > >> I'm very happy to answer any questions you might have about
> > > the
> > > > > SDK, and provide additional links as needed. I intentionally avoided
> > > > > > >> > >> a link barrage in this email, as they can distract from the
> > > > > point: The SDK is ready for folks to use it, we need to tell them that
> > > they
> > > > > can
> > > > > > >> > >> rather than they shouldn't.
> > > > > > >> > >>
> > > > > > >> > >> Robert Burke
> > > > > > > >> > >> De facto Beam Go TL
> > > > > > >> > >>
> > > > > > >> > >> [0] https://s.apache.org/beam-go-sdk-design-rfc
> > > > > > >> > >> [1]
> > > > >
> > > https://cwiki.apache.org/confluence/display/BEAM/Supporting+Streaming+in+the+Go+SDK
> > > > > > >> > >> [2] https://cwiki.apache.org/confluence/display/BEAM/Go+Tips
> > > > > > >> > >> [3]
> > > > >
> > > https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
> > > > > (SDK Audit sheet)
> > > > > > >> > >> [4]
> > > > >
> > > https://pkg.go.dev/github.com/apache/beam/sdks/go/pkg/beam?tab=versions
> > > > > > >> >
> > > > >
> > > >
> > >
> > 
> 
