*From: *Maximilian Michels <m...@apache.org> *Date: *Thu, May 9, 2019 at 9:21 AM *To: * <dev@beam.apache.org>
Thanks for sharing your ideas for load testing! > > > According to other contributors knowledge/experience: I noticed that > streaming with KafkaIO is currently supported by wrapping the > ExternalTransform in Python SDK. Do you think that streaming pipelines will > "just work" with the current state of portability if I do the same for > UnboundedSyntheticSource or is there something else missing? > > Basically yes, but it requires a bit more effort than just wrapping > about ExternalTransform. You need to provide an ExternalTransformBuilder > for the transform to be configured externally. > > In portability UnboundedSources can only be supported via SDF. To still > be able to use legacy IO which uses UnboundedSource the Runner has to > supply this capability (which the Flink Runner supports). This will > likely go away if we have an UnboundedSource SDF Wrapper :) > So, FlinkRunner has some sort of special support for executing UnboundedSource via the runner in the portable world ? I see a transform override for bounded sources in PortableRunner [1] but nothing for unbounded sources. Agree, that we cannot properly support cross-language unbounded sources till we have SDF and a proper unbounded source to SDF wrapper. [1] https://github.com/apache/beam/blob/master/runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java#L140 > > As for the stability of the expansion protocol, I think it's relatively > stable and won't changed fundamentally. > > Cheers, > Max > > On 09.05.19 16:21, Łukasz Gajowy wrote: > > My recommendation is that we make sure the protocol is stable and > > implemented on the Python SDK side before starting the Go SDK side, > > since that work is already in progress. > > > > +1 This is exactly the roadmap that I had in mind - start with > > externalizing and using the synthetic sources in Python SDK and then > > proceed with Go. Still worth knowing what's going on there so that's why > > I asked. :) > > > > Thanks, > > Łukasz > > > > czw., 9 maj 2019 o 16:03 Robert Burke <rob...@frantil.com > > <mailto:rob...@frantil.com>> napisał(a): > > > > Currently the Go SDK doesn't have cross Language support > > implemented. My recommendation is that we make sure the protocol is > > stable and implemented on the Python SDK side before starting the Go > > SDK side, since that work is already in progress. > > > > The relevant state of the Go SDK: > > * beam.External exists, for specifying go transforms. (See the Go > > SDK's PubSubIO for an example) > > * the generated go code for the protos, including the Expansions > > service API was refreshed last week. Meaning the work isn't blocked > > on that. > > > > In principle, the work would be to > > * ensure that the SDK side of job submission connects and looks up > > relevant transforms against the Expansion service if needed, and > > does the appropriate pipeline graph surgery. > > *This may be something that's handled as some kind of hook and > > registration process for generally handling external transforms SDK > > side. > > * Have some kind of external transform to specify and configure on > > the Go side. > > > > Most of this can be ported from the Python implementation once it's > > stabilized. > > > > As with all my recommendations on how to do things with the Go SDK, > > feel free to ignore it and forge ahead. I look forward to someone > > tackling this, whenever it happens! > > > > Your friendly neighborhood distributed gopher wrangler, > > Robert Burke > > > > Related: > > PR 8531 [1] begins adding automates testing of the Go SDK against > > Flink, which should assist with ensuring this eventual work keeps > > working. > > > > [1]: https://github.com/apache/beam/pull/8531 > > > > On Thu, May 9, 2019, 6:32 AM Łukasz Gajowy <lgaj...@apache.org > > <mailto:lgaj...@apache.org>> wrote: > > > > Hi, > > > > part of our work that needs to be done to create tests for Core > > Apache Beam operations is to enable both batch and streaming > > testing scenarios in all SDKs (not only Java, so lot's of > > portability usage here). I gathered some thoughts on how (I > > think) this could be done at the current state of Beam: > > > > https://s.apache.org/portable-load-tests > > > > I added some questions at the end of the doc but I will paste > > them here too for visibility: > > > > * What is the status of Cross Language Support for Go SDK? Is > > it non-trivial to enable such support (if it doesn't exist > yet)? > > * According to other contributors knowledge/experience: I > > noticed that streaming with KafkaIO is currently supported > > by wrapping the ExternalTransform in Python SDK. Do you > > think that streaming pipelines will "just work" with the > > current state of portability if I do the same for > > UnboundedSyntheticSource or is there something else missing? > > > > BTW: great to see Cross Language Support happening. Thanks for > > doing this! :) > > > > Thanks, > > Łukasz > > > > > > > > >