Hi Robert,

Thanks for your reply, and sorry for the delay. I forgot to subscribe to the dev list too, my fault. :(

We are really excited to know that we can write the built-in I/O transforms using xlang and Splittable DoFns.

About our use cases:
- A large part of the company uses Go and has a lot of experience with the language. If we can build the pipeline in Go, code reuse becomes much easier. Today we abstract the transformations into a .so library and integrate it with Java using JNA;
- We currently use Dataflow with Scala to do ETL (D-1 and streaming) for building search databases;
- We often work on massive data migrations from one database to another;

Leonardo Reis

On 2022/01/12 23:31:26 Robert Burke wrote:
> Hello Leonardo!
>
> I'm happy to hear of your interest in the Go SDK! The SDK is recently out
> of experimental, but is not yet officially supported by Dataflow. (It
> works, and we test the SDK on Dataflow, but user support is at the
> discretion of the Dataflow side at this time.)
>
> The short answer is yes, these transforms can be available in future
> releases.
>
> The longer answer is the following:
>
> There are still some gaps between the Go SDK and Java and Python SDKs. For
> some of these we use Cross Language Transforms, which lets pipelines insert
> transforms from other SDKs into their Go pipelines.
> For example, this allows Go pipelines to make use of the Java KafkaIO
> transform. See
> https://beam.apache.org/documentation/programming-guide/#1323-using-cross-language-transforms-in-a-go-pipeline
> which will be kept up to date with the latest state.
>
> As described at that link, automatic startup of the expansion service isn't
> available yet, but it's almost there. The overall use is being worked on
> presently, and should start to become available by v2.37.0 .
>
> The same mechanism will be used to add BigTable support. I don't know about
> Elastic, but if Java has it, Go and Python can wrap it.
>
> If you're keen on contributing a solution for yourself, if you follow the
> example set by Kafka, we would welcome contributions of those wrappers for
> the Go SDK too.
>
> Other than cross language, there's potentially the option to write a native
> Go transform.
> However, the Go SDK doesn't support native unbounded source
> transforms. It requires a feature called DoFn Self Checkpointing, that's
> not yet implemented in the SDK. It is planned though. This doesn't prevent
> streaming IOs from cross language being used though.
>
> Scalable Bounded transforms can be written using Splittable DoFns however.
> https://beam.apache.org/documentation/programming-guide/#splittable-dofns
>
> Every version, the SDK gets closer to being fully featured with the Beam
> Model. It's exciting!
>
> I'd love to hear more about your use case, so we can see if the Go SDK can
> get you there.
>
> Robert Burke
> Beam Go Busybody
>
>
> On Wed, Jan 12, 2022, 1:11 PM Leonardo Reis wrote:
>
> > Hello everyone, how are you?
> >
> > My name is Leonardo and I'm from Brazil. In my current project, we are
> > thinking of implementing Apache Beam with Go SDK to run our jobs with
> > Dataflow runner. But in our architecture we have some external dependencies
> > like Kafka Streams, Bigtable and Elastic and we didn't find any
> > transformation for them.
> >
> > Will these IO transformations exist in future releases?
> >
> > Do you have any suggestions on how we can handle these dependencies using
> > the Go SDK?
> >
> > Best regards,
> > Leonardo Reis
> >
> > Data Engineer
> > (16) 3509-5555
> > [email protected]
> > arquivei.com.br
