This is terrific! Is this thread asking for the community's opinion on whether it should be accepted? Assuming the decision is made on the Google side to contribute it, big +1 from me to include it next to the other runners.
On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <[email protected]> wrote:

> At Google we have been importing the Apache Beam code base and integrating
> it with the Google portion of the codebase that supports the Dataflow
> worker. This process is painful as we regularly are making breaking API
> changes to support libraries related to running portable pipelines (and
> sometimes in other places as well). This has made it sometimes difficult
> for PR changes to make changes without either breaking something for Google
> or waiting for a Googler to make the change internally (e.g. dependency
> updates).
>
> This code is very similar to the other integrations that exist for runners
> such as Flink/Spark/Apex/Samza. It is an adaption layer that sits on top of
> an execution engine. There is no super secret awesome stuff as this code
> was already publicly visible in the past when it was part of the Google
> Cloud Dataflow github repo[1].
>
> Process wise the code will need to get approval from Google to be donated
> and for it to go through the code donation process but before we attempt to
> do that, I was wondering whether the community would object to adding this
> code to the master branch?
>
> The up side is that people can make breaking changes and fix it for all
> runners. It will also help Googlers contribute more to the portability
> story as it will remove the burden of doing the code import (wasted time)
> and it will allow people to develop in master (can have the whole project
> loaded in a single IDE).
>
> The downsides are that this will represent more code and unit tests to
> support.
>
> 1:
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
