+1 (non-googler)

This is a great 👍 move

> On Sep 13, 2018, at 2:25 PM, Tim Robertson <[email protected]> wrote:
> 
> +1 (non googler)
> It sounds pragmatic, helps with transparency should issues arise, and enables 
> more people to fix them. 
>  
> 
>> On Thu, Sep 13, 2018 at 8:15 PM Dan Halperin <[email protected]> wrote:
>> From my perspective as a (non-Google) community member, huge +1.
>> 
>> I don't see anything bad for the community about open sourcing more of the 
>> probably-most-used runner. While the DirectRunner is probably still the most 
>> referential implementation of Beam, it can't hurt to see more working code. 
>> Other runners or runner implementors can refer to this code if they want, 
>> and ignore it if they don't.
>> 
>> In terms of having more code and tests to support, well, that's par for the 
>> course. Will this change make the things that need to be done to support 
>> them more obvious? (E.g., "this PR is blocked because someone at Google on 
>> Dataflow team has to fix something" vs "this PR is blocked because the 
>> Apache Beam code in foo/bar/baz is failing, and anyone who can see the code 
>> can fix it"). The latter seems like a clear win for the community.
>> 
>> (As long as the code donation is handled properly, but that's completely 
>> orthogonal and I have no reason to think it wouldn't be.)
>> 
>> Thanks,
>> Dan
>> 
>>> On Thu, Sep 13, 2018 at 11:06 AM Lukasz Cwik <[email protected]> wrote:
>>> Yes, I'm specifically asking the community for opinions as to whether it 
>>> should be accepted or not.
>>> 
>>>> On Thu, Sep 13, 2018 at 10:51 AM Raghu Angadi <[email protected]> wrote:
>>>> This is terrific! 
>>>> 
>>>> Is this thread asking for opinions from the community about whether it 
>>>> should be accepted? Assuming the Google-side decision is made to 
>>>> contribute, big +1 from me to include it next to the other runners. 
>>>> 
>>>>> On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <[email protected]> wrote:
>>>>> At Google we have been importing the Apache Beam code base and 
>>>>> integrating it with the Google portion of the codebase that supports the 
>>>>> Dataflow worker. This process is painful as we are regularly making 
>>>>> breaking API changes to support libraries related to running portable 
>>>>> pipelines (and sometimes in other places as well). This has sometimes 
>>>>> made it difficult for PRs to make changes without either breaking 
>>>>> something for Google or waiting for a Googler to make the change 
>>>>> internally (e.g. dependency updates).
>>>>> 
>>>>> This code is very similar to the other integrations that exist for 
>>>>> runners such as Flink/Spark/Apex/Samza. It is an adaptation layer that sits 
>>>>> on top of an execution engine. There is no super secret awesome stuff as 
>>>>> this code was already publicly visible in the past when it was part of 
>>>>> the Google Cloud Dataflow github repo[1].
>>>>> 
>>>>> Process-wise, the code will need approval from Google to be donated and 
>>>>> will need to go through the code donation process, but before we attempt 
>>>>> that, I was wondering whether the community would object to adding this 
>>>>> code to the master branch?
>>>>> 
>>>>> The upside is that people can make breaking changes and fix them for all 
>>>>> runners. It will also help Googlers contribute more to the portability 
>>>>> story as it will remove the burden of doing the code import (wasted time) 
>>>>> and it will allow people to develop in master (can have the whole project 
>>>>> loaded in a single IDE).
>>>>> 
>>>>> The downside is that this will represent more code and unit tests to 
>>>>> support.
>>>>> 
>>>>> 1: 
>>>>> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
