+1

On Thu, Sep 13, 2018 at 12:53 PM Romain Manni-Bucau <[email protected]>
wrote:

> If usable by itself without google karma (can you use a worker without
> Dataflow itself?), it sounds awesome; otherwise it sounds weird IMHO.
>
> On Thu, Sep 13, 2018, 21:36 Kai Jiang <[email protected]> wrote:
>
>> +1 (non googler)
>>
>> big help for transparency and for future runners.
>>
>> Best,
>> Kai
>>
>> On Thu, Sep 13, 2018, 11:45 Xinyu Liu <[email protected]> wrote:
>>
>>> Big +1 (non-googler).
>>>
>>> From Samza Runner's perspective, we are very happy to see dataflow
>>> worker code so we can learn and compete :).
>>>
>>> Thanks,
>>> Xinyu
>>>
>>> On Thu, Sep 13, 2018 at 11:34 AM Suneel Marthi <[email protected]>
>>> wrote:
>>>
>>>> +1 (non-googler)
>>>>
>>>> This is a great 👍 move
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Sep 13, 2018, at 2:25 PM, Tim Robertson <[email protected]>
>>>> wrote:
>>>>
>>>> +1 (non googler)
>>>> It sounds pragmatic, helps with transparency should issues arise, and
>>>> enables more people to fix them.
>>>>
>>>>
>>>> On Thu, Sep 13, 2018 at 8:15 PM Dan Halperin <[email protected]>
>>>> wrote:
>>>>
>>>>> From my perspective as a (non-Google) community member, huge +1.
>>>>>
>>>>> I don't see anything bad for the community about open sourcing more of
>>>>> the probably-most-used runner. While the DirectRunner is probably still the
>>>>> most referential implementation of Beam, can't hurt to see more working
>>>>> code. Other runners or runner implementors can refer to this code if they
>>>>> want, and ignore it if they don't.
>>>>>
>>>>> In terms of having more code and tests to support, well, that's par
>>>>> for the course. Will this change make the things that need to be done to
>>>>> support them more obvious? (E.g., "this PR is blocked because someone at
>>>>> Google on Dataflow team has to fix something" vs "this PR is blocked
>>>>> because the Apache Beam code in foo/bar/baz is failing, and anyone who can
>>>>> see the code can fix it"). The latter seems like a clear win for the
>>>>> community.
>>>>>
>>>>> (As long as the code donation is handled properly, but that's
>>>>> completely orthogonal and I have no reason to think it wouldn't be.)
>>>>>
>>>>> Thanks,
>>>>> Dan
>>>>>
>>>>> On Thu, Sep 13, 2018 at 11:06 AM Lukasz Cwik <[email protected]> wrote:
>>>>>
>>>>>> Yes, I'm specifically asking the community for opinions as to whether
>>>>>> it should be accepted or not.
>>>>>>
>>>>>> On Thu, Sep 13, 2018 at 10:51 AM Raghu Angadi <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> This is terrific!
>>>>>>>
>>>>>>> Is this thread asking the community whether it should be accepted?
>>>>>>> Assuming the Google-side decision is made to contribute, big +1
>>>>>>> from me to include it next to other runners.
>>>>>>>
>>>>>>> On Thu, Sep 13, 2018 at 10:38 AM Lukasz Cwik <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> At Google we have been importing the Apache Beam code base and
>>>>>>>> integrating it with the Google portion of the codebase that supports the
>>>>>>>> Dataflow worker. This process is painful as we regularly are making
>>>>>>>> breaking API changes to support libraries related to running portable
>>>>>>>> pipelines (and sometimes in other places as well). This has sometimes
>>>>>>>> made it difficult for PRs to make changes without either breaking
>>>>>>>> something for Google or waiting for a Googler to make the change
>>>>>>>> internally (e.g. dependency updates).
>>>>>>>>
>>>>>>>> This code is very similar to the other integrations that exist for
>>>>>>>> runners such as Flink/Spark/Apex/Samza. It is an adaptation layer that
>>>>>>>> sits on top of an execution engine. There is no super secret awesome
>>>>>>>> stuff as this code was already publicly visible in the past when it was
>>>>>>>> part of the Google Cloud Dataflow github repo[1].
>>>>>>>>
>>>>>>>> Process-wise, the code will need to get approval from Google to be
>>>>>>>> donated and go through the code donation process, but before we
>>>>>>>> attempt to do that, I was wondering whether the community would object
>>>>>>>> to adding this code to the master branch?
>>>>>>>>
>>>>>>>> The upside is that people can make breaking changes and fix them for
>>>>>>>> all runners. It will also help Googlers contribute more to the
>>>>>>>> portability story as it will remove the burden of doing the code import
>>>>>>>> (wasted time) and it will allow people to develop in master (can have
>>>>>>>> the whole project loaded in a single IDE).
>>>>>>>>
>>>>>>>> The downside is that this will represent more code and unit tests
>>>>>>>> to support.
>>>>>>>>
>>>>>>>> 1:
>>>>>>>> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/hotfix_v1.2/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker
>>>>>>>>
>>>>>>>
