Re: Contributing Twister2 runner to Apache Beam

2020-04-03 Thread Pulasthi Supun Wickramasinghe
Hi Ismaël, Thanks for the update. No problem at all; please take your time and let me know if my assistance is needed. The virus has affected everyone's timetables. I hope you are safe. Best Regards, Pulasthi On Fri, Apr 3, 2020 at 12:14 PM Ismaël Mejía wrote: > Hello Pulasthi, > > Please

Re: Contributing Twister2 runner to Apache Beam

2020-04-03 Thread Ismaël Mejía
Hello Pulasthi, Please excuse my delay. I have had probably 1/3 of my usual available time since the coronavirus lockdown, so I have not advanced as expected. I hope to catch up rapidly and ping you. Our expected target of merging it before the 2.21.0 release seems to be hard to meet at this

Re: Contributing Twister2 runner to Apache Beam

2020-04-02 Thread Pulasthi Supun Wickramasinghe
Hi Ismaël, Did you get some free time to perform a code review on the pull request? Best Regards, Pulasthi On Tue, Mar 10, 2020 at 3:30 PM Luke Cwik wrote: > I have to disagree. Allowing for runners within the Apache Beam repo and > SDKs that reach into the implementation details of each other

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
I have to disagree. Allowing for runners within the Apache Beam repo and SDKs that reach into the implementation details of each other creates usability, feature development, maintenance and complexity problems. The usability issue comes from our public core facing APIs exposing methods that runners

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Kenneth Knowles
I do support all the efforts to get Dataflow, Flink, and Spark to (3) (Fn API). But I disagree with it as a requirement; the whole point of PTransforms with URNs is that if the runner can figure out how to execute it according to semantics, then it is fine. A runner meets (1) and (2) but can only

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
I would like to move away from having runners access APIs that are related to pipeline construction and other internal SDK APIs and I would like for SDKs to not inspect internal runner APIs. This would enable the community to improve each independently without needing to fix the world all the time

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Kenneth Knowles
There are a lot of different meanings to "portable runner". Here are some: (1) A runner that accepts a pipeline proto and either runs it or says it cannot run it (2) A runner that accepts jobs via the job management APIs (3) A runner that executes UDFs via the Fn API (4) A runner that can execute

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
+1 On Tue, Mar 10, 2020 at 12:59 AM Alex Van Boxel wrote: > One last thing, for any runner after this one... wouldn't it be a good > acceptance criteria to only accept portable implementations anymore? > > _/ > _/ Alex Van Boxel > > > On Mon, Mar 9, 2020 at 10:42 PM Ismaël Mejía wrote: > >>

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Alex Van Boxel
One last thing, for any runner after this one... wouldn't it be a good acceptance criteria to only accept portable implementations anymore? _/ _/ Alex Van Boxel On Mon, Mar 9, 2020 at 10:42 PM Ismaël Mejía wrote: > Good points Kenn. I think we mostly agree on what has been discussed in >

Re: Contributing Twister2 runner to Apache Beam

2020-03-09 Thread Ismaël Mejía
Good points, Kenn. I think we mostly agree on what has been discussed in this thread (the pros/cons of having runners in our repository), but this is probably not the best moment in time to change any policy in that respect. So if nobody objects, I think we can proceed. I am OOO this week so with less

Re: Contributing Twister2 runner to Apache Beam

2020-03-08 Thread Kenneth Knowles
I haven't heard anyone suggest that we need a vote. I haven't heard anyone object to this being merged to master. Some time ago, we mostly decided to favor master instead of branches, because it is so much smoother for contributors and users. So I am poking this thread one last time and otherwise

Re: Contributing Twister2 runner to Apache Beam

2020-03-06 Thread Pulasthi Supun Wickramasinghe
I understand that the discussion is on a broader level than the Twister2 runner. From my experience developing the runner, the main advantage of being inside the Beam project was the easy access to the wide range of tests and other core/utility code, as Kyle pointed out. Unmerging runners that

Re: Contributing Twister2 runner to Apache Beam

2020-03-05 Thread Robert Bradshaw
I think we will get to a point where it makes sense for runners to live in their own repositories, with their own release cadence, but we're not at that point yet. One prerequisite is a stable API; we're closing in on that with the portability protos, but many (Java) runners actually share the

Re: Contributing Twister2 runner to Apache Beam

2020-03-05 Thread Kenneth Knowles
I agree with both of you, mostly :-) The monorepo approach doesn't work/scale well for shipped libraries (name a Google library that silently just works and never causes any dependency problems) and the pain we feel has been constant and increasing, but I don't think we are at the breaking point.

Re: Contributing Twister2 runner to Apache Beam

2020-03-04 Thread Kyle Weaver
> Should runners, current and future, be in the same repository as Beam > core? In the distant past, runners lived in their own repositories, and then were donated to Beam. But Beam's current uber-repo setup allows a lot of convenience. For example, a ton of code (including core functionality and

Re: Contributing Twister2 runner to Apache Beam

2020-03-04 Thread Elliotte Rusty Harold
Generic question without commenting on Twister2 specifically: Should runners, current and future, be in the same repository as Beam core? Can or should they be completely separate products with their own release cycles? Generally, loose coupling leads to more maintainable, reliable projects.

Re: Contributing Twister2 runner to Apache Beam

2020-03-04 Thread Pulasthi Supun Wickramasinghe
Hi, I believe the pull request is pretty complete now with the help of Ismaël. Kenn, would you be able to take a look at it and suggest any changes if needed? The build checks and validation tests are passing at the moment. I will start working on the documentation that you mentioned in an

Contributing Twister2 runner to Apache Beam

2020-02-18 Thread Pulasthi Supun Wickramasinghe
Hi All, I have created the initial pull request [1] to contribute the Twister2 Beam runner to the Apache Beam codebase. More information on Twister2 can be found here [2] and the Twister2 codebase is available here [3]. At the moment only batch mode is supported in the runner, but we are planning