+1 to removing it overall. We already removed all the tests for the ULR, and they were red when we did. Removing the code itself is a natural next step.
It is a valid point that we should have a way to run portable pipelines locally with the Python ULR. I don't believe that a Java person working on the Java SDK should actually need to debug the worker in most cases. If we end up in a situation where SDK devs have to debug the runner regularly, we should improve runner logging and error reporting. This can be a great exercise in improving testability, as well as a good requirement if we eventually want to split the mono-repo.

--Mikhail

On Fri, Apr 26, 2019 at 12:36 PM Boyuan Zhang <boyu...@google.com> wrote:

> Another concern from me is, will it be difficult for a Java person (who
> developing Java SDK) to figure out what's going on in Python ULR when
> debugging?
>
> On Fri, Apr 26, 2019 at 12:05 PM Kenneth Knowles <k...@apache.org> wrote:
>
>> Good points. Distilling one single item: can I, today, run the Java SDK's
>> suite of ValidatesRunner command against the Python ULR + Java SDK Harness,
>> in a single Gradle command?
>>
>> Kenn
>>
>> On Fri, Apr 26, 2019 at 9:54 AM Anton Kedin <ke...@google.com> wrote:
>>
>>> If there is no plans to invest in ULR then it makes sense to remove it.
>>>
>>> Going forward, however, I think we should try to document the higher
>>> level approach we're taking with runners (and portability) now that we have
>>> something working and can reflect on it. For example, couple of things that
>>> are not 100% clear to me:
>>> - if the focus is on python runner for portability efforts, how does
>>> java SDK (and other languages) tie into this? E.g. how do we run, test,
>>> measure, and develop things (pipelines, aspects of the SDK, runner);
>>> - what's our approach to developing new features, should we make sure
>>> python runner supports them as early as possible (e.g. schemas and SQL)?
>>> - java DirectRunner is still there:
>>>   - it is still the primary tool for java SDK development purposes,
>>>     and as Kenn mentioned in the linked threads it adds value by making sure
>>>     users don't rely on implementation details of specific runners. Do we have
>>>     a similar story for portable scenarios?
>>>   - I assume that extra validations in the DirectRunner have impact on
>>>     performance in various ways (potentially non-deterministic). While this
>>>     doesn't matter in some cases, it might do in others. Having a local runner
>>>     that is (better) optimized for execution would probably make more sense for
>>>     perf measurements, integration tests, and maybe even local production jobs.
>>>     Is this something potentially worth looking into?
>>>
>>> Regards,
>>> Anton
>>>
>>>
>>> On Fri, Apr 26, 2019 at 4:41 AM Maximilian Michels <m...@apache.org> wrote:
>>>
>>>> Thanks for following up with this. I have mixed feelings to see the
>>>> portable Java DirectRunner go, but I'm in favor of this change because
>>>> it removes a lot of code that we do not really make use of.
>>>>
>>>> -Max
>>>>
>>>> On 26.04.19 02:58, Kenneth Knowles wrote:
>>>> > Thanks for providing all this background on the PR. It is very easy to
>>>> > see where it came from. Definitely nice to have less code and fewer
>>>> > things that can break. Perhaps lazy consensus is enough.
>>>> >
>>>> > Kenn
>>>> >
>>>> > On Thu, Apr 25, 2019 at 4:01 PM Daniel Oliveira <danolive...@google.com> wrote:
>>>> >
>>>> > Hey everyone,
>>>> >
>>>> > I made a preliminary PR for removing all the Java Reference Runner
>>>> > code (PR-8380 <https://github.com/apache/beam/pull/8380>) since I
>>>> > wanted to see if it could be done easily. It seems to be working
>>>> > fine, so I wanted to open up this discussion to make sure people are
>>>> > still in agreement on getting rid of this code and that people don't
>>>> > have any concerns.
>>>> >
>>>> > For those who need additional context about this, this previous thread
>>>> > <https://lists.apache.org/thread.html/b235f8ee55a737ea399756edd80b1218ed34d3439f7b0ed59bfa8e40@%3Cdev.beam.apache.org%3E>
>>>> > is where we discussed deprecating the Java Reference Runner (in some
>>>> > places it's called the ULR or Universal Local Runner, but it's the
>>>> > same thing). Then there's this thread
>>>> > <https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E>
>>>> > where we discussed removing the code from the repo since it's been
>>>> > deprecated.
>>>> >
>>>> > If no one has any objections to trying to remove the code I'll have
>>>> > someone review the PR I wrote and start a vote to have it merged.
>>>> >
>>>> > Thanks,
>>>> > Daniel Oliveira