Thanks to Kenn Knowles for picking up the review here. The PR is merged, so the new interface ProcessFunction and abstract class InferableFunction class (both of which declare `throws Exception`) are now available in master.
On Fri, Dec 14, 2018 at 12:18 PM Jeff Klukas <[email protected]> wrote: > Checking in on this thread. Anybody interested to review > https://github.com/apache/beam/pull/7160 ? > > There could be some discussion on whether these names are the right names, > but otherwise the only potential objection I see here is the additional > burden on developers to understand the differences between the existing > (SerializableFunction and SimpleFunction) and the new (ProcessFunction and > InferableFunction). I originally planned on marking the existing ones as > deprecated, but decided there are contexts where disallowing checked > exceptions probably makes sense. So we now have 4 objects for developers to > be familiar with rather than 2. > > On Fri, Dec 7, 2018 at 6:54 AM Robert Bradshaw <[email protected]> > wrote: > >> How should we move forward on this? The idea looks good, and even >> comes with a PR to review. Any objections to the names? >> On Wed, Dec 5, 2018 at 12:48 PM Jeff Klukas <[email protected]> wrote: >> > >> > Reminder that I'm looking for review on >> https://github.com/apache/beam/pull/7160 >> > >> > On Thu, Nov 29, 2018, 11:48 AM Jeff Klukas <[email protected] wrote: >> >> >> >> I created a JIRA and a PR for this: >> >> >> >> https://issues.apache.org/jira/browse/BEAM-6150 >> >> https://github.com/apache/beam/pull/7160 >> >> >> >> On naming, I'm proposing that SerializableFunction extend >> ProcessFunction (since this new superinterface is particularly appropriate >> for user code executed inside a ProcessElement method) and that >> SimpleFunction extend InferableFunction (since type information and coder >> inference are what distinguish this from ProcessFunction). >> >> >> >> We originally discussed deprecating SerializableFunction and >> SimpleFunction in favor of the new types, but there appear to be two fairly >> separate use cases for SerializableFunction. It's either defining user code >> that will be executed in a DoFn, in which case I think we always want to >> prefer the new interface that allows declared exceptions. But it's also >> used where the code is to be executed as part of pipeline construction, in >> which case it may be reasonable to want to restrict checked exceptions. In >> any case, deprecating SerializableFunction and SimpleFunction can be >> discussed further in the future. >> >> >> >> >> >> On Wed, Nov 28, 2018 at 9:53 PM Kenneth Knowles <[email protected]> >> wrote: >> >>> >> >>> Nice! A clean solution and an opportunity to bikeshed on names. This >> has everything I love. >> >>> >> >>> Kenn >> >>> >> >>> On Wed, Nov 28, 2018 at 6:43 PM Jeff Klukas <[email protected]> >> wrote: >> >>>> >> >>>> It looks like we can add make the new interface a superinterface for >> the existing SerializableFunction while maintaining binary compatibility >> [0]. >> >>>> >> >>>> We'd have: >> >>>> >> >>>> public interface NewSerializableFunction<InputT, OutputT> extends >> Serializable { >> >>>> OutputT apply(InputT input) throws Exception; >> >>>> } >> >>>> >> >>>> and then modify SerializableFunction to inherit from it: >> >>>> >> >>>> public interface SerializableFunction<InputT, OutputT> extends >> NewSerializableFunction<InputT, OutputT>, Serializable { >> >>>> @Override >> >>>> OutputT apply(InputT input); >> >>>> } >> >>>> >> >>>> >> >>>> IIUC, we can then more or less replace all references to >> SerializableFunction with NewSerializableFunction across the beam codebase >> without having to introduce any new overrides. I'm working on a proof of >> concept now. >> >>>> >> >>>> It's not clear what the actual names for NewSerializableFunction and >> NewSimpleFunction should be. >> >>>> >> >>>> [0] >> https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.4.4 >> >>>> >> >>>> >> >>>> On Mon, Nov 26, 2018 at 9:54 PM Thomas Weise <[email protected]> wrote: >> >>>>> >> >>>>> +1 for introducing the new interface now and deprecating the old >> one. The major version change then provides the opportunity to remove >> deprecated code. >> >>>>> >> >>>>> >> >>>>> On Mon, Nov 26, 2018 at 10:09 AM Lukasz Cwik <[email protected]> >> wrote: >> >>>>>> >> >>>>>> Before 3.0 we will still want to introduce this giving time for >> people to migrate, would it make sense to do that now and deprecate the >> alternatives that it replaces? >> >>>>>> >> >>>>>> On Mon, Nov 26, 2018 at 5:59 AM Jeff Klukas <[email protected]> >> wrote: >> >>>>>>> >> >>>>>>> Picking up this thread again. Based on the feedback from Kenn, >> Reuven, and Romain, it sounds like there's no objection to the idea of >> SimpleFunction and SerializableFunction declaring that they throw >> Exception. So the discussion at this point is about whether there's an >> acceptable way to introduce that change. >> >>>>>>> >> >>>>>>> IIUC correctly, Kenn was suggesting that we need to ensure >> backwards compatibility for existing user code both at runtime and >> recompile, which means we can't simply add the declaration to the existing >> interfaces, since that would cause errors at compile time for user code >> directly invoking SerializableFunction instances. >> >>>>>>> >> >>>>>>> I don't see an obvious way that introducing a new functional >> interface would help without littering the API with more variants (it's >> already a bit confusing that i.e. MapElements has multiple via() methods to >> support three different function interfaces). >> >>>>>>> >> >>>>>>> Perhaps this kind of cleanup is best left for Beam 3.0? >> >>>>>>> >> >>>>>>> On Mon, Oct 15, 2018 at 6:51 PM Reuven Lax <[email protected]> >> wrote: >> >>>>>>>> >> >>>>>>>> Compilation compatibility is a big part of what Beam aims to >> provide with its guarantees. Romain makes a good point that most users are >> not invoking SeralizableFunctions themselves (they are usually invoked >> inside of Beam classes such as MapElements), however I suspect some users >> do these things. >> >>>>>>>> >> >>>>>>>> On Sun, Oct 14, 2018 at 2:38 PM Kenneth Knowles <[email protected]> >> wrote: >> >>>>>>>>> >> >>>>>>>>> Romain has brought up two good aspects of backwards >> compatibility: >> >>>>>>>>> >> >>>>>>>>> - runtime replacement vs recompile >> >>>>>>>>> - consumer (covariant) vs producer (contravariant, such as >> interfaces a user implements) >> >>>>>>>>> >> >>>>>>>>> In this case, I think the best shoice is covariant and >> contravariant (invariant) backwards compat including recompile compat. But >> we shouldn't assume there is one obvious definition of "backwards >> compatibility". >> >>>>>>>>> >> >>>>>>>>> Does it help to introduce a new functional interface? >> >>>>>>>>> >> >>>>>>>>> Kenn >> >>>>>>>>> >> >>>>>>>>> On Sun, Oct 14, 2018 at 6:25 AM Romain Manni-Bucau < >> [email protected]> wrote: >> >>>>>>>>>> >> >>>>>>>>>> Beam does not catch Exception for function usage so it will >> have to do it in some places. >> >>>>>>>>>> >> >>>>>>>>>> A user does not have to execute the function so worse case it >> impacts tests and in any case the most important: it does not impact the >> user until it recompiles the code (= runtime is not impacted). >> >>>>>>>>>> >> >>>>>>>>>> Romain Manni-Bucau >> >>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> Le dim. 14 oct. 2018 à 15:19, Reuven Lax <[email protected]> a >> écrit : >> >>>>>>>>>>> >> >>>>>>>>>>> What in Beam codebase is not ready, and how do we know that >> user code doesn't have the same issue? >> >>>>>>>>>>> >> >>>>>>>>>>> On Sun, Oct 14, 2018 at 6:04 AM Romain Manni-Bucau < >> [email protected]> wrote: >> >>>>>>>>>>>> >> >>>>>>>>>>>> Hmm, tested also and it works until something in the >> codeflow does not respect that constraint - see >> com.sun.tools.javac.comp.Flow.FlowAnalyzer#errorUncaught. In other words >> beam codebase is not ready for that and will make it fail but it is ok >> cause we can fix it but user code does not rely on that so it is fine to >> update it normally. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Romain Manni-Bucau >> >>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> Le dim. 14 oct. 2018 à 14:39, Reuven Lax <[email protected]> >> a écrit : >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Just tried it, doesn't appear to work :( >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> error: unreported exception Exception; must be caught or >> declared to be thrown >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:55 AM Romain Manni-Bucau < >> [email protected]> wrote: >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> not with java>=8 AFAIK >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Romain Manni-Bucau >> >>>>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Le dim. 14 oct. 2018 à 10:49, Reuven Lax <[email protected]> >> a écrit : >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> But it means that other functions that call >> SerializableFunctions must now declare exceptions, right? If yes, this is >> incompatible. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:37 AM Romain Manni-Bucau < >> [email protected]> wrote: >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> No, only parameter types and return type is used to >> lookup methods. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> Romain Manni-Bucau >> >>>>>>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | >> Book >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> Le dim. 14 oct. 2018 à 09:45, Reuven Lax < >> [email protected]> a écrit : >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> I've run into this problem before as well. Doesn't >> changing the signature involve a backwards-incompatible change though? >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> On Wed, Oct 3, 2018 at 5:11 PM Jeff Klukas < >> [email protected]> wrote: >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> I'm working on >> https://issues.apache.org/jira/browse/BEAM-5638 to add exception >> handling options to single message transforms in the Java SDK. >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> MapElements' via() method is overloaded to accept >> either a SimpleFunction, a SerializableFunction, or a Contextful, all of >> which are ultimately stored as a Contextful where the mapping functionis >> expected to have signature: >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> OutputT apply(InputT element, Context c) throws >> Exception; >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> So Contextful.Fn allows throwing checked exceptions, >> but neither SerializableFunction nor SimpleFunction do. The user-provided >> function has to satisfy the more restrictive signature: >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> OutputT apply(InputT input); >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> Is there background about why we allow arbitrary >> checked exceptions to be thrown in one case but not the other two? Could we >> consider expanding SerializableFunction and SimpleFunction to the >> following?: >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> OutputT apply(InputT input) throws Exception; >> >>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>> This would, for example, simplify the implementation >> of ParseJsons and AsJsons, where we have to catch an IOException in >> MapElements#via only to rethrow as RuntimeException. >> >
