Checking in on this thread. Anybody interested to review https://github.com/apache/beam/pull/7160 ?
There could be some discussion on whether these names are the right names, but otherwise the only potential objection I see here is the additional burden on developers to understand the differences between the existing (SerializableFunction and SimpleFunction) and the new (ProcessFunction and InferableFunction). I originally planned on marking the existing ones as deprecated, but decided there are contexts where disallowing checked exceptions probably makes sense. So we now have 4 objects for developers to be familiar with rather than 2. On Fri, Dec 7, 2018 at 6:54 AM Robert Bradshaw <rober...@google.com> wrote: > How should we move forward on this? The idea looks good, and even > comes with a PR to review. Any objections to the names? > On Wed, Dec 5, 2018 at 12:48 PM Jeff Klukas <jklu...@mozilla.com> wrote: > > > > Reminder that I'm looking for review on > https://github.com/apache/beam/pull/7160 > > > > On Thu, Nov 29, 2018, 11:48 AM Jeff Klukas <jklu...@mozilla.com wrote: > >> > >> I created a JIRA and a PR for this: > >> > >> https://issues.apache.org/jira/browse/BEAM-6150 > >> https://github.com/apache/beam/pull/7160 > >> > >> On naming, I'm proposing that SerializableFunction extend > ProcessFunction (since this new superinterface is particularly appropriate > for user code executed inside a ProcessElement method) and that > SimpleFunction extend InferableFunction (since type information and coder > inference are what distinguish this from ProcessFunction). > >> > >> We originally discussed deprecating SerializableFunction and > SimpleFunction in favor of the new types, but there appear to be two fairly > separate use cases for SerializableFunction. It's either defining user code > that will be executed in a DoFn, in which case I think we always want to > prefer the new interface that allows declared exceptions. But it's also > used where the code is to be executed as part of pipeline construction, in > which case it may be reasonable to want to restrict checked exceptions. In > any case, deprecating SerializableFunction and SimpleFunction can be > discussed further in the future. > >> > >> > >> On Wed, Nov 28, 2018 at 9:53 PM Kenneth Knowles <k...@apache.org> > wrote: > >>> > >>> Nice! A clean solution and an opportunity to bikeshed on names. This > has everything I love. > >>> > >>> Kenn > >>> > >>> On Wed, Nov 28, 2018 at 6:43 PM Jeff Klukas <jklu...@mozilla.com> > wrote: > >>>> > >>>> It looks like we can add make the new interface a superinterface for > the existing SerializableFunction while maintaining binary compatibility > [0]. > >>>> > >>>> We'd have: > >>>> > >>>> public interface NewSerializableFunction<InputT, OutputT> extends > Serializable { > >>>> OutputT apply(InputT input) throws Exception; > >>>> } > >>>> > >>>> and then modify SerializableFunction to inherit from it: > >>>> > >>>> public interface SerializableFunction<InputT, OutputT> extends > NewSerializableFunction<InputT, OutputT>, Serializable { > >>>> @Override > >>>> OutputT apply(InputT input); > >>>> } > >>>> > >>>> > >>>> IIUC, we can then more or less replace all references to > SerializableFunction with NewSerializableFunction across the beam codebase > without having to introduce any new overrides. I'm working on a proof of > concept now. > >>>> > >>>> It's not clear what the actual names for NewSerializableFunction and > NewSimpleFunction should be. > >>>> > >>>> [0] > https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.4.4 > >>>> > >>>> > >>>> On Mon, Nov 26, 2018 at 9:54 PM Thomas Weise <t...@apache.org> wrote: > >>>>> > >>>>> +1 for introducing the new interface now and deprecating the old > one. The major version change then provides the opportunity to remove > deprecated code. > >>>>> > >>>>> > >>>>> On Mon, Nov 26, 2018 at 10:09 AM Lukasz Cwik <lc...@google.com> > wrote: > >>>>>> > >>>>>> Before 3.0 we will still want to introduce this giving time for > people to migrate, would it make sense to do that now and deprecate the > alternatives that it replaces? > >>>>>> > >>>>>> On Mon, Nov 26, 2018 at 5:59 AM Jeff Klukas <jklu...@mozilla.com> > wrote: > >>>>>>> > >>>>>>> Picking up this thread again. Based on the feedback from Kenn, > Reuven, and Romain, it sounds like there's no objection to the idea of > SimpleFunction and SerializableFunction declaring that they throw > Exception. So the discussion at this point is about whether there's an > acceptable way to introduce that change. > >>>>>>> > >>>>>>> IIUC correctly, Kenn was suggesting that we need to ensure > backwards compatibility for existing user code both at runtime and > recompile, which means we can't simply add the declaration to the existing > interfaces, since that would cause errors at compile time for user code > directly invoking SerializableFunction instances. > >>>>>>> > >>>>>>> I don't see an obvious way that introducing a new functional > interface would help without littering the API with more variants (it's > already a bit confusing that i.e. MapElements has multiple via() methods to > support three different function interfaces). > >>>>>>> > >>>>>>> Perhaps this kind of cleanup is best left for Beam 3.0? > >>>>>>> > >>>>>>> On Mon, Oct 15, 2018 at 6:51 PM Reuven Lax <re...@google.com> > wrote: > >>>>>>>> > >>>>>>>> Compilation compatibility is a big part of what Beam aims to > provide with its guarantees. Romain makes a good point that most users are > not invoking SeralizableFunctions themselves (they are usually invoked > inside of Beam classes such as MapElements), however I suspect some users > do these things. > >>>>>>>> > >>>>>>>> On Sun, Oct 14, 2018 at 2:38 PM Kenneth Knowles <k...@apache.org> > wrote: > >>>>>>>>> > >>>>>>>>> Romain has brought up two good aspects of backwards > compatibility: > >>>>>>>>> > >>>>>>>>> - runtime replacement vs recompile > >>>>>>>>> - consumer (covariant) vs producer (contravariant, such as > interfaces a user implements) > >>>>>>>>> > >>>>>>>>> In this case, I think the best shoice is covariant and > contravariant (invariant) backwards compat including recompile compat. But > we shouldn't assume there is one obvious definition of "backwards > compatibility". > >>>>>>>>> > >>>>>>>>> Does it help to introduce a new functional interface? > >>>>>>>>> > >>>>>>>>> Kenn > >>>>>>>>> > >>>>>>>>> On Sun, Oct 14, 2018 at 6:25 AM Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >>>>>>>>>> > >>>>>>>>>> Beam does not catch Exception for function usage so it will > have to do it in some places. > >>>>>>>>>> > >>>>>>>>>> A user does not have to execute the function so worse case it > impacts tests and in any case the most important: it does not impact the > user until it recompiles the code (= runtime is not impacted). > >>>>>>>>>> > >>>>>>>>>> Romain Manni-Bucau > >>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Le dim. 14 oct. 2018 à 15:19, Reuven Lax <re...@google.com> a > écrit : > >>>>>>>>>>> > >>>>>>>>>>> What in Beam codebase is not ready, and how do we know that > user code doesn't have the same issue? > >>>>>>>>>>> > >>>>>>>>>>> On Sun, Oct 14, 2018 at 6:04 AM Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> Hmm, tested also and it works until something in the codeflow > does not respect that constraint - see > com.sun.tools.javac.comp.Flow.FlowAnalyzer#errorUncaught. In other words > beam codebase is not ready for that and will make it fail but it is ok > cause we can fix it but user code does not rely on that so it is fine to > update it normally. > >>>>>>>>>>>> > >>>>>>>>>>>> Romain Manni-Bucau > >>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Le dim. 14 oct. 2018 à 14:39, Reuven Lax <re...@google.com> > a écrit : > >>>>>>>>>>>>> > >>>>>>>>>>>>> Just tried it, doesn't appear to work :( > >>>>>>>>>>>>> > >>>>>>>>>>>>> error: unreported exception Exception; must be caught or > declared to be thrown > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:55 AM Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> not with java>=8 AFAIK > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Romain Manni-Bucau > >>>>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Le dim. 14 oct. 2018 à 10:49, Reuven Lax <re...@google.com> > a écrit : > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> But it means that other functions that call > SerializableFunctions must now declare exceptions, right? If yes, this is > incompatible. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:37 AM Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> No, only parameter types and return type is used to > lookup methods. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Romain Manni-Bucau > >>>>>>>>>>>>>>>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Le dim. 14 oct. 2018 à 09:45, Reuven Lax < > re...@google.com> a écrit : > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I've run into this problem before as well. Doesn't > changing the signature involve a backwards-incompatible change though? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Wed, Oct 3, 2018 at 5:11 PM Jeff Klukas < > jklu...@mozilla.com> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I'm working on > https://issues.apache.org/jira/browse/BEAM-5638 to add exception handling > options to single message transforms in the Java SDK. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> MapElements' via() method is overloaded to accept > either a SimpleFunction, a SerializableFunction, or a Contextful, all of > which are ultimately stored as a Contextful where the mapping functionis > expected to have signature: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> OutputT apply(InputT element, Context c) throws > Exception; > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> So Contextful.Fn allows throwing checked exceptions, > but neither SerializableFunction nor SimpleFunction do. The user-provided > function has to satisfy the more restrictive signature: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> OutputT apply(InputT input); > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Is there background about why we allow arbitrary > checked exceptions to be thrown in one case but not the other two? Could we > consider expanding SerializableFunction and SimpleFunction to the > following?: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> OutputT apply(InputT input) throws Exception; > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> This would, for example, simplify the implementation of > ParseJsons and AsJsons, where we have to catch an IOException in > MapElements#via only to rethrow as RuntimeException. >