Hi,

​+1 for removing the Combinable annotation​. Approach 1 sounds like the
best option to me.


On 13 January 2016 at 14:11, Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Fabian,
>
> thanks for bringing this issue up. I agree with you that it would be nice
> to remove the Combinable annotation if it is not really needed. Now if
> people are not aware of the Combinable interface then they might think that
> they are actually using combiners by simply implementing CombineFunction.
>
>
​has happened to me :S​



> I would also be in favour of approach 1 combined with a migration guide
> where we highlight possible problems with a faulty combine implementation.
>
>
Migration guide is a ​good idea​!



> Cheers,
> Till
> ​
>
>
​-Vasia.​



> On Wed, Jan 13, 2016 at 1:53 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> > Hi everybody,
> >
> >
> >
> > As part of the upcoming 1.0 release we are stabilizing Flink's APIs.
> >
> > I would like to start a discussion about removing the Combinable
> annotation
> > from the DataSet API.
> >
> >
> >
> > # The Current State:
> >
> > The DataSet API offers two interface for combine functions,
> CombineFunction
> > and GroupCombineFunction, which can be added to a GroupReduceFunctions
> > (GRF).
> >
> >
> > However, implementing a combine interface does not make a GRF combinable.
> > In addition, the GRF class needs to be annotated with the Combinable
> > annotation. The RichGroupReduceFunction has a default implementation of
> > combine, which forwards the combine parameters to the reduce method. This
> > default implementation is not used, unless the class that extends RGRF
> has
> > the Combinable annotation.
> >
> >
> >
> > In addition to the Combinable annotation, the DataSet API
> > GroupReduceOperator features the setCombinable(boolean) method to
> override
> > a missing or existing Combinable annotation.
> >
> >
> >
> > # My Proposal:
> >
> > I propose to remove the Combinable annotation because it is not required
> > and complicates the definition of combiners. Instead, the combine method
> of
> > a GroupReduceFunction should be executed if it implements one of the
> > combine function interfaces. This would require to remove the default
> > combine implementation of the RichGroupReduceFunction as well.
> >
> > This would be an API breaking change and has a few implications.
> >
> >
> >
> > # Possible Implementation:
> >
> > There are a few ways to implement this change.
> >
> > - Remove Combinable annotation or mark it deprecated (and keep effect)
> >
> > - Remove default combine method from RichGroupReduceFunction or deprecate
> > it
> >
> >
> >
> > Approach 1:
> > - Remove Combinable annotation
> > - Remove default combine() method from RichGroupReduceFunction
> > - Effect:
> >     - All RichGroupReduceFunctions that do either have the Combinable
> > annotation or implement the combine method do not compile anymore
> >     - GroupReduceFunctions that have the Combinable annotation do not
> > compile anymore
> >     - GroupReduceFunctions that implement a combine interface without
> > having the annotation (and not being set to combinable during program
> > construction) will execute the previously not executed combine method.
> This
> > might change the behavior of the program. In worst case, the program
> might
> > silently produce wrong results (or crash) if the combiner implementation
> > was faulty. In best case, the program executes faster.
> >
> >
> >
> > Approach 2:
> > - As Approach 1
> > - In addition extend both combine interfaces with a deprecated marker
> > method. This will ensure that all functions that implement a combinable
> > interface do not compile anymore and need to be fixed. This could prevent
> > the silent failure as in Approach 1, but would also cause an additional
> API
> > breaking change once the deprecated marker method is removed again.
> >
> >
> >
> > Approach 3:
> > - Mark Combinable annotation deprecated
> > - Mark combine() method in RichGroupReduceFunction as deprecated
> > - Effect:
> >     - There'll be a couple of deprecation warnings.
> >     - We face the same problem with silent failures as in Approach 1.
> >     - We have to check if RichGroupReduceFunction's override combine or
> not
> > (can be done with reflection). If the method is not overridden we do not
> > execute it (unless there is a Combinable annotation) and we are fine. If
> it
> > is overridden and no Combinable annotation has been defined, we have the
> > same problem with silent failures as before.
> >     - After we remove the deprecated annotation and method, we have the
> > same effect as with Approach 1.
> >
> >
> >
> > There are more alternatives, but these are the most viable, IMO.
> >
> >
> >
> > I think, if we want to remove the combinable annotation, we should do it
> > now.
> >
> > Given the three options, would go for Approach 1. Yes, breaks a lot of
> code
> > and yes there is the possibility of computing incorrect results.
> Approach 2
> > is safer but would mean another API breaking change in the future.
> Approach
> > 3 comes with fewer breaking changes but has the same problem of silent
> > failures.
> >
> > IMO, the breaking API changes of Approach 1 are even desirable because
> they
> > will make users aware that this feature changed.
> >
> >
> >
> > What do you think?
> >
> >
> >
> > Cheers, Fabian
> >
>

Reply via email to