I like the idea of having a single PR for a feature that touches different
components (APIs, backends) and having multiple people contribute to it to
make it work for all alternatives.
This would ensure a synced code base, but it will take much more time to
get new features in. This might be a problem if a feature is required for
other features or asked for by some users.

I am not sure that the argument of increased workload towards a release is
true.
If a feature should go into a release, it must be implemented for all
APIs anyway. Maybe the chance that this is done at the end of a release
cycle is even higher if the feature is lingering around in a PR and is
available for only a subset of the APIs. But who knows...

Chesnay also has a point here. We might want to distinguish between
first-class APIs (backends) which are always in sync and others which might
be a bit behind...



2014-09-29 9:56 GMT+02:00 Aljoscha Krettek <[email protected]>:

> We could use blocking issues on Jira to mark things that need to be
> resolved before a release.
>
> On Sat, Sep 27, 2014 at 11:53 PM, Chesnay Schepler <
> [email protected]> wrote:
>
> > I agree with Kostas, and believe that postponing will imo straight up not
> > work since people tend to be *very* busy close to a release, even without
> > having to port features to several APIs.
> >
> > I furthermore don't think we will get anywhere by creating one policy to
> > rule them all (especially a rigid one), because there are fundamental
> > differences between a) the APIs b) scope of a feature; and there not being
> > a point in setting up a policy when it is very likely that we won't abide
> > by it.
> >
> > With the increasing number of APIs it's quite a tall order expecting a
> > version for each of them from a single contributor. Even now that would be
> > 3 (Java, Scala, Streaming(?)) with 2 more to come in the somewhat near
> > future (Python, SQL (not sure if relevant)). It is a *massive* entry
> > barrier, as well as a major time investment on the contributor's part. This
> > should also hold for simple features (certainly at the beginning).
> >
> > If (and only if) Scala is as thin as I am made to believe, I would be for a
> > hard policy here. I would exclude other APIs from this. The overhead from
> > getting to know all APIs and debugging unfamiliar code would eat up way
> > too much time, which could easily break our neck. It's not just about syncing
> > the APIs, but doing so in an efficient manner. For them I would much
> > rather have 2-3 people per API that are somewhat responsible for porting
> > these features, preferably in a more concentrated effort (aka batches).
> >
> >
> > On 27.9.2014 21:03, Kostas Tzoumas wrote:
> >
> >> If we allow out-of-sync APIs (and backends) until the time of a release,
> >> aren't we just postponing the syncing problem to the time of the release,
> >> which is a pretty bad time to have such a problem?
> >>
> >>
> >> On Fri, Sep 26, 2014 at 8:49 PM, Robert Metzger <[email protected]>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> I'm also in favor of having a strict policy regarding the Java and Scala
> >>> API.
> >>> In my understanding, the new Scala API is a thin layer above the Java one,
> >>> so adding new methods should be straightforward (given that there are
> >>> plenty of examples as a reference).
> >>>
> >>> Robert
> >>>
> >>> On Fri, Sep 26, 2014 at 11:04 AM, Ufuk Celebi <[email protected]> wrote:
> >>>
> >>>> Hey Fabian,
> >>>>
> >>>> thanks for bringing this up.
> >>>>
> >>>> I would vote to have a hard policy regarding the Scala and Java API as
> >>>> these are our main user facing APIs.
> >>>>
> >>>> If there was a fundamental problem or language feature, which could not
> >>>> be supported/ported in/to the other API, I would be OK if it was only
> >>>> available in one. But small additions to the APIs like outer joins,
> >>>> which can be in sync should also be in sync.
> >>>>
> >>>> If someone does not want to add the corresponding feature to the other
> >>>> APIs, I would go for a pull request with a request for someone else to
> >>>> port the missing part.
> >>>>
> >>>> I think it is very important for users to be able to assume that all
> >>>> APIs have the same "power". Otherwise we might end up in a situation
> >>>> (and I think we already had it with the broadcast variables for a time),
> >>>> where users have to pick the API, which matches their use case and not
> >>>> their preference.
> >>>>
> >>>> Best,
> >>>>
> >>>> Ufuk
> >>>>
> >>>> On 26 Sep 2014, at 10:43, Fabian Hueske <[email protected]> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> as you all know, Flink has a layered architecture with multiple
> >>>>> alternatives for certain levels.
> >>>>> Examples are:
> >>>>> - Programming APIs: Java, Scala, (and Python in progress)
> >>>>> - Processing Backends: distributed runtime (former Nephele), Java
> >>>>> Collections, (and potentially Tez in the future)
> >>>>>
> >>>>> The challenge with multiple alternatives that serve the same purpose is
> >>>>> that these should be in sync.
> >>>>> A feature that is added to the Java API should also be added to the Scala
> >>>>> API (and other APIs in the future). The same applies to new runtime
> >>>>> strategies and operators, such as outer joins.
> >>>>>
> >>>>> I think we need a policy how to keep the features of different layer
> >>>>> alternatives in sync.
> >>>>> With the recent update of the Scala API, a ScalaAPICompletenessTest was
> >>>>> added that checks whether the Scala API offers the same methods as the
> >>>>> Java API. Adding a feature to the Java API breaks the build and requires
> >>>>> to either adapt the Scala API as well or exclude the added methods from
> >>>>> the APICompletenessTest.
> >>>>> While this test is a great tool to make sure that APIs are synced,
> >>>>> this basically requires that APIs are always synced, i.e., a modification
> >>>>> of the Java API must go with an equivalent change of the Scala API.
> >>>>> If we make this a tight policy and force compatibility at all times,
> >>>>> contributors must know about several different technologies (Scala
> >>>>> Compiler Macros, Python, the implementation details of multiple runtime
> >>>>> backends, ...). This sounds like a huge entrance barrier to me.
> >>>>>
> >>>>> To make it clear, I am definitely in favor of keeping APIs and backends
> >>>>> in sync.
> >>>>> However, I propose to enforce this only for releases, i.e., allow
> >>>>> out-of-sync APIs on the master branch and fix the APIs for releases.
> >>>>> With this additional requirement, we also need to think twice which
> >>>>> features to add as multiple components of the system will be affected.
> >>>>>
> >>>>> What do you guys think?
> >>>>>
> >>>>
> >>>>
> >
>
