[DISCUSS] Policy on keeping layer alternatives in sync

Fabian Hueske Fri, 26 Sep 2014 01:44:00 -0700

Hi,

as you all know, Flink has a layered architecture with multiple
alternatives for certain levels.
Examples are:
- Programming APIs: Java, Scala, (and Python in progress)
- Processing Backends: distributed runtime (former Nephele), Java
Collections, (and potentially Tez in the future)


The challenge with multiple alternatives that serve the same purpose is
that they should be kept in sync.
A feature that is added to the Java API should also be added to the Scala
API (and other APIs in the future). The same applies to new runtime
strategies and operators, such as outer joins.

I think we need a policy on how to keep the features of different layer
alternatives in sync.
With the recent update of the Scala API, a ScalaAPICompletenessTest was
added that checks whether the Scala API offers the same methods as the Java
API. Adding a feature to the Java API now breaks the build and requires
either adapting the Scala API as well or excluding the added methods from
the ScalaAPICompletenessTest.
While this test is a great tool to make sure that the APIs are synced,
this basically requires that APIs are always synced, i.e., a modification
of the Java API must go with an equivalent change of the Scala API.
If we make this a strict policy and enforce compatibility at all times,
contributors must know about several different technologies (Scala Compiler
Macros, Python, the implementation details of multiple runtime backends,
...). This sounds like a huge barrier to entry to me.
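
To illustrate the kind of check such a completeness test performs, here is
a minimal sketch in Python. It is not the actual ScalaAPICompletenessTest;
the two classes and the names public_methods, missing_methods, and excluded
are hypothetical stand-ins, and it only compares method names, not
signatures.

```python
import inspect


def public_methods(cls):
    """Return the names of all public callables available on a class."""
    return {
        name
        for name, _member in inspect.getmembers(cls, callable)
        if not name.startswith("_")
    }


def missing_methods(reference_api, other_api, excluded=frozenset()):
    """Names offered by reference_api but absent from other_api.

    Names listed in `excluded` are ignored, which mirrors the option of
    excluding a newly added method from the completeness check.
    """
    missing = public_methods(reference_api) - public_methods(other_api)
    return sorted(missing - set(excluded))


# Hypothetical stand-ins for two alternatives of the same API layer.
class JavaLikeDataSet:
    def map(self, fn): ...
    def filter(self, fn): ...
    def outer_join(self, other): ...


class ScalaLikeDataSet:
    def map(self, fn): ...
    def filter(self, fn): ...


if __name__ == "__main__":
    # A new method on the reference API shows up as missing on the other one.
    print(missing_methods(JavaLikeDataSet, ScalaLikeDataSet))
    # Excluding it makes the check pass without touching the other API.
    print(missing_methods(JavaLikeDataSet, ScalaLikeDataSet,
                          excluded={"outer_join"}))
```

A build that fails whenever missing_methods returns a non-empty list forces
every change to touch both APIs or the exclusion list, which is exactly the
trade-off described above.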

To make it clear, I am definitely in favor of keeping APIs and backends in
sync.
However, I propose to enforce this only for releases, i.e., allow
out-of-sync APIs on the master branch and fix the APIs for releases.
With this additional requirement, we also need to think twice about which
features to add, as multiple components of the system will be affected.

What do you guys think?
