Hi Timo,

Thanks for initiating this great discussion. Currently, using the SQL / Table API requires including many dependencies. In particular, it should not be necessary to pull in implementation-specific dependencies that users do not care about. So I am glad to see your proposal, and I hope that splitting the API interfaces into a separate module will let users depend on a minimal set of dependencies.
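As a rough, purely illustrative Java sketch (none of these names are Flink's actual classes), such an interface/implementation split could look like:

```java
// Illustrative sketch only: a user-facing interface module exposes
// `Table`, while the implementation lives in a separate module that
// users never need on their compile classpath.
interface Table {
    Table select(String fields);
    String explain();
}

// Hypothetical implementation class; in the real proposal this would
// live in an internal planner/implementation module.
final class TableImpl implements Table {
    private final String selectedFields;

    TableImpl(String selectedFields) {
        this.selectedFields = selectedFields;
    }

    @Override
    public Table select(String fields) {
        // Real planning/translation would happen behind the interface.
        return new TableImpl(fields);
    }

    @Override
    public String explain() {
        return "Select(" + selectedFields + ")";
    }
}
```

Users would then program against `Table` only, and `TableImpl` (with its Scala/planner dependencies) would stay an internal detail.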
So, +1 to the [separation of interface and implementation; e.g. `Table` & `TableImpl`] which you mentioned in the google doc.

Best,
Jincheng

On Thu, Nov 22, 2018 at 10:50 PM, Xiaowei Jiang <xiaow...@gmail.com> wrote:

> Hi Timo, thanks for driving this! I think that this is a nice thing to do.
> While we are doing this, can we also keep in mind that we eventually want a Table API interface-only module which users can take a dependency on, but which does not include any implementation details?
>
> Xiaowei
>
> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fhue...@gmail.com> wrote:
>
> > Hi Timo,
> >
> > Thanks for writing up this document.
> > I like the new structure and agree to prioritize the porting of the flink-table-common classes.
> > Since flink-table-runtime is (or should be) independent of the API and planner modules, we could start porting these classes once the code is split into the new module structure.
> > The benefit of a Scala-free flink-table-runtime would be a Scala-free execution Jar.
> >
> > Best, Fabian
> >
> > On Thu, Nov 22, 2018 at 10:54, Timo Walther <twal...@apache.org> wrote:
> >
> > > Hi everyone,
> > >
> > > I would like to continue this discussion thread and convert the outcome into a FLIP so that users and contributors know what to expect in the upcoming releases.
> > >
> > > I created a design document [1] that clarifies our motivation for doing this, what a Maven module structure could look like, and a suggested migration plan.
> > >
> > > It would be great to start with these efforts for the 1.8 release so that new features can be developed in Java and major refactorings, such as improvements to the connectors and external catalog support, are not blocked.
> > >
> > > Please let me know what you think.
> > > Regards,
> > > Timo
> > >
> > > [1] https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >
> > > On Jul 2, 2018 at 17:08, Fabian Hueske wrote:
> > > > Hi Piotr,
> > > >
> > > > thanks for bumping this thread, and thanks to Xingcan for the comments.
> > > >
> > > > I think the first step would be to separate the flink-table module into multiple sub-modules. These could be:
> > > >
> > > > - flink-table-api: all API-facing classes; could later be divided further into Java/Scala Table API/SQL
> > > > - flink-table-planning: all planning (basically everything we do with Calcite)
> > > > - flink-table-runtime: the runtime code
> > > >
> > > > IMO, a realistic mid-term goal is to have the runtime module and certain parts of the planning module ported to Java.
> > > > The api module will be much harder to port because of several dependencies on Scala core classes (the parser framework, tree iterations, etc.). I'm not saying we should not port it to Java, but it is not (yet) clear to me how to do it.
> > > >
> > > > I think flink-table-runtime should not be too hard to port. The code does not use many Scala features, i.e., it is written in a very Java-like style. Also, there are not many dependencies, and operators can be ported individually, step by step.
> > > > For flink-table-planning, we can port certain packages to Java, like planning rules or plan nodes. The related classes mostly extend Calcite's Java interfaces/classes and would be natural candidates for porting. The code generation classes will require more effort. There are also some dependencies from planning on the api module that we would need to resolve somehow.
> > > > For SQL, most of the work when adding new features happens in the planning and runtime modules, so this separation should already reduce the "technological debt" quite a lot.
> > > > The Table API depends much more on Scala than SQL does.
> > > >
> > > > Cheers, Fabian
> > > >
> > > > 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingc...@gmail.com>:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I have also been thinking about this problem these days, and here are my thoughts.
> > > >>
> > > >> 1) We must admit that it is really a tough task to interoperate between Java and Scala. E.g., they have different collection types (Scala collections vs. java.util.*), and in Java it is hard to implement a method which takes Scala functions as parameters. Considering that the major part of the code base is implemented in Java, +1 for this goal from a long-term view.
> > > >>
> > > >> 2) The ideal solution would be to expose just a Scala API and make all the other parts Scala-free. But I am not sure this could be achieved even in the long term. Thus, as Timo suggested, keeping the Scala code in "flink-table-core" would be a compromise solution.
> > > >>
> > > >> 3) If the community makes the final decision, maybe any new features should be added in Java (regardless of the module), in order to prevent the Scala code from growing.
> > > >>
> > > >> Best,
> > > >> Xingcan
> > > >>
> > > >>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <pi...@data-artisans.com> wrote:
> > > >>>
> > > >>> Bumping the topic.
> > > >>>
> > > >>> If we want to do this, the sooner we decide, the less code we will have to rewrite. I have some objections/counter-proposals to Fabian's proposal of doing it module by module, one module at a time.
> > > >>> First, I do not see a problem with having Java/Scala code even within one module, especially not if there are clean boundaries. For example, we could have the API in Scala and the optimizer rules/logical nodes written in Java in the same module. However, I have not maintained a mixed Scala/Java code base before, so I might be missing something here.
> > > >>>
> > > >>> Secondly, this whole migration might, and most likely will, take longer than expected, which creates a problem for the new code we will be writing in the meantime. After making the decision to migrate to Java, almost any new line of Scala code immediately becomes technological debt that we will have to rewrite in Java later.
> > > >>>
> > > >>> Thus I would propose to first state our end goal: the module structure and which parts of the modules should eventually be Scala-free. Secondly, we should take all steps necessary to allow writing new code compliant with that end goal. Only after that should we focus on incrementally rewriting the old code. Otherwise we could be stuck/blocked for years writing new code in Scala (and increasing the technological debt), because nobody has found the time to rewrite some unimportant and not actively developed part of some module.
> > > >>>
> > > >>> Piotrek
> > > >>>
> > > >>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fhue...@gmail.com> wrote:
> > > >>>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> In general, I think this is a good effort. However, it won't be easy, and I think we have to plan it well.
> > > >>>> I don't like the idea of having the whole code base fragmented into Java and Scala code for too long.
> > > >>>>
> > > >>>> I think we should do this one step at a time and focus on migrating one module at a time.
> > > >>>> IMO, the easiest start would be to port the runtime to Java.
> > > >>>> Extracting the API classes into their own module, porting them to Java, and removing the Scala dependency won't be possible without breaking the API, since a few classes depend on the Scala Table API.
> > > >>>>
> > > >>>> Best, Fabian
> > > >>>>
> > > >>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <trohrm...@apache.org>:
> > > >>>>
> > > >>>>> I think that is a noble and honorable goal, and we should strive for it. This, however, must be an iterative process given the sheer size of the code base. I like the approach of defining common Java modules which are used by more specific Scala modules and slowly moving classes from Scala to Java. Thus +1 for the proposal.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Till
> > > >>>>>
> > > >>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <pi...@data-artisans.com> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I do not have experience with how Scala and Java interact with each other, so I cannot fully validate your proposal, but generally speaking +1 from me.
> > > >>>>>>
> > > >>>>>> Does it also mean that we should slowly migrate `flink-table-core` to Java? How would you envision that? It would be nice to be able to add new classes/features written in Java so that they can coexist with the old Scala code until we gradually switch from Scala to Java.
> > > >>>>>>
> > > >>>>>> Piotrek
> > > >>>>>>
> > > >>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <twal...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>> Hi everyone,
> > > >>>>>>>
> > > >>>>>>> as you all know, the Table & SQL API is currently implemented in Scala.
> > > >>>>>>> This decision was made a long time ago, when the initial code base was created as part of a master's thesis. The community kept Scala because of the nice language features that enable a fluent Table API like table.select('field.trim()) and because Scala allows for quick prototyping (e.g. multi-line strings for code generation). The committers enforced not splitting the code base into two programming languages.
> > > >>>>>>>
> > > >>>>>>> However, nowadays the flink-table module is becoming a more and more important part of the Flink ecosystem. Connectors, formats, and the SQL Client are actually implemented in Java but need to interoperate with flink-table, which makes these modules dependent on Scala. As mentioned in an earlier mail thread, using Scala for API classes also exposes member variables and methods in Java that should not be exposed to users [1]. Java is still the most important API language, and right now we treat it as a second-class citizen. I just noticed that you even need to add Scala if you just want to implement a ScalarFunction, because of method clashes between `public String toString()` and `public scala.Predef.String toString()`.
> > > >>>>>>>
> > > >>>>>>> Given the size of the current code base, reimplementing the entire flink-table code in Java is a goal that we might never reach. However, we should at least treat the symptoms and keep this as a long-term goal in mind.
> > > >>>>>>> My suggestion would be to convert user-facing and runtime classes and split the code base into multiple modules:
> > > >>>>>>>
> > > >>>>>>>> flink-table-java {depends on flink-table-core}
> > > >>>>>>> Implemented in Java. Java users can use this. This would require converting classes like TableEnvironment and Table.
> > > >>>>>>>
> > > >>>>>>>> flink-table-scala {depends on flink-table-core}
> > > >>>>>>> Implemented in Scala. Scala users can use this.
> > > >>>>>>>
> > > >>>>>>>> flink-table-common
> > > >>>>>>> Implemented in Java. Connectors, formats, and UDFs can use this. It contains interface classes such as descriptors, table sinks, and table sources.
> > > >>>>>>>
> > > >>>>>>>> flink-table-core {depends on flink-table-common and flink-table-runtime}
> > > >>>>>>> Implemented in Scala. Contains the current main code base.
> > > >>>>>>>
> > > >>>>>>>> flink-table-runtime
> > > >>>>>>> Implemented in Java. This would require converting the classes in o.a.f.table.runtime but could potentially improve the runtime.
> > > >>>>>>>
> > > >>>>>>> What do you think?
> > > >>>>>>>
> > > >>>>>>> Regards,
> > > >>>>>>> Timo
> > > >>>>>>>
> > > >>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Convert-main-Table-API-classes-into-traits-tp21335.html
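For illustration only, the module split proposed in Timo's mail could map onto a Maven parent POM along these lines (the module names are taken from the proposal; the layout and comments are assumptions, and the inter-module dependencies would be declared in each child POM):

```xml
<!-- Illustrative sketch of a possible flink-table parent POM module list -->
<modules>
  <!-- Java: interfaces for connectors, formats, and UDFs -->
  <module>flink-table-common</module>
  <!-- Java: runtime classes (o.a.f.table.runtime) -->
  <module>flink-table-runtime</module>
  <!-- Scala: current main code base -->
  <module>flink-table-core</module>
  <!-- Java user-facing API -->
  <module>flink-table-java</module>
  <!-- Scala user-facing API -->
  <module>flink-table-scala</module>
</modules>
```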