Bumping the topic. If we want to do this, the sooner we decide, the less code we will have to rewrite. I have some objections/counter-proposals to Fabian's proposal of doing it module-wise, one module at a time.
First, I do not see a problem with having Java/Scala code even within one module, especially not if there are clean boundaries. For example, we could have the API in Scala and optimizer rules/logical nodes written in Java in the same module. However, I have not maintained mixed Scala/Java code bases before, so I might be missing something here.

Secondly, this whole migration might, and most likely will, take longer than expected, which creates a problem for the new code that we will be writing in the meantime. After making the decision to migrate to Java, almost any new line of Scala code immediately becomes technical debt that we will have to rewrite in Java later. Thus I would propose to first state our end goal: the module structure and which parts of the modules we eventually want to be Scala-free. Secondly, we should take all the steps necessary to allow us to write new code compliant with that end goal. Only after that should we focus on incrementally rewriting the old code. Otherwise we could be stuck/blocked for years writing new code in Scala (and increasing technical debt), because nobody has found the time to rewrite some unimportant and not actively developed part of some module.

Piotrek

> On 14 Jun 2018, at 15:34, Fabian Hueske <fhue...@gmail.com> wrote:
>
> Hi,
>
> In general, I think this is a good effort. However, it won't be easy and
> I think we have to plan this well.
> I don't like the idea of having the whole code base fragmented into Java
> and Scala code for too long.
>
> I think we should do this one step at a time and focus on migrating one
> module at a time.
> IMO, the easiest start would be to port the runtime to Java.
> Extracting the API classes into their own module, porting them to Java,
> and removing the Scala dependency won't be possible without breaking the
> API, since a few classes depend on the Scala Table API.
>
> Best,
> Fabian
>
>
> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <trohrm...@apache.org>:
>
>> I think that is a noble and honorable goal and we should strive for it.
>> This, however, must be an iterative process given the sheer size of the
>> code base. I like the approach of defining common Java modules which are
>> used by more specific Scala modules and slowly moving classes from Scala
>> to Java. Thus +1 for the proposal.
>>
>> Cheers,
>> Till
>>
>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <pi...@data-artisans.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I do not have experience with how Scala and Java interact with each
>>> other, so I cannot fully validate your proposal, but generally speaking
>>> +1 from me.
>>>
>>> Does it also mean that we should slowly migrate `flink-table-core` to
>>> Java? How would you envision it? It would be nice to be able to add new
>>> classes/features written in Java so that they can coexist with the old
>>> Scala code until we gradually switch from Scala to Java.
>>>
>>> Piotrek
>>>
>>>> On 13 Jun 2018, at 11:32, Timo Walther <twal...@apache.org> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> as you all know, the Table & SQL API is currently implemented in Scala.
>>>> This decision was made a long time ago when the initial code base was
>>>> created as part of a master's thesis. The community kept Scala because
>>>> of the nice language features that enable a fluent Table API like
>>>> table.select('field.trim()) and because Scala allows for quick
>>>> prototyping (e.g. multi-line strings for code generation). The
>>>> committers enforced not splitting the code base into two programming
>>>> languages.
>>>>
>>>> However, nowadays the flink-table module is becoming an increasingly
>>>> important part of the Flink ecosystem. Connectors, formats, and the SQL
>>>> client are actually implemented in Java but need to interoperate with
>>>> flink-table, which makes these modules dependent on Scala.
>>>> As mentioned in an earlier mail thread, using Scala for API classes
>>>> also exposes member variables and methods in Java that should not be
>>>> exposed to users [1]. Java is still the most important API language,
>>>> and right now we treat it as a second-class citizen. I just noticed
>>>> that you even need to add Scala if you just want to implement a
>>>> ScalarFunction, because of method clashes between `public String
>>>> toString()` and `public scala.Predef.String toString()`.
>>>>
>>>> Given the size of the current code base, reimplementing the entire
>>>> flink-table code in Java is a goal that we might never reach. However,
>>>> we should at least treat the symptoms and keep this in mind as a
>>>> long-term goal. My suggestion would be to convert the user-facing and
>>>> runtime classes and split the code base into multiple modules:
>>>>
>>>>> flink-table-java {depends on flink-table-core}
>>>> Implemented in Java. Java users can use this. This would require
>>>> converting classes like TableEnvironment and Table.
>>>>
>>>>> flink-table-scala {depends on flink-table-core}
>>>> Implemented in Scala. Scala users can use this.
>>>>
>>>>> flink-table-common
>>>> Implemented in Java. Connectors, formats, and UDFs can use this. It
>>>> contains interface classes such as descriptors, table sinks, and table
>>>> sources.
>>>>
>>>>> flink-table-core {depends on flink-table-common and flink-table-runtime}
>>>> Implemented in Scala. Contains the current main code base.
>>>>
>>>>> flink-table-runtime
>>>> Implemented in Java. This would require converting classes in
>>>> o.a.f.table.runtime but could potentially improve the runtime.
>>>>
>>>>
>>>> What do you think?
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Timo
>>>>
>>>> [1]
>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Convert-main-Table-API-classes-into-traits-tp21335.html
>>>
>>>
>>
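[Editor's note] The module split Timo proposes can be sketched as a Maven aggregator fragment. Only the module names and their depends-on relations come from the proposal above; treating them as sibling Maven modules under one parent is an assumption for illustration.

```xml
<!-- Hypothetical aggregator POM fragment; only the module names and their
     dependencies come from the proposal above. -->
<modules>
  <module>flink-table-common</module>   <!-- Java: descriptors, table sources/sinks, UDF interfaces -->
  <module>flink-table-runtime</module>  <!-- Java: classes from o.a.f.table.runtime -->
  <module>flink-table-core</module>     <!-- Scala: main code base; depends on common + runtime -->
  <module>flink-table-java</module>     <!-- Java API (TableEnvironment, Table); depends on core -->
  <module>flink-table-scala</module>    <!-- Scala API; depends on core -->
</modules>
```

Under this layout, connectors and formats would depend only on flink-table-common and flink-table-runtime, so they never pull in a Scala dependency transitively.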
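[Editor's note] To make the flink-table-common idea a bit more concrete, here is a minimal hypothetical sketch of a connector-facing interface as it could live in a Java-only module. The names (`TableSource`, `CsvTableSource`, `fieldNames`) are illustrative assumptions for this sketch, not the actual Flink API.

```java
// Hypothetical sketch of a connector-facing interface in a Java-only
// flink-table-common module. Names are illustrative, not Flink's real API.
import java.util.Arrays;
import java.util.List;

public class TableSourceSketch {

    /** A table source exposes the schema of the table it produces. */
    interface TableSource {
        List<String> fieldNames();
    }

    /** A trivial source, as a pure-Java connector module might provide. */
    static class CsvTableSource implements TableSource {
        private final List<String> fields;

        CsvTableSource(String... fields) {
            this.fields = Arrays.asList(fields);
        }

        @Override
        public List<String> fieldNames() {
            return fields;
        }
    }

    public static void main(String[] args) {
        // A Java connector implements and uses the interface without
        // any Scala on its classpath.
        TableSource source = new CsvTableSource("name", "count");
        System.out.println("fields: " + source.fieldNames());
    }
}
```

The point of the sketch: as long as the interfaces live in a Java module, connectors compile and run against them without transitively depending on the Scala-implemented core.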