Hey Till,

I'm not sure what the "right" ASF process is, but I wouldn't mind a vote on this to make sure that you don't do unnecessary work by replacing the code with Scala.

I, for one, would certainly be open to it. The only thing that bothers me is the current state of out-of-the-box IDE support. But since there are other successful Scala projects around ;-) which manage to do it, why shouldn't we?

@Henry, regarding Akka: I think the main motivation for moving to Akka (besides the points raised by Stephan and others) is that we actually don't want to bother with low-level thread management, protocols, etc.

On Tue, Sep 2, 2014 at 8:32 PM, Henry Saputra <[email protected]> wrote:
> Hi Till,
>
> Thanks for opening the discussion and leading the effort, and apologies
> for the late response.
>
> From what I have gathered so far, there are 2 issues:
> 1. Introducing Akka as RPC
> 2. Moving to Scala to enable easy access to Akka Scala APIs.
>
> For no. 1, if the RPC is used for lower level communications then we
> could probably consider Netty as the transport and serialization
> protocol (I have also added a comment to the JIRA).
> Internally, to reduce thread management we could use Akka via a Scala
> bridge service to make sure we use the Scala Akka APIs.
>
> So addressing no. 2, we could mix both Scala and Java in JobManager and
> TaskManager. The code that handles async RPC communication between JM
> and TM uses Java via Netty, and for internal multi-threading or higher
> level plane code such as heartbeat we could use Akka.
>
> It does introduce a bit of a mix between Java and Scala code, but we
> already have a mix of Scala and Java to support the APIs, so I think we
> could move some of the internal code to Scala too as "learning" steps to
> utilize Scala for better multi concurrency / functional programming.
>
> - Henry
>
>
> On Sun, Aug 31, 2014 at 4:31 AM, Till Rohrmann <[email protected]> wrote:
> > Hi Daniel,
> >
> > the RPC rework is discussed in
> > https://issues.apache.org/jira/browse/FLINK-1019. Jira is currently down
> > due to maintenance reasons.
> >
> > The ideas to use akka are the following.
> > Akka allows us to reduce the code base which has to be maintained.
> > Especially, we get rid of all the multi-threading programming of the
> > rpc service which is always hard to work with. With Akka we would get
> > the heartbeat signal for free, because Akka can detect dead actors.
> > Akka uses supervision to handle fault tolerance as well as recovery
> > and it allows an easy forwarding of remote exceptions. At the same
> > time it offers a nice rpc abstraction which makes it easy to
> > implement asynchronous services. Furthermore, it scales rather well
> > to large numbers of nodes and hopefully we get the latencies of Flink
> > a little bit down.
> >
> > Bests,
> >
> > Till
> >
> >
> > On Sun, Aug 31, 2014 at 11:35 AM, Daniel Warneke <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> will akka just be used for RPC or are there any plans to expand the
> >> actor-based model to further parts of the runtime system? If so,
> >> could you please point me to the discussion thread?
> >>
> >> Spontaneously, I would say that adding a hard dependency on Scala
> >> just for the sake of having a hip RPC service sounds like a pretty
> >> dodgy deal. Therefore, I would like to understand how much value akka
> >> could bring to Flink in the long run. The discussion whether to
> >> reimplement core components of the system in Scala should be the
> >> second step in my opinion.
> >>
> >> Bests,
> >>
> >> Daniel
> >>
> >>
> >> On 29.08.2014 11:33, Asterios Katsifodimos wrote:
> >>
> >>> I agree that using Akka's actors from Java results in very ugly code.
> >>> Hiding the internals of Akka behind Java reflection looks better but
> >>> breaks the principles of actors. For me it is kind of a deal breaker
> >>> for using Akka from Java. I think that Till has more reasons to
> >>> believe that Scala would be more appropriate for building a new
> >>> Job/Task Manager.
> >>>
> >>> I think that this discussion should focus on 4 main aspects:
> >>> 1. Performance
> >>> 2. Implementability
> >>> 3. Maintainability
> >>> 4. Available Tools
> >>>
> >>> 1. Performance: The job of the JobManager and the TaskManager is to
> >>> 1) exchange messages in order to maintain a distributed state
> >>> machine, 2) set up connections between task managers, and 3) detect
> >>> failures etc. In these basic operations, performance should not be
> >>> an issue. Akka was proven to scale quite well with very low latency.
> >>> I guess that the low level "plumbing" (serialization, connections,
> >>> etc.) will continue in Java in order to guarantee high performance.
> >>> I have no clue on what's happening with memory management and
> >>> whether this will be implemented in Java or Scala and the respective
> >>> consequences. Please comment.
> >>>
> >>> 2. Since the Job/Task Manager is going to be essentially implemented
> >>> from scratch, given the power of Akka, it seems to me that the
> >>> implementation will be easier, shorter and less verbose in Scala,
> >>> given that Till is comfortable enough with Scala.
> >>>
> >>> 3. Given #2, maintaining the code and trying out new ideas in Scala
> >>> would take less time and effort. But maintaining low level plumbing
> >>> in Java and high level logic in Scala scares me. Could anyone who
> >>> has done this before comment on this?
> >>>
> >>> 4. Tools: Robert has raised some issues already but I think that
> >>> tools will get better with time.
> >>>
> >>> Given the above, I would focus on #3 to be honest. Apart from this,
> >>> going the Scala way sounds like a great idea. I really second Kostas'
> >>> opinion that if large changes are going to happen, this is the best
> >>> moment.
> >>>
> >>> Cheers,
> >>> Asterios
> >>>
> >>>
> >>> On Fri, Aug 29, 2014 at 1:02 AM, Till Rohrmann <[email protected]> wrote:
> >>>
> >>>> I also agree with Robert and Kostas that it has to be a community
> >>>> decision.
> >>>> I understand the problems with Eclipse and the Scala IDE which is a
> >>>> pain in the ass. But eventually these things will be fixed. Maybe we
> >>>> could also talk to the typesafe guy and tell him that this problem
> >>>> bothers us a lot.
> >>>>
> >>>> I also believe that the project is not about a specific programming
> >>>> language but a problem we want to tackle with Flink. From time to
> >>>> time it might be necessary to adapt the tools in order to reach the
> >>>> goal. In fact, I don't believe that Scala parts would drive people
> >>>> away from the project. Instead, Scala enthusiasts would be motivated
> >>>> to join us.
> >>>>
> >>>> Actually I stumbled across a quote of Leibniz which puts my point of
> >>>> view quite accurately in a nutshell:
> >>>>
> >>>> In symbols one observes an advantage in discovery which is greatest
> >>>> when they express the exact nature of a thing briefly and, as it
> >>>> were, picture it; then indeed the labor of thought is wonderfully
> >>>> diminished -- Gottfried Wilhelm Leibniz
> >>>>
> >>>>
> >>>> On Thu, Aug 28, 2014 at 12:57 PM, Kostas Tzoumas <[email protected]> wrote:
> >>>>
> >>>>> On Thu, Aug 28, 2014 at 11:49 AM, Robert Metzger <[email protected]> wrote:
> >>>>>
> >>>>>> Changing the programming language of a very important system
> >>>>>> component is something we should carefully discuss.
> >>>>>
> >>>>> Definitely agree, I think the community should discuss this very
> >>>>> carefully.
> >>>>>
> >>>>>> I understand that Akka is written in Scala and that it will be much
> >>>>>> more natural to implement the actor based system using Scala.
> >>>>>> I see the following issues that we should consider:
> >>>>>> Until now, Flink is clearly a project implemented only in Java. The
> >>>>>> Scala API basically sits on top of the Java-based runtime.
We do not really > >>>>>> depend on Scala (we could easily remove the Scala API if we want > to). > >>>>>> Having code written in Scala in the main system will add a hard > >>>>>> > >>>>> dependency > >>>>> > >>>>>> on a scala version. > >>>>>> Being a pure Java project has some advantages: I think its a fact > that > >>>>>> there are more Java programmers than Scala programmers. So our > chances > >>>>>> > >>>>> of > >>>> > >>>>> attracting new contributors are higher when being a Java project. > >>>>>> On the other hand, we could maybe attract Scala developers to our > >>>>>> > >>>>> project. > >>>>> > >>>>>> But that has not happened (for contributors, not users!) so far for > our > >>>>>> Scala API, so I don't see any reason for that to happen. > >>>>>> > >>>>>> > >>>>>> This is definitely an issue to consider. We need to carefully > weight > >>>>> how > >>>>> important this issue is. If we want to break things, incubation is > the > >>>>> right time to do it. Below are some arguments in favor of breaking > >>>>> > >>>> things, > >>>> > >>>>> but do keep in mind that I am undecided, and I would really like to > see > >>>>> > >>>> the > >>>> > >>>>> community weighing in. > >>>>> > >>>>> First, I would dare say that the primary reason for someone to > >>>>> contribute > >>>>> to Flink so far has not been that the code is written in Java, but > more > >>>>> > >>>> the > >>>> > >>>>> content and nature of the project. Most contributors are Big Data > >>>>> enthusiasts in some way or another. > >>>>> > >>>>> Second, Scala projects have attracted contributors in the past. > >>>>> > >>>>> Third, it should not be too hard for someone that does not know > Scala to > >>>>> contribute to a different component if the interfaces are clear. > >>>>> > >>>>> > >>>>> Another issue is tooling: There are a lot of problems with Scala and > >>>>>> Eclipse: I've recently switched to Eclipse Luna. 
It seems to be > >>>>>> > >>>>> impossible > >>>>> > >>>>>> to compile Scala code with Luna because ScalaIDE does not properly > cope > >>>>>> with it. > >>>>>> Even with Eclipse versions that are supported by ScalaIDE, you have > to > >>>>>> manually install 3 plugins, some of them are not available in the > >>>>>> > >>>>> Eclipse > >>>> > >>>>> Marketplace. So with a JobManager written in Scala, users can not > just > >>>>>> import our project as a Maven project into Eclipse and start > >>>>>> > >>>>> developing. > >>>> > >>>>> The support for Maven is probably also limited. For example, I don't > >>>>>> > >>>>> know > >>>> > >>>>> if there is a checkstyle plugin for Scala. > >>>>>> > >>>>>> I'm looking forward to hearing other opinions on this issue. As I > said > >>>>>> > >>>>> in > >>>> > >>>>> the beginning, we should exchange arguments on this and think about > it > >>>>>> > >>>>> for > >>>>> > >>>>>> some time before we decide on this. > >>>>>> > >>>>>> Best, > >>>>> > >>>>>> Robert > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Thu, Aug 28, 2014 at 1:08 AM, Till Rohrmann < > [email protected]> > >>>>>> wrote: > >>>>>> > >>>>>> Hi guys, > >>>>>>> > >>>>>>> I currently working on replacing the old rpc infrastructure with an > >>>>>>> > >>>>>> akka > >>>>> > >>>>>> based actor system. In the wake of this change I will reimplement > the > >>>>>>> JobManager and TaskManager which will then be actors. Akka offers a > >>>>>>> > >>>>>> Java > >>>>> > >>>>>> API but the implementation turns out to be very verbose and > >>>>>>> > >>>>>> laborious, > >>>> > >>>>> because Java 6 and 7 do not support lambdas and pattern matching. > >>>>>>> > >>>>>> Using > >>>> > >>>>> Scala instead, would allow a far more succinct and clear > >>>>>>> > >>>>>> implementation > >>>> > >>>>> of > >>>>>> > >>>>>>> the JobManager and TaskManager. 
> >>>>>>> Instead of a lot of if statements using instanceof to figure out
> >>>>>>> the message type, we could simply use pattern matching.
> >>>>>>> Furthermore, the callback functions could simply be Scala's
> >>>>>>> anonymous functions. Therefore I would propose to use Scala for
> >>>>>>> these two systems.
> >>>>>>>
> >>>>>>> The Akka system uses the slf4j library as logging interface.
> >>>>>>> Therefore I would also propose to replace the jcl logging system
> >>>>>>> with the slf4j logging system. Since we want to use Akka in many
> >>>>>>> parts of the runtime system and it recommends using logback as
> >>>>>>> logging backend, I would also like to replace log4j with logback.
> >>>>>>> But this change should require only a few changes once we have
> >>>>>>> established the slf4j logging interface everywhere.
> >>>>>>>
> >>>>>>> What do you guys think of that idea?
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>>
> >>>>>>> Till
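
For anyone skimming the thread: the "if statements using instanceof" pattern from Till's original mail looks roughly like the sketch below. This is only an illustration in plain Java without Akka. The message classes `Heartbeat` and `SubmitJob` and the `handle` method are made-up names for this example and are not taken from the Flink or Akka code.

```java
// Illustration only: hypothetical message types, not part of Flink or Akka.
public class InstanceofDispatch {

    // A message a TaskManager might send to signal that it is alive (assumed name).
    static final class Heartbeat {
        final String taskManagerId;
        Heartbeat(String taskManagerId) { this.taskManagerId = taskManagerId; }
    }

    // A message asking the JobManager to run a job (assumed name).
    static final class SubmitJob {
        final String jobName;
        SubmitJob(String jobName) { this.jobName = jobName; }
    }

    // Every message type needs its own instanceof check plus an explicit cast.
    static String handle(Object message) {
        if (message instanceof Heartbeat) {
            Heartbeat hb = (Heartbeat) message;
            return "heartbeat from " + hb.taskManagerId;
        } else if (message instanceof SubmitJob) {
            SubmitJob job = (SubmitJob) message;
            return "submitting " + job.jobName;
        } else {
            return "unhandled: " + message;
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(new Heartbeat("tm-1")));
        System.out.println(handle(new SubmitJob("wordcount")));
    }
}
```

In Scala the body of `handle` would collapse into a single `match` expression with one `case` per message type, which is the succinctness argument made in the original mail.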
