It should be a separate thread with [VOTE] in the subject line, a clear description of what we are voting on, and the duration of the vote (typically 72 hours).

On Wed, Sep 3, 2014 at 10:14 PM, Till Rohrmann <[email protected]> wrote:
> How do we then start the vote on whether we should implement the
> JobManager with Scala or not? Can we just do it in this thread or
> should it happen in a separate thread?
>
>
> On Wed, Sep 3, 2014 at 6:27 PM, Henry Saputra <[email protected]>
> wrote:
>
> > Thanks @Ufuk for the response.
> >
> > Yeah, Akka hides all the low level nuts and bolts about the RPC flow
> > but then it also makes it a bit harder to debug issues when
> > communication fails.
> > It makes sense to use one RPC framework if we could, and since there
> > are other plans for Akka in the code to help manage concurrent
> > programming, it is a good idea to use Akka for RPC.
> >
> > - Henry
> >
> >
> > On Wed, Sep 3, 2014 at 5:06 AM, Ufuk Celebi <[email protected]> wrote:
> > > Hey Till,
> > >
> > > I'm not sure what the "right" ASF process is, but I wouldn't mind a
> > > vote on this in order to make sure that you don't do unnecessary
> > > work by replacing the code with Scala.
> > >
> > > I for one would certainly be open to it. The only thing that
> > > bothers me is the current state of out-of-the-box IDE support. But
> > > since there are other successful Scala projects around ;-), which
> > > manage to do it, why shouldn't we?
> > >
> > > @Henry, regarding Akka: I think the main motivation for moving to
> > > Akka (besides the points raised by Stephan and others) is that we
> > > actually don't want to bother with low-level thread management,
> > > protocols, etc.
> > >
> > >
> > >
> > > On Tue, Sep 2, 2014 at 8:32 PM, Henry Saputra <[email protected]>
> > > wrote:
> > >
> > >> Hi Till,
> > >>
> > >> Thanks for opening the discussion and leading the effort, and
> > >> apologies for the late response.
> > >>
> > >> From what I have gathered so far, there are 2 issues:
> > >> 1. Introducing Akka as RPC
> > >> 2. Moving to Scala to enable easy access to Akka Scala APIs.
> > >>
> > >> For no. 1, if the RPC is used for lower level communications then
> > >> we could probably consider Netty as the transport and
> > >> serialization protocol (I have also added a comment to the JIRA).
> > >> Internally, to reduce thread management we could use Akka via a
> > >> Scala bridge service to make sure we use the Scala Akka APIs.
> > >>
> > >> So addressing no. 2, we could mix both Scala and Java in JobManager
> > >> and TaskManager. The code that handles async RPC communication
> > >> between JM and TM would use Java via Netty, and for internal
> > >> multi-threaded or higher level plane code such as heartbeats we
> > >> could use Akka.
> > >>
> > >> It does introduce a bit of a mix between Java and Scala code, but
> > >> we already have a mix of Scala and Java to support the APIs, so I
> > >> think we could move some of the internal code to Scala too as
> > >> "learning" steps to utilize Scala for better
> > >> concurrent/functional programming.
> > >>
> > >> - Henry
> > >>
> > >>
> > >>
> > >> On Sun, Aug 31, 2014 at 4:31 AM, Till Rohrmann <[email protected]>
> > >> wrote:
> > >> > Hi Daniel,
> > >> >
> > >> > the RPC rework is discussed in
> > >> > https://issues.apache.org/jira/browse/FLINK-1019. Jira is
> > >> > currently down due to maintenance reasons.
> > >> >
> > >> > The reasons for using Akka are the following. Akka allows us to
> > >> > reduce the code base which has to be maintained. In particular,
> > >> > we get rid of all the multi-threaded programming of the RPC
> > >> > service, which is always hard to work with. With Akka we would
> > >> > get the heartbeat signal for free, because Akka can detect dead
> > >> > actors. Akka uses supervision to handle fault tolerance as well
> > >> > as recovery, and it allows an easy forwarding of remote
> > >> > exceptions. At the same time it offers a nice RPC abstraction
> > >> > which makes it easy to implement asynchronous services.
> > >> > Furthermore, it scales rather well to large numbers of nodes and
> > >> > hopefully we get the latencies of Flink a little bit down.
> > >> >
> > >> > Bests,
> > >> >
> > >> > Till
> > >> >
> > >> >
> > >> > On Sun, Aug 31, 2014 at 11:35 AM, Daniel Warneke <[email protected]>
> > >> > wrote:
> > >> >
> > >> >> Hi,
> > >> >>
> > >> >> will Akka just be used for RPC or are there any plans to expand
> > >> >> the actor-based model to further parts of the runtime system? If
> > >> >> so, could you please point me to the discussion thread?
> > >> >>
> > >> >> Spontaneously, I would say that adding a hard dependency on
> > >> >> Scala just for the sake of having a hip RPC service sounds like
> > >> >> a pretty dodgy deal. Therefore, I would like to understand how
> > >> >> much value Akka could bring to Flink in the long run. The
> > >> >> discussion whether to reimplement core components of the system
> > >> >> in Scala should be the second step in my opinion.
> > >> >>
> > >> >> Bests,
> > >> >>
> > >> >> Daniel
> > >> >>
> > >> >>
> > >> >> On 29.08.2014 11:33, Asterios Katsifodimos wrote:
> > >> >>
> > >> >>> I agree that using Akka's actors from Java results in very ugly
> > >> >>> code. Hiding the internals of Akka behind Java reflection looks
> > >> >>> better but breaks the principles of actors. For me it is kind
> > >> >>> of a deal breaker for using Akka from Java. I think that Till
> > >> >>> has more reasons to believe that Scala would be more
> > >> >>> appropriate for building a new Job/Task Manager.
> > >> >>>
> > >> >>> I think that this discussion should focus on 4 main aspects:
> > >> >>> 1. Performance
> > >> >>> 2. Implementability
> > >> >>> 3. Maintainability
> > >> >>> 4. Available Tools
> > >> >>>
> > >> >>> 1. Performance: Since the job of the JobManager and the
> > >> >>> TaskManager is to 1) exchange messages in order to maintain a
> > >> >>> distributed state machine, 2) set up connections between task
> > >> >>> managers, and 3) detect failures etc., performance should not
> > >> >>> be an issue in these basic operations. Akka was proven to
> > >> >>> scale quite well with very low latency. I guess that the low
> > >> >>> level "plumbing" (serialization, connections, etc.) will
> > >> >>> continue in Java in order to guarantee high performance. I
> > >> >>> have no clue what's happening with memory management, whether
> > >> >>> this will be implemented in Java or Scala, and what the
> > >> >>> respective consequences are. Please comment.
> > >> >>>
> > >> >>> 2. Since the Job/Task Manager is going to be essentially
> > >> >>> implemented from scratch, given the power of Akka, it seems to
> > >> >>> me that the implementation will be easier, shorter and less
> > >> >>> verbose in Scala, given that Till is comfortable enough with
> > >> >>> Scala.
> > >> >>>
> > >> >>> 3. Given #2, maintaining the code and trying out new ideas in
> > >> >>> Scala would take less time and effort. But maintaining low
> > >> >>> level plumbing in Java and high level logic in Scala scares
> > >> >>> me. Could anyone who has done this before comment on this?
> > >> >>>
> > >> >>> 4. Tools: Robert has raised some issues already but I think
> > >> >>> that tools will get better with time.
> > >> >>>
> > >> >>> Given the above, I would focus on #3 to be honest. Apart from
> > >> >>> this, going the Scala way sounds like a great idea. I really
> > >> >>> second Kostas' opinion that if large changes are going to
> > >> >>> happen, this is the best moment.
> > >> >>>
> > >> >>> Cheers,
> > >> >>> Asterios
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>> On Fri, Aug 29, 2014 at 1:02 AM, Till Rohrmann <[email protected]>
> > >> >>> wrote:
> > >> >>>
> > >> >>>> I also agree with Robert and Kostas that it has to be a
> > >> >>>> community decision. I understand the problems with Eclipse
> > >> >>>> and the Scala IDE which is a pain in the ass. But eventually
> > >> >>>> these things will be fixed. Maybe we could also talk to the
> > >> >>>> typesafe guy and tell him that this problem bothers us a lot.
> > >> >>>>
> > >> >>>> I also believe that the project is not about a specific
> > >> >>>> programming language but a problem we want to tackle with
> > >> >>>> Flink. From time to time it might be necessary to adapt the
> > >> >>>> tools in order to reach the goal. In fact, I don't believe
> > >> >>>> that Scala parts would drive people away from the project.
> > >> >>>> Instead, Scala enthusiasts would be motivated to join us.
> > >> >>>>
> > >> >>>> Actually I stumbled across a quote of Leibniz which puts my
> > >> >>>> point of view quite accurately in a nutshell:
> > >> >>>>
> > >> >>>> In symbols one observes an advantage in discovery which is
> > >> >>>> greatest when they express the exact nature of a thing
> > >> >>>> briefly and, as it were, picture it; then indeed the labor of
> > >> >>>> thought is wonderfully diminished -- Gottfried Wilhelm Leibniz
> > >> >>>>
> > >> >>>>
> > >> >>>> On Thu, Aug 28, 2014 at 12:57 PM, Kostas Tzoumas <[email protected]>
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> On Thu, Aug 28, 2014 at 11:49 AM, Robert Metzger <[email protected]>
> > >> >>>>> wrote:
> > >> >>>>>
> > >> >>>>>> Changing the programming language of a very important
> > >> >>>>>> system component is something we should carefully discuss.
> > >> >>>>>>
> > >> >>>>> Definitely agree, I think the community should discuss this
> > >> >>>>> very carefully.
> > >> >>>>>
> > >> >>>>>> I understand that Akka is written in Scala and that it will
> > >> >>>>>> be much more natural to implement the actor based system
> > >> >>>>>> using Scala.
> > >> >>>>>> I see the following issues that we should consider:
> > >> >>>>>> Until now, Flink is clearly a project implemented only in
> > >> >>>>>> Java. The Scala API basically sits on top of the Java-based
> > >> >>>>>> runtime. We do not really depend on Scala (we could easily
> > >> >>>>>> remove the Scala API if we want to).
> > >> >>>>>> Having code written in Scala in the main system will add a
> > >> >>>>>> hard dependency on a Scala version.
> > >> >>>>>> Being a pure Java project has some advantages: I think it's
> > >> >>>>>> a fact that there are more Java programmers than Scala
> > >> >>>>>> programmers. So our chances of attracting new contributors
> > >> >>>>>> are higher when being a Java project.
> > >> >>>>>> On the other hand, we could maybe attract Scala developers
> > >> >>>>>> to our project. But that has not happened (for
> > >> >>>>>> contributors, not users!) so far for our Scala API, so I
> > >> >>>>>> don't see any reason for that to happen.
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>> This is definitely an issue to consider. We need to
> > >> >>>>> carefully weigh how important this issue is. If we want to
> > >> >>>>> break things, incubation is the right time to do it. Below
> > >> >>>>> are some arguments in favor of breaking things, but do keep
> > >> >>>>> in mind that I am undecided, and I would really like to see
> > >> >>>>> the community weighing in.
> > >> >>>>>
> > >> >>>>> First, I would dare say that the primary reason for someone
> > >> >>>>> to contribute to Flink so far has not been that the code is
> > >> >>>>> written in Java, but more the content and nature of the
> > >> >>>>> project. Most contributors are Big Data enthusiasts in some
> > >> >>>>> way or another.
> > >> >>>>>
> > >> >>>>> Second, Scala projects have attracted contributors in the
> > >> >>>>> past.
> > >> >>>>>
> > >> >>>>> Third, it should not be too hard for someone that does not
> > >> >>>>> know Scala to contribute to a different component if the
> > >> >>>>> interfaces are clear.
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>> Another issue is tooling: There are a lot of problems with
> > >> >>>>>> Scala and Eclipse: I've recently switched to Eclipse Luna.
> > >> >>>>>> It seems to be impossible to compile Scala code with Luna
> > >> >>>>>> because ScalaIDE does not properly cope with it.
> > >> >>>>>> Even with Eclipse versions that are supported by ScalaIDE,
> > >> >>>>>> you have to manually install 3 plugins, some of which are
> > >> >>>>>> not available in the Eclipse Marketplace. So with a
> > >> >>>>>> JobManager written in Scala, users cannot just import our
> > >> >>>>>> project as a Maven project into Eclipse and start
> > >> >>>>>> developing.
> > >> >>>>>> The support for Maven is probably also limited. For
> > >> >>>>>> example, I don't know if there is a checkstyle plugin for
> > >> >>>>>> Scala.
> > >> >>>>>>
> > >> >>>>>> I'm looking forward to hearing other opinions on this
> > >> >>>>>> issue. As I said in the beginning, we should exchange
> > >> >>>>>> arguments on this and think about it for some time before
> > >> >>>>>> we decide on this.
> > >> >>>>>>
> > >> >>>>>> Best,
> > >> >>>>>> Robert
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>> On Thu, Aug 28, 2014 at 1:08 AM, Till Rohrmann <[email protected]>
> > >> >>>>>> wrote:
> > >> >>>>>>
> > >> >>>>>>> Hi guys,
> > >> >>>>>>>
> > >> >>>>>>> I am currently working on replacing the old RPC
> > >> >>>>>>> infrastructure with an Akka based actor system. In the
> > >> >>>>>>> wake of this change I will reimplement the JobManager and
> > >> >>>>>>> TaskManager which will then be actors. Akka offers a Java
> > >> >>>>>>> API but the implementation turns out to be very verbose
> > >> >>>>>>> and laborious, because Java 6 and 7 do not support
> > >> >>>>>>> lambdas and pattern matching.
> > >> >>>>>>> Using Scala instead would allow a far more succinct and
> > >> >>>>>>> clear implementation of the JobManager and TaskManager.
> > >> >>>>>>> Instead of a lot of if statements using instanceof to
> > >> >>>>>>> figure out the message type, we could simply use pattern
> > >> >>>>>>> matching. Furthermore, the callback functions could
> > >> >>>>>>> simply be Scala's anonymous functions. Therefore I would
> > >> >>>>>>> propose to use Scala for these two systems.
> > >> >>>>>>>
> > >> >>>>>>> The Akka system uses the slf4j library as logging
> > >> >>>>>>> interface. Therefore I would also propose to replace the
> > >> >>>>>>> jcl logging system with the slf4j logging system. Since
> > >> >>>>>>> we want to use Akka in many parts of the runtime system
> > >> >>>>>>> and it recommends using logback as logging backend, I
> > >> >>>>>>> would also like to replace log4j with logback. But this
> > >> >>>>>>> should require only a few changes once we have
> > >> >>>>>>> established the slf4j logging interface everywhere.
> > >> >>>>>>>
> > >> >>>>>>> What do you guys think of that idea?
> > >> >>>>>>>
> > >> >>>>>>> Best regards,
> > >> >>>>>>>
> > >> >>>>>>> Till
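
To make the instanceof-versus-pattern-matching point from the original proposal concrete, here is a minimal Scala sketch. It is only an illustration under stated assumptions: the message types (SubmitJob, Heartbeat, Shutdown) and the Dispatch object are hypothetical names invented for this example, not taken from the Flink code base or from Akka's API, and no actor system is involved.

```scala
// Hypothetical message types, invented for illustration only.
sealed trait Message
final case class SubmitJob(jobName: String) extends Message
final case class Heartbeat(taskManagerId: Int) extends Message
case object Shutdown extends Message

object Dispatch {
  // Java 6/7 style: a chain of runtime type checks followed by casts.
  def handleWithTypeChecks(msg: Any): String =
    if (msg.isInstanceOf[SubmitJob]) {
      "submit " + msg.asInstanceOf[SubmitJob].jobName
    } else if (msg.isInstanceOf[Heartbeat]) {
      "heartbeat from " + msg.asInstanceOf[Heartbeat].taskManagerId
    } else if (msg == Shutdown) {
      "shutdown"
    } else {
      "unknown"
    }

  // Pattern matching: the type test and the field extraction happen in one step.
  def handle(msg: Any): String = msg match {
    case SubmitJob(name) => s"submit $name"
    case Heartbeat(id)   => s"heartbeat from $id"
    case Shutdown        => "shutdown"
    case _               => "unknown"
  }
}
```

Both methods return the same strings for the same input, for example Dispatch.handle(Heartbeat(7)) yields "heartbeat from 7". The difference is only in how much code is needed to get from "some message" to "its fields", which is the verbosity argument made in the proposal.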
