In case folks are not on the other lists. I saw this and figured there may be further interest.
Cheers,
Tim

----- Forwarded Message -----
> From: "Vinod Kumar Vavilapalli" <[email protected]>
> To: [email protected]
> Cc: "mapreduce-dev" <[email protected]>
> Sent: Wednesday, July 31, 2013 11:45:31 AM
> Subject: Re: Abstraction layer to support both YARN and Mesos
>
>
> What I thought was the original proposal was to use the existing MR
> client+AM+task code to run on top of Mesos. And like Steve mentioned, today
> all of it is very tightly couple with YARN APIs. Using JobClient against a
> Mesos implementation of MapReduce is easy, changing AM to start getting
> containers from Mesos and launching via Mesos needs more abstractions. And
> at this point of time, again as Steve laid it out clearly, the focus of
> MapReduce project is on stabilizing and shipping together with YARN.
>
> That said, working on thinking about those abstractions inside MR AM is a
> step forward IF there is enough interest around this. I see a couple of
> people already showing enthusiasm, but it'll be great to see more interest.
> May be a few from Mesos community who understand what those abstractions
> should look like.
>
> The last thing we want is create unnecessary abstractions now that may never
> get used in the future.
>
> Thanks,
> +Vinod
>
> On Jul 31, 2013, at 9:34 AM, Bikas Saha wrote:
>
> > +1 for Tom's suggestion. That is how we have transparently redirected MR
> > jobs to use Tez as the execution framework.
> >
> > Bikas
> >
> > -----Original Message-----
> > From: Tom White [mailto:[email protected]]
> > Sent: Wednesday, July 31, 2013 8:41 AM
> > To: mapreduce-dev
> > Cc: [email protected]
> > Subject: Re: Abstraction layer to support both YARN and Mesos
> >
> > I can see value in this, since it would allow MR programs and libraries to
> > run on either YARN or Mesos with no recompilation. The value here is
> > really in the libraries since it means library maintainers don't have to
> > maintain two versions of their library.
> >
> > Note that there is no extra level of indirection required - it's already
> > there in org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider -
> > which is used to switch between submitting jobs to the JobTracker and
> > submitting to YARN's RM. A MesosClientProtocolProvider might be hosted in
> > Mesos - perhaps Mesos developers are already working on this?
> >
> > Cheers,
> > Tom
> >
> > On Wed, Jul 31, 2013 at 4:30 PM, Steve Loughran <[email protected]>
> > wrote:
> >> On 26 July 2013 07:13, Tsuyoshi OZAWA <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> Now, Apache Mesos, an distributed resource manager, is top-level
> >>> apache project. Meanwhile, As you know, Hadoop has own resource
> >>> manager - YARN. IMHO, we should make resource manager pluggable in
> >>> MRv2, because there are their own field users of MapReduce would like
> >>> to use. I think this work is useful for MapReduce users. On the other
> >>> hand, this work can also be large, because MRv2's code base is
> >>> tightly coupled with YARN currently. Thoughts?
> >>>
> >>
> >> MRv2 is too intimately involved with Hadoop for it to easily be moved,
> >> have a look at the mapreduce package code base to see this. We are
> >> also developing and currently releasing them in sync.
> >>
> >> Yes, an extra layer of indirection may appear to get MR to work on
> >> Mesos -but things like locality, ongoing dev YARN APIs &c and the
> >> release schedule would push for MRv2 to focus on YARN: data aware job
> >> (and service) scheduling in Hadoop clusters.
> >>
> >> As an example of how those layers of indirection cause problems, look
> >> at commons-logging. Ubiquitous as the API in front of Log4J, when
> >> using raw Log4J would have been better (look in the hadoop tests code
> >> where the underlying logger is explicitly extracted and tuned for
> > examples).
> >
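
For anyone not familiar with the provider switch Tom mentions, here is a rough sketch of the idea: the client asks each registered provider in turn, and the first one that recognises the configured framework supplies the submission client. This is not Hadoop code. SubmissionClient, SubmissionClientProvider, forFramework and select are hypothetical stand-ins for ClientProtocol and ClientProtocolProvider, and "mesos" as a value for mapreduce.framework.name is an assumption for illustration only, since no such provider is confirmed in this thread. The sketch uses Optional for brevity and hard-codes the provider list.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ProviderSketch {
    // Real Hadoop configuration key used to pick the execution framework.
    public static final String FRAMEWORK_KEY = "mapreduce.framework.name";

    /** Hypothetical stand-in for Hadoop's ClientProtocol. */
    public interface SubmissionClient {
        String submit(String jobName);
    }

    /** Hypothetical stand-in for ClientProtocolProvider. */
    public interface SubmissionClientProvider {
        // Returns empty when the configuration names a different framework.
        Optional<SubmissionClient> create(Map<String, String> conf);
    }

    /** Builds a provider that only answers for one framework name. */
    public static SubmissionClientProvider forFramework(String name) {
        return conf -> {
            String wanted = conf.getOrDefault(FRAMEWORK_KEY, "local");
            if (!name.equals(wanted)) {
                return Optional.empty();
            }
            SubmissionClient client = job -> job + " submitted via " + name;
            return Optional.of(client);
        };
    }

    // Hard-coded here to keep the sketch self-contained.
    // The "mesos" entry is an assumption, not an existing provider.
    public static final List<SubmissionClientProvider> PROVIDERS = List.of(
            forFramework("local"), forFramework("yarn"), forFramework("mesos"));

    /** Asks each provider in turn; the first one that answers wins. */
    public static SubmissionClient select(Map<String, String> conf) {
        for (SubmissionClientProvider provider : PROVIDERS) {
            Optional<SubmissionClient> client = provider.create(conf);
            if (client.isPresent()) {
                return client.get();
            }
        }
        throw new IllegalStateException("No provider for " + conf.get(FRAMEWORK_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(FRAMEWORK_KEY, "mesos");
        // prints "wordcount submitted via mesos"
        System.out.println(select(conf).submit("wordcount"));
    }
}
```

The point of the pattern is that job code only ever calls the equivalent of select(conf).submit(...), so switching backends would be a configuration change rather than a recompile, which is the library-maintainer benefit Tom describes.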
