Porting YARN to run atop Mesos is quite reasonable.  Some folks at eBay
have started some work on this (https://github.com/mesos/myriad).  If
you're interested, you should check it out, and contribute to the project.
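
A rough way to see the utilization argument Jie Yu makes further down the
thread (a static split between YARN and Mesos versus one allocator handing
out whatever is unused) is the toy calculation below. It is only a sketch:
the function names, the CPU counts, and the idea of modelling demand as a
single number are all assumptions made up for illustration, not Mesos, YARN,
or Myriad APIs.

```python
# Toy model only. Names and numbers are illustrative assumptions,
# not part of any Mesos, YARN, or Myriad API.

def static_partition_idle(total_cpus, yarn_cap, yarn_demand, other_demand):
    """CPUs left idle when YARN and everything else each get a fixed slice."""
    other_cap = total_cpus - yarn_cap
    # Each side can only use CPUs from its own slice, even if the other is idle.
    used = min(yarn_demand, yarn_cap) + min(other_demand, other_cap)
    return total_cpus - used


def shared_pool_idle(total_cpus, yarn_demand, other_demand):
    """CPUs left idle when one allocator gives unused CPUs to whoever asks."""
    used = min(total_cpus, yarn_demand + other_demand)
    return total_cpus - used


if __name__ == "__main__":
    # Hypothetical cluster: 100 CPUs, half reserved for YARN in the static case.
    print(static_partition_idle(100, 50, 20, 70))
    print(shared_pool_idle(100, 20, 70))
```

With these made-up numbers, YARN only needs 20 of its 50 reserved CPUs while
the other frameworks want 70 but are capped at 50, so the static split leaves
more of the cluster idle than the shared pool does.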

On Tue, Oct 28, 2014 at 5:21 AM, Yaneeve Shekel <[email protected]>
wrote:

>  To quote John below,
>
> “So excuse my naivety… but…”, I am also confused as to the version/naming 
> convention going on at the hadoop project.
>
> I would like to run hadoop over mesos as opposed to over yarn. I would also 
> like to use the *“new”* mapreduce packages.
>
> https://github.com/mesos/hadoop mentions that “The pom.xml included is 
> configured and tested against CDH5 and MRv1. Hadoop on Mesos does not 
> currently support YARN (and MRv2).”  Does this all mean that the mapreduce 
> package is not available? I think it does not; I think I should be able to 
> use the “new” API over any scheduling system, just as I could over plain 
> vanilla CDH (where I could configure and use any combination of the cross 
> product -> (mapred, mapreduce) X (MRv1, YARN)). Could anyone verify this?
>
> Second, has any work been done, following the original thread, on what John 
> suggested below?
>
>
>
> Thanks a lot,
>
> Yaneeve
>
>
>
> On Jul 27, 2014 7:00 PM, "John Omernik" <[email protected]> wrote:
>
>
>
> > So excuse my naivety in this space, but my ignorance has never really
>
> > stopped me from asking questions:
>
> >
>
> > I see YARN (Yet another resource negotiator) as very similar to Mesos.
>
> > I.e. something to manage resources on a cluster of machines. So when I hear
>
> > talk of running "YARN" on Mesos it seems very redundant indeed, and I ask
>
> > myself, what are we actually getting out of this setup?
>
> >
>
> > So, going to the Map/Reduce question, I see Map Reduce V1 and Map Reduce
>
> > V2 like this:  Map Reduce V2 is an application that runs on YARN. I.e. if
>
> > you run a job, it creates an application master, that application master
>
> > requests resources, and the job gets run.  It differs from Map Reduce V1 in that
>
> > there is no long running Job Tracker (other than the YARN Resource Manager,
>
> > but that is managing resources for all applications, not just Map Reduce
>
> > Applications).  Ok, so Mesos, why can't there be a Mesos Application that
>
> > is similar to a Map Reduce V2 Application in YARN?  Why do we need to run
>
> > YARN on Mesos? That doesn't really make sense.  Basically, for M/R V2 vs
>
> > M/R V1, the only difference is that to mimic M/R V1 we need task trackers and
>
> > job trackers running as Mesos applications (which we have).  So in M/R v2,
>
> > we just need the equivalent of an application master running on Yarn,
>
> > requesting resources across the cluster.
>
> >
>
> > Fundamentally, YARN is confusing because I think they coupled running Map
>
> > Reduce jobs with the resource manager and called it "Hadoop v2".  By
>
> > coupling the two, people look at YARN as Map Reduce V2, but it's not
>
> > really.  It's a way of running jobs on a cluster of machines (à la Mesos)
>
> > with an "application" that is the equivalent of Map Reduce V1.   The names
>
> > being given seem to be confusing to me, it makes people who have invested
>
> > in Hadoop (Map Reduce V1) be very interested in YARN because it's called
>
> > "Hadoop V2".  While Mesos is seen as the "Other"
>
> >
>
> >
>
> > Just for my sake I summarized a TL;DR form so if someone wants to correct
>
> > my understanding they can:
>
> >
>
> > Mesos = Tool to manage resources
>
> >
>
> > YARN = Tool to manage resources; it's also called Hadoopv2
>
> >
>
> > Map Reduce V1 = Job trackers/Task Trackers; it's what we know. It can run
>
> > on Hadoop clusters, and Mesos.  It's also called Hadoopv1
>
> >
>
> > Map Reduce V2 =  Application that can run on YARN that mimics Map Reduce
>
> > V1 on a YARN Cluster. This + YARN has been called Hadoopv2.
>
> >
>
> >
>
> > On Sun, Jul 27, 2014 at 4:10 AM, Maxime Brugidou <
>
> > [email protected]> wrote:
>
> >
>
> >> When I said that running yarn over mesos did not make sense I meant that
>
> >> running a resource manager in a resource manager was very sub-optimal. You
>
> >> will eventually do static allocation of resources for the Yarn framework in
>
> >> Mesos or have complex logic to determine how much resource should be given
>
> >> to yarn. You will also have the same burden of managing 2 different
>
> >> clusters instead of one, even if yarn is sort of hidden as mesos framework.
>
> >>
>
> >> However, yes, I believe it's easier to run yarn on mesos than to run mrv2 on
>
> >> top of mesos. The solution I was discussing was obviously "ideal" and I
>
> >> looked at the MRAppMaster since and it discouraged me :)
>
> >>  On Jul 27, 2014 12:41 AM, "Rick Richardson" <[email protected]>
>
> >> wrote:
>
> >>
>
> >>> FWIW I also think the fastest approach here is porting Yarn onto
>
> >>> Mesos.
>
> >>>
>
> >>> In a perfect world, writing an implementation layer for the Yarn
>
> >>> Interface on Mesos would certainly be the optimal approach, but looking at
>
> >>> the MRv2 code, it is very very coupled to many Yarn modules.
>
> >>>
>
> >>> If someone wanted to take on the project of making a generic resource
>
> >>> scheduler Interface for MRv2, that would be amazing :)
>
> >>> On Jul 26, 2014 6:19 PM, "Jie Yu" <[email protected]> wrote:
>
> >>>
>
> >>>> I am interested in investigating the idea of YARN on top of Mesos. One
>
> >>>> of the benefits I can think of is that we can get rid of the static
>
> >>>> resource allocation between YARN and Mesos clusters. In that way, Mesos
>
> >>>> can allocate those resources that are not used by YARN to other Mesos
>
> >>>> frameworks like Aurora, Marathon, etc., to increase the resource
>
> >>>> utilization of the entire data center. Also, we could avoid running each
>
> >>>> MRv2 job as a framework, which I think might cause some maintenance
>
> >>>> complexity (e.g. for framework rate limiting, etc.). Finally, YARN
>
> >>>> currently does not have good isolation support. It only supports CPU
>
> >>>> isolation right now (using cgroups). By porting YARN on top of Mesos, we
>
> >>>> might be able to leverage the existing Mesos containerizer strategy to
>
> >>>> provide better isolation between tasks. Maxime, I am curious why you
>
> >>>> think it does not make sense to run YARN over Mesos. Since I am not super
>
> >>>> familiar with YARN, I might be missing something.
>
> >>>>
>
> >>>> I have been thinking of making ResourceManager in YARN a Mesos
>
> >>>> framework and making NodeManager a Mesos executor. The NodeManager will
>
> >>>> launch containers using primitives provided by Mesos so that we have a
>
> >>>> consistent containerizer layer. I haven't fully figured out how this could
>
> >>>> be done yet (e.g., nested containers, communication between NodeManager
>
> >>>> and ResourceManager, etc.), but I would love to explore this direction. I
>
> >>>> would like to hear about any feedback/suggestions you guys have about this
>
> >>>> direction.
>
> >>>>
>
> >>>> Thanks,
>
> >>>> - Jie
>
> >>>>
>
> >>>>
>
> >>>> On Fri, Jul 25, 2014 at 1:39 PM, Maxime Brugidou <
>
> >>>> [email protected]> wrote:
>
> >>>>
>
> >>>>> We run both mesos and yarn in prod and it does not make sense to run
>
> >>>>> yarn over mesos.
>
> >>>>>
>
> >>>>> However it would be interesting to find a way to run MRv2 jobs on
>
> >>>>> mesos with some custom layer to swap yarn with mesos. Not sure how to
>
> >>>>> start though... MRv2 contains a yarn application master that needs to be
>
> >>>>> rewritten as a mesos framework scheduler. This is probably doable.
>
> >>>>> However with MRv2 every map reduce job would be mapped as a new framework
>
> >>>>> in Mesos. Not sure how many frameworks mesos can run and scale up to,
>
> >>>>> especially short lived frameworks.
>
> >>>>>  On Jul 25, 2014 8:54 PM, "Tom Arnfeld" <[email protected]> wrote:
>
> >>>>>
>
> >>>>>> Hey Luyi,
>
> >>>>>>
>
> >>>>>> That's correct, the Hadoop framework currently only supports Hadoop 2
>
> >>>>>> MRv1. It also doesn't have great support for the HA jobtracker available
>
> >>>>>> in newer versions of Hadoop, but I've been working on that the past few
>
> >>>>>> weeks.
>
> >>>>>>
>
> >>>>>> I'm not sure how Hadoop 2 would play with Mesos, but very interested
>
> >>>>>> to find out more. Am I correct in thinking MRv2 will only run on top of
>
> >>>>>> YARN?
>
> >>>>>>
>
> >>>>>> I wonder if anyone else on the mailing list is running YARN on top of
>
> >>>>>> Mesos...
>
> >>>>>>
>
> >>>>>> Tom.
>
> >>>>>>
>
> >>>>>> On Friday, 25 July 2014, Luyi Wang <[email protected]> wrote:
>
> >>>>>>
>
> >>>>>>> Checked the mesos github(https://github.com/mesos/hadoop). It
>
> >>>>>>> listed support for MapReduce V1
>
> >>>>>>>
>
> >>>>>>> How about the MR V2?
>
> >>>>>>>
>
> >>>>>>> Right now we are using Cloudera to manage Hadoop clusters, which use
>
> >>>>>>> MRv2. We are planning to migrate all our services to mesos (still in the
>
> >>>>>>> initial investigation stage).  Good suggestions, advice and experiences
>
> >>>>>>> are welcome.
>
> >>>>>>>
>
> >>>>>>> Thanks a lot!
>
> >>>>>>>
>
> >>>>>>>
>
> >>>>>>> -Luyi.
>
> >>>>>>>
>
> >>>>>>>
>
> >>>>>>>
>
> >>>>>>>
>
> >>>>
>
> >
>
>
>
