I second John's opinion that the different terminology around Hadoop v2 is
confusing.  That's the reason I asked whether Mesos supports MRv2.  As for
Maxime's concern, the decoupling part might be difficult.  After reading the
Mesos MRv1 implementation, I think the MRv2 migration could possibly be done
without touching anything related to the resource manager (YARN).
    I need more time to investigate this complication.
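
To make the decoupling idea concrete, here is a minimal sketch of the kind of seam discussed in this thread: a job driver that only talks to a generic resource interface, so the backend could in principle be YARN or Mesos. Every name below (ResourceNegotiator, FakeMesosNegotiator, run_job) is hypothetical and made up for illustration; nothing here calls the real Mesos or YARN APIs, and the "offers" are just an in-memory dict.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: int


class ResourceNegotiator(ABC):
    """Hypothetical seam between an MR job driver and the cluster manager."""

    @abstractmethod
    def request(self, count: int, per_task: Resources) -> List[str]:
        """Return identifiers for the task slots that were granted."""


class FakeMesosNegotiator(ResourceNegotiator):
    """Stand-in backend: grants slots from a fixed pool (no real Mesos calls)."""

    def __init__(self, offers: Dict[str, Resources]):
        self.offers = dict(offers)

    def request(self, count: int, per_task: Resources) -> List[str]:
        granted: List[str] = []
        for agent, free in list(self.offers.items()):
            # Carve as many task-sized slots out of this agent as still fit.
            while (len(granted) < count
                   and free.cpus >= per_task.cpus
                   and free.mem_mb >= per_task.mem_mb):
                free = Resources(free.cpus - per_task.cpus,
                                 free.mem_mb - per_task.mem_mb)
                granted.append(f"{agent}/task-{len(granted)}")
            self.offers[agent] = free
        return granted


def run_job(negotiator: ResourceNegotiator, num_maps: int,
            per_task: Resources) -> List[str]:
    """The job driver depends only on the interface, not on YARN or Mesos."""
    return negotiator.request(num_maps, per_task)
```

The hard part raised in this thread is not the interface itself but how many YARN modules the existing MRv2 code reaches into; the sketch only shows where such a seam would sit.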



-Luyi.





On Sun, Jul 27, 2014 at 10:40 AM, Maxime Brugidou <maxime.brugi...@gmail.com> wrote:

> John, I believe that you are 100% correct. Theoretically we should run
> MRv2 on Mesos but the current implementation of MRv2 on Yarn seems very
> complex and difficult to decouple from the resource manager/negotiator.
>
> It's still something that could be done I guess, but maybe as a completely
> independent Hadoop-compatible MapReduce framework for Mesos. You could
> write this from scratch with a custom framework inspired by the MRv2 app
> master implementation.
>  On Jul 27, 2014 7:00 PM, "John Omernik" <j...@omernik.com> wrote:
>
>> So excuse my naivety in this space, but my ignorance has never really
>> stopped me from asking questions:
>>
>> I see YARN (Yet another resource negotiator) as very similar to Mesos.
>> I.e. something to manage resources on a cluster of machines. So when I hear
>> talk of running "YARN" on Mesos it seems very redundant indeed, and I ask
>> myself, what are we actually getting out of this setup?
>>
>> So, going to the Map/Reduce question, I see Map Reduce V1 and
>> Map Reduce V2 like this:  Map Reduce V2 is an application that runs on
>> YARN. I.e. if you run a job, it creates an application master, that
>> application master requests resources, and the job gets run.  It differs
>> from Map Reduce V1 in that there is no long running Job Tracker (other than
>> the YARN Resource Manager, but that is managing resources for all
>> applications, not just Map Reduce applications).  Ok, so Mesos: why can't
>> there be a Mesos application that is similar to a Map Reduce V2 application
>> in YARN?  Why do we need to run YARN on Mesos? That doesn't really make
>> sense.  Basically, for M/R V2 vs M/R V1, the only difference is that to
>> mimic M/R V1 we need task trackers and job trackers running as Mesos
>> applications (which we have).  So in M/R V2, we just need the equivalent of
>> an application master running on YARN, requesting resources across the
>> cluster.
>>
>> Fundamentally, YARN is confusing because I think they coupled running Map
>> Reduce jobs with the resource manager and called it "Hadoop v2".  By
>> coupling the two, people look at YARN as Map Reduce V2, but it's not
>> really.  It's a way to run jobs on a cluster of machines (a la Mesos)
>> with an "application" that is the equivalent of Map Reduce V1.   The names
>> being given seem confusing to me; they make people who have invested
>> in Hadoop (Map Reduce V1) very interested in YARN because it's called
>> "Hadoop V2", while Mesos is seen as the "Other".
>>
>>
>> Just for my sake I summarized this in TL;DR form, so if someone wants to
>> correct my understanding they can:
>>
>> Mesos = Tool to manage resources
>>
>> YARN = Tool to manage resources; it's also called Hadoopv2
>>
>> Map Reduce V1 = Job Trackers/Task Trackers; it's what we know. It can run
>> on Hadoop clusters, and Mesos.  It's also called Hadoopv1
>>
>> Map Reduce V2 =  Application that can run on YARN that mimics Map Reduce
>> V1 on a YARN Cluster. This + YARN has been called Hadoopv2.
>>
>>
>> On Sun, Jul 27, 2014 at 4:10 AM, Maxime Brugidou <
>> maxime.brugi...@gmail.com> wrote:
>>
>>> When I said that running yarn over mesos did not make sense I meant that
>>> running a resource manager in a resource manager was very sub-optimal. You
>>> will eventually do static allocation of resources for the Yarn framework in
>>> Mesos or have complex logic to determine how much resource should be given
>>> to yarn. You will also have the same burden of managing 2 different
>>> clusters instead of one, even if yarn is sort of hidden as a mesos
>>> framework.
>>>
>>> However yes, I believe it's easier to run yarn on mesos than to run mrv2
>>> on top of mesos. The solution I was discussing was obviously "ideal" and I
>>> have looked at the MRAppMaster since and it discouraged me :)
>>>  On Jul 27, 2014 12:41 AM, "Rick Richardson" <rick.richard...@gmail.com>
>>> wrote:
>>>
>>>> FWIW I also think the fastest approach here is porting Yarn onto
>>>> Mesos.
>>>>
>>>> In a perfect world, writing an implementation layer for the Yarn
>>>> Interface on Mesos would certainly be the optimal approach, but looking at
>>>> the MRv2 code, it is very very coupled to many Yarn modules.
>>>>
>>>> If someone wanted to take on the project of making a generic resource
>>>> scheduler Interface for MRv2, that would be amazing :)
>>>> On Jul 26, 2014 6:19 PM, "Jie Yu" <yujie....@gmail.com> wrote:
>>>>
>>>>> I am interested in investigating the idea of YARN on top of Mesos. One
>>>>> of the benefits I can think of is that we can get rid of the static
>>>>> resource allocation between YARN and Mesos clusters. In that way, Mesos
>>>>> can allocate those resources that are not used by YARN to other Mesos
>>>>> frameworks like Aurora, Marathon, etc., to increase the resource
>>>>> utilization of the entire data center. Also, we could avoid running each
>>>>> MRv2 job as a framework, which I think might cause some maintenance
>>>>> complexity (e.g. for framework rate limiting, etc.). Finally, YARN
>>>>> currently does not have good isolation support. It only supports CPU
>>>>> isolation right now (using cgroups). By porting YARN on top of Mesos, we
>>>>> might be able to leverage the existing Mesos containerizer strategy to
>>>>> provide better isolation between tasks. Maxime, I am curious why you
>>>>> think it does not make sense to run YARN over Mesos? Since I am not super
>>>>> familiar with YARN, I might be missing something.
>>>>>
>>>>> I have been thinking of making ResourceManager in YARN a Mesos
>>>>> framework and making NodeManager a Mesos executor. The NodeManager will
>>>>> launch containers using primitives provided by Mesos so that we have a
>>>>> consistent containerizer layer. I haven't fully figured out how this could
>>>>> be done yet (e.g., nested containers, communication between NodeManager
>>>>> and ResourceManager, etc.), but I would love to explore this direction. I
>>>>> would like to hear about any feedback/suggestions you guys have about this
>>>>> direction.
>>>>>
>>>>> Thanks,
>>>>> - Jie
>>>>>
>>>>>
>>>>> On Fri, Jul 25, 2014 at 1:39 PM, Maxime Brugidou <
>>>>> maxime.brugi...@gmail.com> wrote:
>>>>>
>>>>>> We run both mesos and yarn in prod and it does not make sense to run
>>>>>> yarn over mesos.
>>>>>>
>>>>>> However it would be interesting to find a way to run MRv2 jobs on
>>>>>> mesos with some custom layer to swap yarn with mesos. Not sure how to
>>>>>> start though... MRv2 contains a yarn application master that needs to be
>>>>>> rewritten as a mesos framework scheduler. This is probably doable.
>>>>>> However with MRv2 every map reduce job would be mapped as a new framework
>>>>>> in Mesos. Not sure how many frameworks mesos can run and scale up to.
>>>>>> Especially short lived frameworks.
>>>>>>  On Jul 25, 2014 8:54 PM, "Tom Arnfeld" <t...@duedil.com> wrote:
>>>>>>
>>>>>>> Hey Luyi,
>>>>>>>
>>>>>>> That's correct, the Hadoop framework currently only supports Hadoop
>>>>>>> 2 MRv1. It also doesn't have great support for the HA jobtracker
>>>>>>> available in newer versions of Hadoop, but I've been working on that the
>>>>>>> past few weeks.
>>>>>>>
>>>>>>> I'm not sure how Hadoop 2 would play with Mesos, but very interested
>>>>>>> to find out more. Am I correct in thinking MRv2 will only run on top of
>>>>>>> YARN?
>>>>>>>
>>>>>>> I wonder if anyone else on the mailing list is running YARN on top
>>>>>>> of Mesos...
>>>>>>>
>>>>>>> Tom.
>>>>>>>
>>>>>>> On Friday, 25 July 2014, Luyi Wang <wangluyi1...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Checked the mesos github (https://github.com/mesos/hadoop). It
>>>>>>>> lists support for MapReduce V1.
>>>>>>>>
>>>>>>>> How about the MR V2?
>>>>>>>>
>>>>>>>> Right now we are using cloudera to manage hadoop clusters, which
>>>>>>>> use MRV2. We are planning to migrate all our services to mesos (still
>>>>>>>> in the initial investigation stage).  Good suggestions, advice and
>>>>>>>> experiences are welcome.
>>>>>>>>
>>>>>>>> Thanks a lot!
>>>>>>>>
>>>>>>>>
>>>>>>>> -Luyi.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
>>
