Here's another silly question.

Does Mesos plan to add support for HNG, or will only pure Map/Reduce be supported?

On Fri, Jul 1, 2011 at 2:15 PM, Ted Dunning <[email protected]> wrote:
> Also, both projects are changing in terms of what they do and what they
> intend to do.
>
> For instance, support for long running processes and alternative execution
> models other than map-reduce is an explicit goal for Yarn.
>
> This illustrates how hard it is for anybody to compare systems.  Typically,
> any given person knows much more about one system than the other leading to
> many comparison points that are only half true (that half being the one with
> better information).  This isn't remediable without collaborative discussion
> between (differently) informed speakers.
>
>
> On Thu, Jun 30, 2011 at 10:10 PM, Edward J. Yoon <[email protected]> wrote:
>
>> Understood.
>>
>> On Fri, Jul 1, 2011 at 1:59 PM, Matei Zaharia <[email protected]>
>> wrote:
>> > I wouldn't say it's designed for Yahoo! only, but it's definitely meant
>> > to solve issues they saw with large Hadoop clusters (and provides a lot
>> > of value for that).
>> >
>> > Matei
>> >
>> > On Jul 1, 2011, at 12:51 AM, Edward J. Yoon wrote:
>> >
>> >> Hmm, HNG seems designed for their (Y!) own circumstance.
>> >>
>> >> On Fri, Jul 1, 2011 at 12:47 PM, Matei Zaharia <[email protected]> wrote:
>> >>> Ted brought up some superficial differences, but if you want to
>> >>> understand technical differences, there are a bunch of those as well.
>> >>> Mesos and Hadoop next-gen have similar goals (more efficient resource
>> >>> sharing for data centers), but they are coming at it from different
>> >>> angles -- HNG is currently mainly focusing on MapReduce and aims to
>> >>> support other types of applications too, while Mesos was meant to
>> >>> support a very diverse set of applications, including long-running
>> >>> services and batch jobs (rather than only multiple instances of
>> >>> MapReduce), and is in fact being used for that already. More
>> >>> importantly, HNG is really two pieces -- a refactoring of MapReduce to
>> >>> allow one instance of MR per application, and a resource manager called
>> >>> YARN that lets these instances coordinate. We are going to support
>> >>> having the new MR2 application masters run on top of Mesos instead of
>> >>> YARN too (and indeed the refactoring is nice because it will enable
>> >>> Hadoop MapReduce to run on other cluster scheduling systems in the
>> >>> future).
>> >>>
>> >>> In terms of the technical differences, here are some of the main ones
>> >>> currently:
>> >>>
>> >>> - Mesos is implemented in C++ rather than Java, and has APIs in C++ and
>> >>> Python in addition to Java.
>> >>>
>> >>> - The resource allocation models are different: HNG has a central
>> >>> scheduler that supports data locality constraints, while Mesos provides
>> >>> "resource offers" that let applications pick the resources they like
>> >>> according to other criteria, in addition to requests/filters that
>> >>> describe which resources they want to be offered. Our belief is that
>> >>> resource offers will allow Mesos to support a wider range of application
>> >>> scheduling needs, while simultaneously making the system more scalable
>> >>> and highly available (minimizing the state and work required of the
>> >>> master).
>> >>>
>> >>> - Mesos can enforce resource isolation through Linux Containers to
>> >>> guard against misbehaving / greedy tasks.
>> >>>
>> >>> - HNG supports Kerberos authentication for users.
>> >>>
>> >>> - HNG can run the MR2 version of Hadoop, while Mesos can run Hadoop
>> >>> 0.20, Spark and MPI.
>> >>>
>> >>> - There are some smaller architectural differences that may matter for
>> >>> some applications, such as communication being based on message-passing
>> >>> in Mesos vs periodic heartbeats in HNG, which allows Mesos to provide
>> >>> lower scheduling latencies (e.g. to still be efficient if your tasks
>> >>> take 100ms each).
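
To make the "resource offers" bullet above a bit more concrete, here is a
minimal Python sketch of the idea, assuming a much simplified world: the
master only offers each node's free resources, and every application decides
for itself how much to take. All names (Offer, Framework, consider,
offer_round) are hypothetical and made up for illustration. This is not the
Mesos API, and the "preferred nodes" check merely stands in for whatever
criteria an application might apply.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one node, as offered by the master (hypothetical)."""
    node: str
    cpus: int
    mem_mb: int


@dataclass
class Framework:
    """Application-side scheduler (hypothetical, not the real Mesos API)."""
    name: str
    preferred_nodes: set
    cpus_per_task: int
    mem_per_task_mb: int
    pending_tasks: int
    launched: list = field(default_factory=list)

    def consider(self, offer):
        """Return how many tasks to launch on this offer (0 means decline)."""
        if self.pending_tasks == 0 or offer.node not in self.preferred_nodes:
            return 0
        fit = min(offer.cpus // self.cpus_per_task,
                  offer.mem_mb // self.mem_per_task_mb,
                  self.pending_tasks)
        self.pending_tasks -= fit
        self.launched.extend([offer.node] * fit)
        return fit


def offer_round(free, frameworks):
    """Master side: offer each node's free resources to each framework in turn.

    The master keeps no per-application placement logic; it only tracks
    what is still free after each framework has accepted or declined.
    """
    for node, (cpus, mem_mb) in list(free.items()):
        for fw in frameworks:
            taken = fw.consider(Offer(node, cpus, mem_mb))
            cpus -= taken * fw.cpus_per_task
            mem_mb -= taken * fw.mem_per_task_mb
        free[node] = (cpus, mem_mb)
    return free


# Example: a locality-sensitive batch job and a long-running service.
free = {"node1": (4, 8192), "node2": (4, 8192)}
mr = Framework("mapreduce", {"node1"}, 1, 2048, pending_tasks=3)
svc = Framework("service", {"node1", "node2"}, 2, 1024, pending_tasks=2)
offer_round(free, [mr, svc])
```

In this toy run the batch job takes what it wants on node1 and ignores
node2, while the service declines the leftover on node1 (not enough CPUs)
and lands on node2. The point of the sketch is only that the placement
decision lives in the application, not in the master.
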
>> >>>
>> >>> However, overall, as Ted said, many of these differences will likely go
>> >>> away as both projects add features. What will be interesting is whether
>> >>> some fundamental differences in the target workloads remain, which I
>> >>> think is likely to happen. For example, the main deployment of Mesos is
>> >>> currently to run long-running stream processing services at Twitter,
>> >>> which is something that typical Hadoop environments just don't do and
>> >>> that requires different things from the cluster scheduler. I also
>> >>> believe we're going to see a lot of other cluster scheduling systems
>> >>> besides Mesos and HNG in the future, as people's requirements for these
>> >>> systems grow. There are some very challenging problems in designing a
>> >>> general cluster scheduling system that even the Google folks are still
>> >>> working hard on.
>> >>>
>> >>> Matei
>> >>>
>> >>>
>> >>>
>> >>> On Jun 30, 2011, at 6:26 PM, Edward J. Yoon wrote:
>> >>>
>> >>>> Thanks for your nice and quick explanation!
>> >>>>
>> >>>> On Fri, Jul 1, 2011 at 10:21 AM, Ted Dunning <[email protected]> wrote:
>> >>>>> Technically speaking, Mesos has a less expressive model for expressing
>> >>>>> resource requirements.  The thesis of Mesos is that the negotiation
>> >>>>> between application and scheduler can make up for this missing
>> >>>>> information.  Mesos was also first to "market", but Hadoop nextGen is
>> >>>>> catching up fast.  The MR-279 has code that works, albeit with some
>> >>>>> issues in production use.  From all reports, these issues are being
>> >>>>> resolved quickly as Yahoo's considerable QA resources come to bear.
>> >>>>>
>> >>>>> Politically speaking, Mesos has a nearly inactive mailing list which,
>> >>>>> to outward appearances, indicates a nearly inactive project.  There is
>> >>>>> some evidence that considerable activity is occurring off-list, but
>> >>>>> this is a process bug in the Apache model since "if it doesn't happen
>> >>>>> on the list, it doesn't happen".
>> >>>>>
>> >>>>> On the other side, Hadoop nextGen has the Hadoop community pretty much
>> >>>>> behind it.  Since HNG has the potential to break down some of the
>> >>>>> deadlocks that have plagued the Hadoop community release process,
>> >>>>> there is considerable enthusiasm for it.
>> >>>>>
>> >>>>> Combined, these factors make it much more likely that HNG will be the
>> >>>>> dominant force in the Hadoop world.  That is, more likely in my own
>> >>>>> estimation.  Others may differ.
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Jun 30, 2011 at 5:16 PM, Edward J. Yoon <[email protected]> wrote:
>> >>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> I'm a newbie, and I wonder what the main differences between Hadoop
>> >>>>>> nextGen and Mesos are.
>> >>>>>>
>> >>>>>> Thanks.
>> >>>>>> --
>> >>>>>> Best Regards, Edward J. Yoon
>> >>>>>> @eddieyoon
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Best Regards, Edward J. Yoon
>> >>>> @eddieyoon
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> @eddieyoon
>> >
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
