Hi,
I have read about Mesos [1], Yarn [2] and Myriad [3], but I couldn't find an
explicit answer to a few general questions. First of all, I don't have an
actual cluster with a business use case to solve, but I'm interested in the
technologies and motivation behind these systems.
From my understanding Myriad is a Mesos Framework (just like Marathon, Spark,
...) which acts as a "wrapper" around Yarn. This enables a dynamic coexistence
of Yarn and Mesos on the same cluster which was originally not possible.
However, from a theoretical standpoint, Yarn and Mesos appear to be - in
general - only different variations of the same thing: Resource Negotiators in
a cluster environment.
This leads to the first question:
(1) Why would you want to run Mesos and Yarn together?
What would be the disadvantages of choosing only one of them?
One valid argument might be that there are Mesos Frameworks or Yarn
Applications which you don't want to port to the other system. Myriad would
allow you to use Mesos (and all frameworks built for it) while still running
all Yarn applications.
Nevertheless, even though there surely are other interesting Yarn
applications, I suspect that in many cases the most prominent one is MapReduce.
MapReduce v1, however, was ported to a Mesos Framework [1, 4] several years
ago.
This leads to the second question:
(2) What are the advantages of running MapReduce v2 using Yarn via Myriad on a
Mesos Cluster instead of running the MapReduce v1 framework directly on Mesos?
One might argue that the first option sounds like more overhead, but as
MapReduce is typically batch-oriented, this argument might not carry much
weight.
Due to the different strategies of Mesos (offer-oriented) and Yarn
(request-oriented), a question arises for applications which require data
locality (e.g. MapReduce):
(3) How can Myriad provide good data locality for applications with high
dependence on data locality?
As the underlying Mesos system negotiates resources via offers, it seems that a
framework has few possibilities aside from waiting for matching offers. Is this
the strategy Myriad employs?
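To make "waiting for matching offers" concrete, here is a toy sketch of what I
have in mind. It is roughly the delay-scheduling idea discussed in [1]: decline
offers from hosts without the task's data, but only a bounded number of times.
All names (Offer, Task, handle_offer, max_declines) are made up for
illustration. This is not the Mesos or Myriad API, and I do not know whether
Myriad actually works this way.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Offer:
    """Hypothetical stand-in for a resource offer from one host."""
    host: str
    cpus: float


@dataclass
class Task:
    """Hypothetical pending task that prefers hosts holding its input data."""
    name: str
    preferred_hosts: frozenset
    cpus: float = 1.0
    declined: int = 0  # non-local offers this task has passed on so far


def handle_offer(offer: Offer, pending: List[Task],
                 max_declines: int = 3) -> Optional[Tuple[Task, bool]]:
    """Return (task, is_local) to launch on this offer, or None to decline.

    max_declines is an assumed tuning knob: how many non-local offers a task
    lets pass before it gives up on locality.
    """
    # First choice: a task that fits and whose data lives on the offered host.
    for task in pending:
        if task.cpus <= offer.cpus and offer.host in task.preferred_hosts:
            pending.remove(task)
            return task, True

    # No local match: tasks that fit keep waiting until their patience runs
    # out, at which point the first such task is launched non-locally.
    for task in pending:
        if task.cpus > offer.cpus:
            continue
        if task.declined >= max_declines:
            pending.remove(task)
            return task, False
        task.declined += 1

    return None  # decline the offer and wait for a better one
```

With the default max_declines=3, a task whose data is on "node-a" passes on
three offers from "node-b" and accepts the fourth as a non-local launch, while
an offer from "node-a" is accepted immediately.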
And this leads to my final question:
(4) How does Yarn via Myriad on Mesos compare to Yarn "alone"?
Have there been any studies of Myriad yet, ideally including such an
evaluation?
I'm grateful for any input. Thank you very much!
Cheers,
Dave
[1] http://static.usenix.org/events/nsdi11/tech/full_papers/Hindman_new.pdf
[2] https://www.sics.se/~amir/files/download/dic/2013%20-%20Apache%20Hadoop%20YARN:%20Yet%20Another%20Resource%20Negotiator%20(SoCC).pdf
[3] http://myriad.incubator.apache.org/
[4] https://github.com/mesos/hadoop