Thank you Ruth!! Love that you are helping with that!

Sent from my iPhone

> On Sep 25, 2015, at 11:23 AM, Ruth Harris <[email protected]> wrote:
> 
> Okee dokey!
> 
> I'll read through and pull John and Ken's descriptions into the wiki.
> 
> Thanks, John and Ken!!
> 
> --ruth
> 
>> On Fri, Sep 25, 2015 at 8:48 AM, Jim Klucar <[email protected]> wrote:
>> 
>> That is a great reply, thanks. We should cut/paste it into the wiki.
>> 
>>> On Fri, Sep 25, 2015 at 10:23 AM, Ken Sipe <[email protected]> wrote:
>>> 
>>> that is my list as well… the bullet points
>>> 
>>> * multi-versions of hadoop in the same cluster (we aren’t there yet)
>>> * scale down v1 of hadoop as you scale up v2 (completely different way of
>>> “decommissioning” services)
>>> * co-located services and data
>>> * multi tenant (manage hadoop, spark, kubernetes and other mesos services
>>> with 1 view into the resource / capacity utilization)
>>> * scale up yarn dynamically to utilize dc resources during off peak
>>> availability (imagine how awesome this will be after over provisioning is
>>> in place)
>>> 
>>> Ken
>>> 
>>>> On Sep 25, 2015, at 9:13 AM, John Omernik <[email protected]> wrote:
>>>> 
>>>> "Why would you want to do that?"
>>>> 
>>>> As a potential user of Myriad in the enterprise, I see a number of
>>>> reasons I'd "want to do that". They are:
>>>> 
>>>> - The ability to use Mesos' purpose-built and well-designed resource
>>>> management with Map Reduce. Right now Yarn is the only option to run
>>>> Map Reduce V2 applications, and while Yarn is far superior to resource
>>>> management in Map Reduce V1, we still have an important application
>>>> that is intrinsically tied to the resource scheduler. Things that run
>>>> on resource schedulers should not be tied to them. Map Reduce V2 should
>>>> not have a specific resource scheduler as a requirement.
>>>> 
>>>> - Multi Tenancy: Right now if you have a cluster of computers, you can
>>>> run one Yarn cluster on them.  With Myriad, the option exists to have
>>>> smaller, purpose-built clusters running on one set of hardware; think a
>>>> Yarn cluster for marketing, or one for HR.  This is a great option for
>>>> better utilizing your resources, as well as better scaling growth and
>>>> the costs associated with growth. Consider setting up separate clusters
>>>> in Yarn without Mesos: many services duplicated, VM or physical node
>>>> management issues, etc.
>>>> 
>>>> - To build on Multi Tenancy, consider different versions of Yarn and
>>>> Map Reduce. Right now, when a new feature or bug fix comes out in a
>>>> version of Yarn, there is not a good way to put that into play with
>>>> your data. You have to go through a horrible testing process just to
>>>> upgrade, and you have to make sure ALL other jobs are not affected by
>>>> the upgrade. With Myriad, keep your production jobs at version X of
>>>> Yarn, and then spin up a new Yarn cluster at version X+1.  Now you can
>>>> test your jobs slowly, and migrate them one by one without impact to
>>>> production processes.  Upgrading is now not all or nothing, but a
>>>> controlled process where you can "fail fast", i.e. if the job doesn't
>>>> work, roll it back to the older version of Yarn.
>>>> 
>>>> - The ability to have applications (think Docker containers) sitting
>>>> right next to the data (Hadoop data) they may be interacting with.
>>>> Monitoring all the jobs in one place rather than distinct clusters for
>>>> containers and others for data frameworks.
>>>> 
>>>> - Data frameworks!!  Like the multi-tenancy conversation, what happens
>>>> when you want to have Drill or Impala, plus Map Reduce V2 (multiple of
>>>> these), plus Spark, or Storm, or Kafka all working together?  With Yarn
>>>> now, you're much more locked in to a monolithic cluster, still with
>>>> static partitioning all over the place (think a Cloudera cluster with
>>>> Yarn, Impala and Hive... want to change something? You have to make
>>>> sure all the pieces change together).  With Mesos/Myriad, you have the
>>>> flexibility to move and try new things, with minimal impact to your
>>>> production, without standing up additional servers/clusters.  Myriad is
>>>> the missing link here in that YARN-only applications (Map Reduce V2!!!)
>>>> are now part of that vision for a unified data center. You no longer
>>>> have to make a choice between Myriad or Yarn; now it's Myriad AND Yarn.
>>>> 
>>>> Those are the points that get me excited; ecosystem lock-in is a huge
>>>> concern for many enterprises.   I don't want to imply I am not excited
>>>> about the dynamic flexup/flexdown or the HA components, obviously those
>>>> are awesome too, but for me those are cherries on top of the other
>>>> components that let me envision a data environment where options exist
>>>> everywhere, where innovation can happen faster, and I never have a
>>>> situation where an idea is left on the cutting room floor because "we
>>>> don't support X."
>>>> 
>>>> Random thoughts from me...
>>>> 
>>>> John
>>>> 
>>>> 
>>>> 
>>>>> On Fri, Sep 25, 2015 at 7:59 AM, Jim Klucar <[email protected]> wrote:
>>>>> 
>>>>> Awesome. I assume it was a good talk? I need to get better at answering
>>>>> the "Why would you want to do that?" question.
>>>>> 
>>>>>> On Thu, Sep 24, 2015 at 9:08 PM, Ken Sipe <[email protected]> wrote:
>>>>>> 
>>>>>> I just gave a talk at the cassandra summit.  It included details
>>>>>> around spark and analytics with cassandra in the cluster.  There were
>>>>>> lots of questions, etc.   I just wanted to let this group know that
>>>>>> the 2nd largest topic of conversation and questions was around
>>>>>> myriad… there was a lot of excitement for our project.
>>>>>> 
>>>>>> Ken
> 
> 
> 
> -- 
> Ruth Harris
> Sr. Technical Writer, MapR
