Thank you Ruth!! Love that you are helping with that!

> On Sep 25, 2015, at 11:23 AM, Ruth Harris <[email protected]> wrote:
>
> Okee dokey!
>
> I'll read through and pull John and Ken's descriptions into the wiki.
>
> Thanks, John and Ken!!
>
> --ruth
>
>> On Fri, Sep 25, 2015 at 8:48 AM, Jim Klucar <[email protected]> wrote:
>>
>> That is a great reply, thanks. We should cut/paste it into the wiki.
>>
>>> On Fri, Sep 25, 2015 at 10:23 AM, Ken Sipe <[email protected]> wrote:
>>>
>>> that is my list as well… the bullet points
>>>
>>> * multi-versions of hadoop in the same cluster (we aren’t there yet)
>>> * scale down v1 of hadoop as you scale up v2 (completely different way of “decommissioning” services)
>>> * co-located services and data
>>> * multi tenant (manage hadoop, spark, kubernetes and other mesos services with 1 view into the resource / capacity utilization)
>>> * scale up yarn dynamically to utilize dc resources during off peak availability (imagine how awesome this will be after over provisioning is in place)
>>>
>>> Ken
>>>
>>>> On Sep 25, 2015, at 9:13 AM, John Omernik <[email protected]> wrote:
>>>>
>>>> "Why would you want to do that?"
>>>>
>>>> As a potential user of Myriad in the enterprise, I see a number of reasons I'd "want to do that". They are:
>>>>
>>>> - The ability to use Mesos' purpose-built and well-designed resource management with Map Reduce. Right now Yarn is the only option to run Map Reduce V2 applications, and while Yarn is far superior to resource management in Map Reduce V1, we still have an important application that is intrinsically tied to the resource scheduler. Things that run on resource schedulers should not be tied to them. Map Reduce V2 should not have a specific resource scheduler as a requirement.
>>>>
>>>> - Multi Tenancy: Right now if you have a cluster of computers, you can run one Yarn cluster on them. With Myriad, the option exists to have smaller, purpose-built clusters running on one set of hardware; think a Yarn cluster for marketing, or one for HR. This is a great option for better utilizing your resources, as well as better scaling growth and the costs associated with growth. Consider setting up separate clusters in Yarn without Mesos: many services duplicated, VM or physical node management issues, etc.
>>>>
>>>> - To build on Multi Tenancy, consider different versions of Yarn and Map Reduce. Right now, a new feature or bug fix comes out in a version of Yarn, and there is not a good way to put that into play with your data. You have to go through a horrible testing process just to upgrade, and you have to make sure ALL other jobs are not affected by the upgrade. With Myriad, keep your production jobs at version X of Yarn, and then spin up a new Yarn cluster at version X+1. Now you can test your jobs slowly, and migrate them one by one without impact to production processes. Upgrading is now not all or nothing, but a controlled process where you can "fail fast", i.e. if the job doesn't work, roll it back to the older version of Yarn.
>>>>
>>>> - The ability to have applications (think Docker containers) sitting right next to the data (Hadoop data) they may be interacting with. Monitoring all the jobs in one place rather than distinct clusters for containers and others for data frameworks.
>>>>
>>>> - Data frameworks!! Like the multi-tenancy conversation, what happens when you want to have Drill or Impala, plus Map Reduce V2 (multiple of these), plus Spark, or Storm, or Kafka all working together? With Yarn now, you're much more locked in to a monolithic cluster, still with static partitioning all over the place (think a Cloudera cluster with Yarn, Impala and Hive... want to change something? You have to make sure all the pieces change together). With Mesos/Myriad, you have the flexibility to move and try new things, with minimal impact to your production, without standing up additional servers/clusters. Myriad is the missing link here in that YARN-only applications (Map Reduce V2!!!) are now part of that vision for a unified data center. You no longer have to make a choice between Myriad or Yarn, now it's Myriad AND Yarn.
>>>>
>>>> Those are the points that get me excited; ecosystem lock-in is a huge concern for many enterprises. I don't want to imply I am not excited about the dynamic flexup/flexdown or the HA components, obviously those are awesome too, but for me those are cherries on top of the other components that let me envision a data environment where options exist everywhere, where innovation can happen faster, and I never have a situation where an idea is left on the cutting room floor because "We don't support X."
>>>>
>>>> Random thoughts from me...
>>>>
>>>> John
>>>>
>>>>> On Fri, Sep 25, 2015 at 7:59 AM, Jim Klucar <[email protected]> wrote:
>>>>>
>>>>> Awesome. I assume it was a good talk? I need to get better at answering the "Why would you want to do that?" question.
>>>>>
>>>>>> On Thu, Sep 24, 2015 at 9:08 PM, Ken Sipe <[email protected]> wrote:
>>>>>>
>>>>>> I just gave a talk at the cassandra summit. It included details around spark and analytics with cassandra in the cluster. There were lots of questions, etc. I just wanted to let this group know that the 2nd largest topic of conversation and questions was around myriad… there was a lot of excitement for our project.
>>>>>>
>>>>>> Ken
>
> --
> Ruth Harris
> Sr. Technical Writer, MapR
