That is my list as well… the bullet points:

* multi-versions of hadoop in the same cluster (we aren’t there yet)
* scale down v1 of hadoop as you scale up v2 (completely different way of “decommissioning” services)
* co-located services and data
* multi-tenant (manage hadoop, spark, kubernetes and other mesos services with 1 view into the resource / capacity utilization)
* scale up yarn dynamically to utilize dc resources during off-peak availability (imagine how awesome this will be after over-provisioning is in place)
Ken

> On Sep 25, 2015, at 9:13 AM, John Omernik <[email protected]> wrote:
>
> "Why would you want to do that?"
>
> As a potential user of Myriad in the enterprise, I see a number of reasons
> I'd "want to do that". They are:
>
> - The ability to use Mesos' purpose-built and well-designed resource
> management with Map Reduce. Right now Yarn is the only option to run Map
> Reduce V2 applications, and while Yarn is far superior to resource
> management in Map Reduce V1, we still have an important application
> that is intrinsically tied to the resource scheduler. Things that run on
> resource schedulers should not be tied to them. Map Reduce V2 should not
> have a specific resource scheduler as a requirement.
>
> - Multi Tenancy: Right now if you have a cluster of computers, you can run
> one Yarn cluster on them. With Myriad, the option exists to have smaller,
> purpose-built clusters running on one set of hardware: think a Yarn
> cluster for marketing, or one for HR. This is a great option for better
> utilizing your resources, as well as for better scaling growth and the
> costs associated with growth. Consider setting up separate Yarn clusters
> without Mesos: many services duplicated, VM or physical node management
> issues, etc.
>
> - To build on Multi Tenancy, consider different versions of Yarn and Map
> Reduce. Right now, a new feature or bug fix comes out in a version of Yarn,
> and there is not a good way to put that into play with your data. You have
> to go through a horrible testing process just to upgrade, and you have to
> make sure ALL other jobs are not affected by the upgrade. With Myriad, keep
> your production jobs at version X of Yarn, and then spin up a new Yarn
> cluster at version X+1. Now you can test your jobs slowly, and migrate
> them one by one without impact to production processes. Upgrading is now
> not all or nothing, but a controlled process where you can "fail fast", i.e.
> if the job doesn't work, roll it back to the older version of Yarn.
>
> - The ability to have applications (think Docker containers) sitting right
> next to the data (Hadoop data) they may be interacting with. Monitoring all
> the jobs in one place rather than distinct clusters for containers and
> others for data frameworks.
>
> - Data frameworks!! Like the multi-tenancy conversation, what happens when
> you want to have Drill or Impala, plus Map Reduce V2 (multiple of these),
> plus Spark, or Storm, or Kafka all working together? With Yarn now,
> you're much more locked in to a monolithic cluster, still with static
> partitioning all over the place (think a Cloudera cluster with Yarn, Impala
> and Hive... want to change something? You have to make sure all the pieces
> change together). With Mesos/Myriad, you have the flexibility to move and
> try new things, with minimal impact to your production, without standing up
> additional servers/clusters. Myriad is the missing link here in that
> YARN-only applications (Map Reduce V2!!!) are now part of that vision for a
> unified data center. You no longer have to make a choice between Myriad or
> Yarn, now it's Myriad AND Yarn.
>
> Those are the points that get me excited; ecosystem lock-in is a huge concern
> for many enterprises. I don't want to imply I am not excited about the
> dynamic flexup/flexdown or the HA components, obviously those are awesome
> too, but for me those are cherries on top of the other components that let
> me envision a data environment where options exist everywhere, where
> innovation can happen faster, and I never have a situation where an idea is
> left on the cutting room floor because "We don't support X."
>
> Random thoughts from me...
>
> John
>
>
>
> On Fri, Sep 25, 2015 at 7:59 AM, Jim Klucar <[email protected]> wrote:
>
>> Awesome. I assume it was a good talk? I need to get better at answering the
>> "Why would you want to do that?" question.
>>
>> On Thu, Sep 24, 2015 at 9:08 PM, Ken Sipe <[email protected]> wrote:
>>
>>> I just gave a talk at the cassandra summit. It included details around
>>> spark and analytics with cassandra in the cluster. There were lots of
>>> questions, etc. I just wanted to let this group know that the 2nd largest
>>> topic of conversation and questions was around myriad… there was a lot of
>>> excitement for our project.
>>>
>>> Ken
>>
