That is a great reply, thanks. We should cut/paste it into the wiki.

On Fri, Sep 25, 2015 at 10:23 AM, Ken Sipe <[email protected]> wrote:
> that is my list as well… the bullet points
>
> * multi-versions of hadoop in the same cluster (we aren’t there yet)
> * scale down v1 of hadoop as you scale up v2 (completely different way of
>   “decommissioning” services)
> * co-located services and data
> * multi tenant (manage hadoop, spark, kubernetes and other mesos services
>   with 1 view into the resource / capacity utilization)
> * scale up yarn dynamically to utilize dc resources during off peak
>   availability (imagine how awesome this will be after over provisioning is
>   in place)
>
> Ken
>
>
> On Sep 25, 2015, at 9:13 AM, John Omernik <[email protected]> wrote:
> >
> > "Why would you want to do that?"
> >
> > As a potential user of Myriad, in the enterprise I see a number of reasons
> > I'd "want to do that". They are:
> >
> > - The ability to use Mesos' purpose-built and well-designed resource
> > management with Map Reduce. Right now Yarn is the only option to run Map
> > Reduce V2 applications, and while Yarn is far superior to resource
> > management in Map Reduce V1, we still have an important application
> > that is intrinsically tied to the resource scheduler. Things that run on
> > resource schedulers should not be tied to them. Map Reduce V2 should not
> > have a specific resource scheduler as a requirement.
> >
> > - Multi Tenancy: Right now if you have a cluster of computers, you can run
> > one Yarn cluster on them. With Myriad, the option exists to have smaller,
> > purpose-built clusters running on one set of hardware; think a Yarn
> > cluster for marketing, or one for HR. This is a great option for better
> > utilizing your resources, as well as better scaling growth and the costs
> > associated with growth. Consider setting up separate clusters in Yarn
> > without Mesos: many services duplicated, VM or physical node management
> > issues, etc.
> >
> > - To build on Multi Tenancy, consider different versions of Yarn and Map
> > Reduce.
> > Right now, a new feature or bug fix comes out in a version of Yarn,
> > and there is not a good way to put that into play with your data. You have
> > to go through a horrible testing process just to upgrade, and you have to
> > make sure ALL other jobs are not affected by the upgrade. With Myriad, keep
> > your production jobs at version X of Yarn, and then spin up a new Yarn
> > cluster at version X+1. Now you can test your jobs slowly, and migrate
> > them one by one without impact to production processes. Upgrading is now
> > not all or nothing, but a controlled process where you can "fail fast",
> > i.e. if the job doesn't work, roll it back to the older version of Yarn.
> >
> > - The ability to have applications (think Docker containers) sitting right
> > next to the data (Hadoop data) they may be interacting with. Monitoring all
> > the jobs in one place rather than distinct clusters for containers and
> > others for data frameworks.
> >
> > - Data frameworks!! Like the multi-tenancy conversation, what happens when
> > you want to have Drill or Impala, plus Map Reduce V2 (multiple of these),
> > plus Spark, or Storm, or Kafka all working together? With Yarn now,
> > you're much more locked in to a monolithic cluster, still with static
> > partitioning all over the place (think a Cloudera cluster with Yarn, Impala
> > and Hive... want to change something? You have to make sure all the pieces
> > change together). With Mesos/Myriad, you have the flexibility to move and
> > try new things, with minimal impact to your production, without standing up
> > additional servers/clusters. Myriad is the missing link here in that
> > YARN-only applications (Map Reduce V2!!!) are now part of that vision for a
> > unified data center. You no longer have to make a choice between Myriad or
> > Yarn, now it's Myriad AND Yarn.
> >
> > Those are the points that get me excited; ecosystem lock-in is a huge
> > concern for many enterprises.
> > I don't want to imply I am not excited about the
> > dynamic flexup/flexdown or the HA components, obviously those are awesome
> > too, but for me those are cherries on top of the other components that let
> > me envision a data environment where options exist everywhere, where
> > innovation can happen faster, and I never have a situation where an idea is
> > left on the cutting room floor because "we don't support X".
> >
> > Random thoughts from me...
> >
> > John
> >
> >
> > On Fri, Sep 25, 2015 at 7:59 AM, Jim Klucar <[email protected]> wrote:
> >
> >> Awesome. I assume it was a good talk? I need to get better at answering the
> >> "Why would you want to do that?" question.
> >>
> >> On Thu, Sep 24, 2015 at 9:08 PM, Ken Sipe <[email protected]> wrote:
> >>
> >>> I just gave a talk at the cassandra summit. It included details around
> >>> spark and analytics with cassandra in the cluster. There were lots of
> >>> questions, etc. I just wanted to let this group know that the 2nd largest
> >>> topic of conversation and questions was around myriad… there was a lot of
> >>> excitement for our project.
> >>>
> >>> Ken
