Hi John,

Thanks for testing out Myriad and sharing your experiences. We look forward to hearing more about your use cases and issues. I'll take a first stab at your questions/comments and let others fill in where appropriate.

1. Hive: Glad it's working. Are you running Hive on YARN, on MR1, or directly on Mesos?

2. I agree that a "desired number of instances" API would be valuable as well. I created Issue 86 <https://github.com/mesos/myriad/issues/86> to track it. The auto-scaling issue 12 <https://github.com/mesos/myriad/issues/12> is also related. (There's a rough sketch of the current flex calls below, after this list.)

3. First suspend/scale it down in Marathon to kill the scheduler (and prevent it from being restarted), then call the http://master:5050/master/shutdown endpoint to shut the framework down in Mesos and kill all of its tasks (example below, after the list). See the related SO question, and the "shutdown_frameworks" ACL. We require operator intervention through the shutdown endpoint so that a normal RM/scheduler failure will not bring down the YARN cluster or its tasks, since that scenario should be recoverable.

4. +1 on separating the config. Filed Issue 87 <https://github.com/mesos/myriad/issues/87>.

5. Thanks for the offer to help with testing. Hang around on this dev@ mailing list (we'll get user@ too once the traffic here gets heavy), and we'll call out to you if we have any specific testing requests. Please file any issues you find along the way, no matter how trivial/controversial.
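For reference on #2, here's roughly what the flex calls look like today. Caveat: the port, HTTP method, and payload are from memory and may not match your build exactly, so treat this as a sketch and check the scheduler's REST resources if it doesn't line up:

    # Flex up: launch more NodeManagers (the profile name must exist in myriad-config-default.yml)
    curl -X PUT http://<myriad-scheduler-host>:8192/api/cluster/flexup \
         -H "Content-Type: application/json" \
         -d '{"instances": 2, "profile": "medium"}'

    # Flex down: kill NodeManagers (currently by count only, not by specific instance)
    curl -X PUT http://<myriad-scheduler-host>:8192/api/cluster/flexdown \
         -H "Content-Type: application/json" \
         -d '{"instances": 1}'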
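And for #3, the full teardown looks something like the following. The framework ID comes from the Mesos UI or /master/state.json, and the principal names in the ACL snippet are just placeholders:

    # After suspending the scheduler in Marathon, tear the framework down in Mesos.
    # This tells the master to kill all of the framework's remaining tasks (the NodeManagers):
    curl -X POST -d "frameworkId=<myriad-framework-id>" \
         http://master:5050/master/shutdown

To restrict who can call that endpoint, the master's --acls can include something like:

    {
      "shutdown_frameworks": [
        {
          "principals": { "values": ["ops"] },
          "framework_principals": { "values": ["myriad"] }
        }
      ]
    }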
Thanks again!
-Adam-

On Fri, Apr 3, 2015 at 7:02 AM, John Omernik <[email protected]> wrote:

> Hey all, recently joined and wanted to share some success I am having with
> Myriad on my test cluster. Obviously I've run into some of the issues that
> have been talked about here and in the git issues, but all in all it's
> been a great experience. (I had help.)
>
> A few notes:
>
> My cluster is a Mesos-based cluster running on top of a MapR filesystem
> (4.0.2). It's working pretty well for things like Spark and Docker; MRv1
> is a hacked setup that I wouldn't recommend to anyone, but it was sorta
> working. I do multiple things with this cluster, but one is a crude packet
> capture process that really works well from an "edge case" point of view
> due to the use of a Hive transform and other crazy stuff.
>
> 1. Hive is working great. No issues there; I tweaked some mapreduce
> settings, added some profiles that fit my cluster, and things seem to be
> humming along well.
>
> 2. The API was confusing until it was explained to me. Coming from a
> Marathon world, I saw the instances setting as the "number" of instances I
> want running, rather than go up or down by x instances. I see why the API
> is set up like this, but perhaps some consideration could be given to
> making it more intuitive? Like an option to specify what you want to be
> running, in addition to the flex up and flex down. Also, on the flex down,
> is there an option to specify which instances you want to flex down? On
> flex up, I can set up 1 large, then run 2 medium, and then have 2 small
> running on the cluster, but on the way down, it appears it's only the
> number of instances I want flexed.
>
> 3. If I shut down the resource manager (on purpose), there should be a way
> to have that auto-kill nodemanagers, right? As of now, if I want to reset
> things, I need to scale down in Marathon, then run a script on each node
> that kills processes.
>
> 4. The myriad-config-default.yml needs to be moved outside the bundled jar
> so we can update our clusters without rebuilding. I know this is alpha and
> it's probably on a list, but I figured I'd mention it. (Perhaps check the
> location of the executor, then the classpath, etc.)
>
> 5. I'd be happy to run through any tests or check any bugs people may want
> confirmation on with my cluster. It's not "production" but it is doing
> work, so I have some flexibility in changing things up. I wish I could do
> more on the coding side, but I am more of a hacker/scripter than a Java
> dev, and would hate for any of my bad code to make it into a project like
> this with so much potential.
>
> All in all, I am quite impressed. It seems more stable than my MRv1 on
> Mesos/MapR, so that's nice. Still playing with settings and other things,
> and I wanted to share some successes instead of just issues. Thanks for
> all the hard work here.
>
> John Omernik
