Re: non-prod SLA stats

2015-06-16 Thread Erb, Stephan
Hi Maxim, I have submitted a first patch closely following your initial proposal. The patch needs another iteration or two, so please let me know what you think. https://reviews.apache.org/r/35498/ Thanks, Stephan From: Maxim Khutornenko Sent: Monday

Weekly GSoC Summaries | Week of June 8th

2015-06-16 Thread Willy Aguirre
As a Google Summer of Code student I’m working on a JavaScript library to power an interactive web guide that teaches how the Aurora CLI works. You can follow the project on GitHub, and I’ll share my weekly progress wit

Lost jobs on cluster failure

2015-06-16 Thread Mauricio Garavaglia
Hello! We had an issue with our Aurora Mesos cluster that made it lose quorum, and we are wondering how the recovery of lost jobs works. What happened is basically: #1 Start Aurora job, and have it allocated to node A. #2 Aurora Schedulers, Mesos Master and ZK stopped #3 node A stopped #4 Auror

Re: Lost jobs on cluster failure

2015-06-16 Thread Maxim Khutornenko
Not sure I am getting the problem here. Are you observing Mesos master, Aurora leader or a native log quorum loss? To your questions, every part of the Aurora/Mesos system is designed in a failure-tolerant manner. A loss of Mesos master, Aurora leader or a Mesos slave should not cause any irrecove

Re: Lost jobs on cluster failure

2015-06-16 Thread Bill Farner
Maxim's reply is correct, elaborating:
> Should it assume the Mesos list is complete, and assume the missing nodes are indeed gone, and hence restart the jobs?
Yes. This scenario is currently reconciled by the GC executor, which runs on an hourly interval by default. This behavior is soon to be

Re: arbitrary parameters for docker containers

2015-06-16 Thread Bill Farner
Hey Mauricio! Thanks for the contribution! Please accept our apologies for the delay. There are two reasons this review slipped through the cracks: 1.) Our review bot replied indicating that the patch breaks the build. Due to the review volume we deal with, reviewers tend to skim past reviews w

Re: cmdline and docker entrypoints

2015-06-16 Thread Bill Farner
Did you ever figure this out? The entrypoint will indeed be replaced, and the command supplied in the Aurora configuration will be used. I can see how this might be unexpected, especially when using an off-the-shelf docker container. -=Bill On Sat, May 16, 2015 at 8:32 PM, Mauricio Garavaglia <