+1 to what Kamil said. That is exactly the reason why we designed it that way.
Also, the why is included in the status update message.

@vinodkone

> On Feb 12, 2016, at 6:08 AM, David J. Palaitis <[email protected]> wrote:
>
> In larger deployments, with many applications, you may not always be able to
> ask for good memory practices from app developers. We've found that reporting
> *why* a job was killed, with details of container utilization, is an
> effective way of helping app developers get better at mem mgmt.
>
> The alternative, just having jobs die, incentivizes bad behaviors. For
> example, a hurried job owner may just double the memory of the executor,
> trading slack for stability.
>
>> On Fri, Feb 12, 2016 at 6:36 AM Harry Metske <[email protected]> wrote:
>> We don't want to use Docker (yet) in this environment, so
>> DockerContainerizer is not an option.
>> After thinking a bit longer, I tend to agree with Kamil and let the problem
>> be handled differently.
>>
>> Thanks for the amazingly fast responses!
>>
>> kind regards,
>> Harry
>>
>>
>> On 12 February 2016 at 12:28, Kamil Chmielewski <[email protected]> wrote:
>>>
>>>>>>> On Fri, Feb 12, 2016 at 6:12 PM, Harry Metske <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Is there a specific reason why the slave does not first send a TERM
>>>>>>> signal, and if that does not help after a certain timeout, send a KILL
>>>>>>> signal?
>>>>>>> That would give us a chance to clean up consul registrations (and other
>>>>>>> cleanup).
>>>
>>> First of all, it's wrong that you want to handle the memory limit in your
>>> app. Things like this are outside of its scope. Your app can be lost
>>> because of many different system or hardware failures that you just can't
>>> catch. You need to let it crash and design your architecture with this in
>>> mind.
>>> Secondly, the Mesos SIGKILL is consistent with the Linux OOM killer, and it
>>> does the right thing:
>>> https://github.com/torvalds/linux/blob/4e5448a31d73d0e944b7adb9049438a09bc332cb/mm/oom_kill.c#L586
>>>
>>> Best regards,
>>> Kamil
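To make the "why" concrete on the framework side, here is a minimal sketch, assuming the old mesos.interface Python bindings (the class name and print statements are illustrative only). It shows a scheduler callback reading the reason and message the agent attaches when it destroys a container for exceeding its memory limit; that message typically includes the container's memory usage at kill time, which is the container-utilization detail David mentions above.

from mesos.interface import Scheduler, mesos_pb2


class MemoryAwareScheduler(Scheduler):
    """Illustrative scheduler that surfaces why a task was killed."""

    def statusUpdate(self, driver, update):
        # update is a mesos_pb2.TaskStatus; the agent fills in 'reason' and
        # 'message' when the container exceeds its memory limit.
        if (update.state == mesos_pb2.TASK_FAILED and
                update.reason ==
                mesos_pb2.TaskStatus.REASON_CONTAINER_LIMITATION_MEMORY):
            # 'message' carries the agent's explanation, typically including
            # the container's memory usage at the time it was killed.
            print("Task %s exceeded its memory limit: %s"
                  % (update.task_id.value, update.message))
        else:
            print("Task %s is now %s"
                  % (update.task_id.value,
                     mesos_pb2.TaskState.Name(update.state)))

A framework that logs or forwards that message to the job owner delivers the "why" without the agent having to send a TERM first.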

