Hi Chris,



Perhaps you've run into the issue described at
https://community.nitrous.io/posts/stability-and-a-linux-oom-killer-bug. We saw
symptoms similar to the ones you've described, and treating the above as the
cause solved all of our issues.




Hope this helps!



--
Tom Arnfeld
Developer // DueDil

(+44) 7525940046
25 Christopher Street, London, EC2A 2BS

On Mon, Aug 31, 2015 at 11:55 PM, Christopher Ketchum <[email protected]>
wrote:

> Hi all,
> I was running a Mesos cluster on EC2 with c4.8xlarge instance types when
> one of the status checks failed. We are running Mesos 0.22.1 on Ubuntu
> 14.04, with kernel version 3.13.0-55-generic. EC2 gave us this console
> output[1]. I did some searching and found similar issues reported here[2]
> on lkml, though those logs indicated a specific task and an older kernel,
> while these logs just show mesos-slave as the causative process.
> Unfortunately, the instance was terminated so I'm not sure how much useful
> debugging can be done. Is this a known issue? We are also using our own
> Python executor. Could an error there have caused this?
> [1] http://pastebin.com/NgHi8MnS
> [2] https://lkml.org/lkml/2014/9/30/498
> Thanks,
> Chris
