Hi all, I was running a Mesos cluster on EC2 with c4.8xlarge instance types when one of the status checks failed. We are running Mesos 0.22.1 on ubuntu 14.04, with kernel version 3.13.0-55-generic. EC2 gave us this console output[1]. I did some searching and found similar issues reported here[2] on lkml, though those logs indicated a specific task and an older kernel, while these logs just show mesos-slave as the causative process.
Unfortunately, the instance was terminated so I'm not sure how much useful debugging can be done. Is this a known issue? We are also using a our own python executor, could an error there have caused this? [1] http://pastebin.com/NgHi8MnS [2] https://lkml.org/lkml/2014/9/30/498 Thanks, Chris