The shutdown errors are not the issue. The concerning part is this warning:
> W0615 15:01:43.285518 4182 linux_launcher.cpp:197] Couldn't find pid '42322' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation

That indicates a transition from a version without systemd support to one with it.

--
*Joris Van Remoortere*
Mesosphere

On Fri, Jun 17, 2016 at 2:35 PM, haosdent <[email protected]> wrote:

> Hi, @Qiang.
>
> @Joseph has a nice explanation of the "Shutdown failed on fd" messages at
>
> http://search-hadoop.com/m/0Vlr6pe7qb2MJX8B1&subj=Re+Benign+Shutdown+failed+on+fd+error+messages
> Those errors can be ignored.
>
> For
> ```
> I0615 15:01:43.324935 4172 mem.cpp:602] Started listening for OOM events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> ```
>
> These are normal info logs; they appear when the Mesos CgroupMemIsolator registers OOM hooks for your containers.
>
> On Fri, Jun 17, 2016 at 8:22 PM, Joris Van Remoortere <[email protected]> wrote:
>
> > Can you provide:
> > 1. The version that you are upgrading from.
> > 2. Whether you made any OS / init system changes alongside this upgrade (just to narrow the scope).
> >
> > It is possible that you are upgrading from a version that did not have systemd support to one that does. If so, the upgrade may require restarting the tasks (either by themselves, or just starting a fresh agent). Please check out some of the work in MESOS-3007 to get a better understanding of the issue I am referring to.
> >
> > If you can verify that you are making one of these transitions from a bad world to a good world, then you can devise a plan for your upgrade.
> >
> > Joris
> >
> > --
> > *Joris Van Remoortere*
> > Mesosphere
> >
> > On Fri, Jun 17, 2016 at 8:28 AM, Qiang Chen <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I hit an issue when upgrading mesos-slave to 0.28.2.
> > >
> > > While recovering the mesos-slave / framework containers, it produced the following errors.
> > >
> > > ```
> > > Log file created at: 2016/06/15 15:01:43
> > > Running on machine: mesos-slave-online005-xxx.cloud.xxx.domain
> > > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> > > W0615 15:01:43.285518 4182 linux_launcher.cpp:197] Couldn't find pid '42322' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > W0615 15:01:43.286182 4182 linux_launcher.cpp:197] Couldn't find pid '42312' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > W0615 15:01:43.286669 4182 linux_launcher.cpp:197] Couldn't find pid '42309' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > W0615 15:01:43.287144 4182 linux_launcher.cpp:197] Couldn't find pid '42304' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > W0615 15:01:43.287636 4182 linux_launcher.cpp:197] Couldn't find pid '42300' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > W0615 15:01:43.288120 4182 linux_launcher.cpp:197] Couldn't find pid '42317' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > > E0615 15:01:43.471676 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476007 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476143 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476272 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476483 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > E0615 15:01:43.476618 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > > ```
> > >
> > > And it also causes OOM errors, such as:
> > >
> > > ```
> > > I0615 15:01:43.324935 4172 mem.cpp:602] Started listening for OOM events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.325469 4172 mem.cpp:722] Started listening on low memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.326004 4172 mem.cpp:722] Started listening on medium memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > I0615 15:01:43.326539 4172 mem.cpp:722] Started listening on critical memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > > ```
> > >
> > > Has anyone else run into this? Thanks.
> > >
> > > --
> > > Best Regards,
> > > Chen, Qiang
>
> --
> Best Regards,
> Haosdent Huang
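
For anyone who wants to check by hand whether an executor pid from the warning sits under the slice, a rough sketch follows. It is not how Mesos performs the check internally; it only assumes the standard Linux `/proc/<pid>/cgroup` format (`hierarchy-ID:controller-list:cgroup-path`). The helper names `in_slice` and `pid_in_slice` are made up for illustration; the slice name is taken from the warning text.

```python
from pathlib import Path

# Slice name copied from the warning message in the log above.
SLICE = "mesos_executors.slice"


def in_slice(cgroup_text: str, slice_name: str = SLICE) -> bool:
    """Return True if any cgroup path in the given /proc/<pid>/cgroup
    text contains slice_name as a path component."""
    for line in cgroup_text.splitlines():
        # Each line looks like "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) == 3 and slice_name in parts[2].split("/"):
            return True
    return False


def pid_in_slice(pid: int, slice_name: str = SLICE) -> bool:
    """Read /proc/<pid>/cgroup (Linux only) and test it with in_slice.
    Raises FileNotFoundError if the process no longer exists."""
    return in_slice(Path(f"/proc/{pid}/cgroup").read_text(), slice_name)
```

On the agent host, `pid_in_slice(42322)` would then say whether that pid from the warning is under the slice. If it returns False for executors started before the upgrade, that would be consistent with the transition described above.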
