Thanks @Haosdent for the link explaining the shutdown errors, so I can
ignore those.
@Joris,
1. I upgraded from 0.25.0 to 0.28.2 on CentOS 7, which has systemd support.
2. I didn't make any OS / init system changes.
Regarding "That indicates a transition from the old systemd lack of support to
the new support.":
>> Lack of what support? Could you explain in more detail, and how to fix
this? Or could there be another cause?
Thanks again!
On 2016-06-17 21:31, Joris Van Remoortere wrote:
The shutdown errors are not the issue.
The concerning part is this warning:
W0615 15:01:43.285518 4182 linux_launcher.cpp:197] Couldn't find
pid '42322' in 'mesos_executors.slice'. This can lead to lack of
proper resource isolation
That indicates a transition from the old systemd lack of support to
the new support.
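
For anyone who wants to check this on an agent host, here is a minimal sketch of how one might test whether a process sits in the `mesos_executors.slice` systemd slice, by reading the `name=systemd` line of `/proc/<pid>/cgroup`. The helper names are hypothetical, it assumes a cgroup v1 host (as on CentOS 7), and it is only an assumption that this approximates what the launcher inspects when it prints the warning.

```python
from pathlib import Path

SLICE = "mesos_executors.slice"  # slice name taken from the warning above


def systemd_cgroup_path(cgroup_text):
    """Return the cgroup path from the 'name=systemd' line, or None.

    Each line of /proc/<pid>/cgroup (cgroup v1) looks like
    'hierarchy-id:controllers:path'.
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[1] == "name=systemd":
            return parts[2]
    return None


def in_executor_slice(cgroup_text, slice_name=SLICE):
    """True if the systemd cgroup path has slice_name as a path component."""
    path = systemd_cgroup_path(cgroup_text)
    return path is not None and slice_name in path.strip("/").split("/")


def pid_in_executor_slice(pid):
    """Read /proc/<pid>/cgroup on a live Linux host and apply the check."""
    return in_executor_slice(Path(f"/proc/{pid}/cgroup").read_text())
```

Calling `pid_in_executor_slice(42322)` on the agent host (using a pid from the warnings) would show whether that executor was left outside the slice after the upgrade.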
—
*Joris Van Remoortere*
Mesosphere
On Fri, Jun 17, 2016 at 2:35 PM, haosdent <[email protected]> wrote:
Hi, @Qiang.
@Joseph has a nice explanation of the "Shutdown failed on fd" errors at
http://search-hadoop.com/m/0Vlr6pe7qb2MJX8B1&subj=Re+Benign+Shutdown+failed+on+fd+error+messages
Those errors can be ignored.
For
```
I0615 15:01:43.324935 4172 mem.cpp:602] Started listening for OOM events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
```
These are normal info logs; they appear when the Mesos CgroupMemIsolator
registers OOM hooks for your containers.
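
To tell these apart quickly: the first character of each line is the severity, per the "Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg" header in the log file. Below is a rough Python sketch that splits an unwrapped log line into its fields. The regex and the function name are my own illustration, not part of Mesos.

```python
import re

# Pattern follows the header in the log file:
# [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)

SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def parse_log_line(text):
    """Split one unwrapped log line into its fields; None if it does not match."""
    m = LOG_LINE.match(text)
    if m is None:
        return None
    fields = m.groupdict()
    fields["severity"] = SEVERITY[fields["severity"]]
    fields["line"] = int(fields["line"])
    return fields
```

With this, the `mem.cpp` lines come out as INFO, while the `linux_launcher.cpp:197` lines come out as WARNING.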
On Fri, Jun 17, 2016 at 8:22 PM, Joris Van Remoortere <[email protected]> wrote:
> Can you provide:
> 1. The version that you are upgrading from.
> 2. Whether you made any OS / init system changes alongside this upgrade
> (just to narrow the scope).
>
> It is possible that you are upgrading from a version that did not have
> systemd support to one that does. If so, the upgrade may require restarting
> the tasks (either by themselves, or just starting a fresh agent). Please
> check out some of the work in MESOS-3007 to get a better understanding of
> what the issue I am referring to is.
>
> If you can verify that you are making one of these transitions from a bad
> world to a good world, then you can devise a plan for your upgrade.
>
> Joris
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Fri, Jun 17, 2016 at 8:28 AM, Qiang Chen <[email protected]> wrote:
>
> > Hi all,
> >
> > I ran into an issue when upgrading mesos-slave to 0.28.2.
> >
> > While recovering the mesos-slave / framework containers, it
> > produced the following errors.
> >
> >
> > ```
> > Log file created at: 2016/06/15 15:01:43
> > Running on machine: mesos-slave-online005-xxx.cloud.xxx.domain
> > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> > W0615 15:01:43.285518 4182 linux_launcher.cpp:197] Couldn't find pid '42322' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > W0615 15:01:43.286182 4182 linux_launcher.cpp:197] Couldn't find pid '42312' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > W0615 15:01:43.286669 4182 linux_launcher.cpp:197] Couldn't find pid '42309' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > W0615 15:01:43.287144 4182 linux_launcher.cpp:197] Couldn't find pid '42304' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > W0615 15:01:43.287636 4182 linux_launcher.cpp:197] Couldn't find pid '42300' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > W0615 15:01:43.288120 4182 linux_launcher.cpp:197] Couldn't find pid '42317' in 'mesos_executors.slice'. This can lead to lack of proper resource isolation
> > E0615 15:01:43.471676 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > E0615 15:01:43.476007 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > E0615 15:01:43.476143 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > E0615 15:01:43.476272 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > E0615 15:01:43.476483 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> > E0615 15:01:43.476618 4201 process.cpp:1958] Failed to shutdown socket with fd 24: Transport endpoint is not connected
> >
> > ```
> >
> > And it will also cause the OOM errors, such as:
> >
> > ```
> > I0615 15:01:43.324935 4172 mem.cpp:602] Started listening for OOM events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > I0615 15:01:43.325469 4172 mem.cpp:722] Started listening on low memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > I0615 15:01:43.326004 4172 mem.cpp:722] Started listening on medium memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> > I0615 15:01:43.326539 4172 mem.cpp:722] Started listening on critical memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
> >
> > ```
> >
> > Has anyone else run into this? Thanks.
> >
> > --
> > Best Regards,
> > Chen, Qiang
> >
> >
>
--
Best Regards,
Haosdent Huang
--
Best Regards,
Chen, Qiang