>
> For "That indicates a transition from the old systemd lack of support to
> the new support. "
> >> lack of what support ? would explain more details, and how to fix this?
> or may have other cause ?


There were a few versions of Mesos where we were not yet aware of some of
the issues with running under systemd. There was a fix for the
LinuxLauncher in 0.25 (https://issues.apache.org/jira/browse/MESOS-3425)
and further fixes for the posix launcher and docker containerizer in 0.28
and some backports. See the systemd documentation at the bottom of this
page: http://mesos.apache.org/documentation/latest/agent-recovery/

It's possible that you have tasks left over from before we had this
support, which means they are not running under the executor slice. These
technically could lose their isolation (as mentioned in the warning). If
you care about the isolation (you likely do in production), then the only
remedy is to restart them.
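
To see which tasks are affected before restarting anything, something like the following might help. It is only a rough sketch: it pulls the pids out of the warning lines and checks each one's `/proc/<pid>/cgroup` for the slice. It assumes a cgroup v1 layout (as on CentOS 7) where the systemd hierarchy shows up as `name=systemd`. The helper names are made up for illustration, and the agent's own check may differ in detail.

```python
import re
from pathlib import Path

SLICE = "mesos_executors.slice"

# Matches the warning text quoted in this thread. \s+ also matches a newline,
# so the pattern still works if the log text has been re-wrapped.
WARNING_RE = re.compile(
    r"Couldn't find pid\s+'(\d+)'\s+in\s+'mesos_executors\.slice'"
)


def pids_from_log(log_text):
    """Return the pids named in the warnings, in order, without duplicates."""
    seen = []
    for match in WARNING_RE.findall(log_text):
        pid = int(match)
        if pid not in seen:
            seen.append(pid)
    return seen


def in_slice(cgroup_text, slice_name=SLICE):
    """Check the contents of a /proc/<pid>/cgroup file for the slice.

    Assumption: cgroup v1, where each line is "hierarchy-id:controllers:path"
    and the systemd hierarchy is listed as "name=systemd".
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        if controllers == "name=systemd" and slice_name in path.split("/"):
            return True
    return False


def pid_in_slice(pid):
    """Read /proc/<pid>/cgroup; return None if the process has gone away."""
    try:
        return in_slice(Path(f"/proc/{pid}/cgroup").read_text())
    except (FileNotFoundError, ProcessLookupError):
        return None
```

Feeding the agent log text to `pids_from_log` and then calling `pid_in_slice` on each result should show which executors are still outside the slice (`False`) and which have already exited (`None`).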

—
*Joris Van Remoortere*
Mesosphere

On Mon, Jun 20, 2016 at 4:45 AM, Qiang Chen <[email protected]> wrote:

> Thanks @Haosdent for the link explaining the shutdown errors, so I can
> ignore those.
>
> @Joris,
>
> 1. I upgraded from 0.25.0 to 0.28.2 on CentOS 7, which has systemd support.
> 2. I didn't make any OS / init system changes
>
> For "That indicates a transition from the old systemd lack of support to
> the new support. "
> >> lack of what support ? would explain more details, and how to fix this?
> or may have other cause ?
>
> Thanks a lot again!
>
>
> On 2016-06-17 21:31, Joris Van Remoortere wrote:
>
> The shutdown errors are not the issue.
> The concerning part is this warning:
>
>> W0615 15:01:43.285518  4182 linux_launcher.cpp:197] Couldn't find pid
>> '42322' in 'mesos_executors.slice'. This can lead to lack of proper
>> resource isolation
>
> That indicates a transition from the old systemd lack of support to the
> new support.
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Fri, Jun 17, 2016 at 2:35 PM, haosdent <[email protected]> wrote:
>
>> Hi, @Qiang.
>>
>> @Joseph has a nice explanation of the "Shutdown failed on fd" errors:
>>
>> http://search-hadoop.com/m/0Vlr6pe7qb2MJX8B1&subj=Re+Benign+Shutdown+failed+on+fd+error+messages
>> Those errors can be ignored.
>>
>> For
>> ```
>> I0615 15:01:43.324935  4172 mem.cpp:602] Started listening for OOM events
>> for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
>> ```
>>
>> These are normal info logs. They appear when the Mesos CgroupMemIsolator
>> registers OOM hooks for your containers.
>>
>> On Fri, Jun 17, 2016 at 8:22 PM, Joris Van Remoortere <[email protected]>
>> wrote:
>>
>> > Can you provide:
>> > 1. The version that you are upgrading from.
>> > 2. Whether you made any OS / init system changes alongside this upgrade
>> > (just to narrow the scope).
>> >
>> > It is possible that you are upgrading from a version that did not have
>> > systemd support to one that does. If so, the upgrade may require
>> > restarting the tasks (either by themselves, or just starting a fresh
>> > agent). Please check out some of the work in MESOS-3007 to get a better
>> > understanding of what the issue I am referring to is.
>> >
>> > If you can verify that you are making one of these transitions from a
>> > bad world to a good world, then you can devise a plan for your upgrade.
>> >
>> > Joris
>> >
>> > —
>> > *Joris Van Remoortere*
>> > Mesosphere
>> >
>> > On Fri, Jun 17, 2016 at 8:28 AM, Qiang Chen <[email protected]> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I ran into an issue when upgrading mesos-slave to 0.28.2.
>> > >
>> > > While mesos-slave was recovering the framework containers, it
>> > > produced the following warnings and errors.
>> > >
>> > >
>> > > ```
>> > > Log file created at: 2016/06/15 15:01:43
>> > > Running on machine: mesos-slave-online005-xxx.cloud.xxx.domain
>> > > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
>> > > W0615 15:01:43.285518  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42322' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > W0615 15:01:43.286182  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42312' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > W0615 15:01:43.286669  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42309' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > W0615 15:01:43.287144  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42304' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > W0615 15:01:43.287636  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42300' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > W0615 15:01:43.288120  4182 linux_launcher.cpp:197] Couldn't find pid
>> > > '42317' in 'mesos_executors.slice'. This can lead to lack of proper
>> > > resource isolation
>> > > E0615 15:01:43.471676  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > > E0615 15:01:43.476007  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > > E0615 15:01:43.476143  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > > E0615 15:01:43.476272  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > > E0615 15:01:43.476483  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > > E0615 15:01:43.476618  4201 process.cpp:1958] Failed to shutdown socket
>> > > with fd 24: Transport endpoint is not connected
>> > >
>> > > ```
>> > >
>> > > It also seems to produce OOM-related messages, such as:
>> > >
>> > > ```
>> > > I0615 15:01:43.324935  4172 mem.cpp:602] Started listening for OOM events
>> > > for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
>> > > I0615 15:01:43.325469 4172 mem.cpp:722] Started listening on low memory
>> > > pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
>> > > I0615 15:01:43.326004  4172 mem.cpp:722] Started listening on medium
>> > > memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
>> > > I0615 15:01:43.326539  4172 mem.cpp:722] Started listening on critical
>> > > memory pressure events for container f50b4c7a-d1d2-4fc8-abb9-5ab549f168dc
>> > >
>> > > ```
>> > >
>> > > Has anyone else run into this? Thanks.
>> > >
>> > > --
>> > > Best Regards,
>> > > Chen, Qiang
>> > >
>> > >
>> >
>>
>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
>
> --
> Best Regards,
> Chen, Qiang
>
>
