I'd like to share some findings if you don't mind.

Looking through the code, this behavior appears to be expected.
Each time we collect statistics (by default, every 1 second) we:
* Call pids(), which lists the /proc directory.
* For each pid, call status(), which reads /proc/<pid>/status and
  /proc/<pid>/cmdline.

At first I thought that calling status() lazily could help, but in the
situation described, where almost all processes on the system are ours
(i.e. children of our executor), we would end up calling status() for
almost every pid anyway, so laziness gains nothing.
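
To make the cost of one collection pass concrete, here is a rough Python
sketch of the access pattern described above. It is not the Mesos/stout
code (which is C++); the function names list_pids, read_process and
collect_once are made up for illustration, and the only thing it is
meant to show is that each pass does one directory listing plus two file
reads per pid.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the numeric entries of proc_root, i.e. the live pids."""
    return [int(name) for name in os.listdir(proc_root) if name.isdigit()]


def read_process(pid, proc_root="/proc"):
    """Read status and cmdline for one pid; return None if it is gone."""
    base = os.path.join(proc_root, str(pid))
    try:
        with open(os.path.join(base, "status")) as f:
            status = f.read()
        with open(os.path.join(base, "cmdline"), "rb") as f:
            # cmdline arguments are NUL-separated; join them with spaces.
            raw = f.read().replace(b"\0", b" ")
            cmdline = raw.decode(errors="replace").strip()
    except OSError:
        # The process may have exited between the listing and the read.
        return None
    return status, cmdline


def collect_once(proc_root="/proc"):
    """One sampling pass: one listing plus two file reads per pid."""
    snapshot = {}
    for pid in list_pids(proc_root):
        info = read_process(pid, proc_root)
        if info is not None:
            snapshot[pid] = info
    return snapshot
```

Reading status lazily would only help if most pids could be skipped
without looking at them, which is not the case when nearly every process
belongs to the executor's tree.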

Maybe we could increase resource_monitoring_interval to at least 5 seconds?
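
As a back-of-the-envelope estimate of what a longer interval buys, the
snippet below counts /proc accesses per second under the pattern
described above (one listing plus a status and a cmdline read per pid).
The function name and the figure of 1000 pids are hypothetical, chosen
only to illustrate the scaling.

```python
def proc_reads_per_second(num_pids, interval_secs):
    """Estimate /proc accesses per second for a given sampling interval."""
    # One directory listing, plus status and cmdline for every pid.
    reads_per_pass = 1 + 2 * num_pids
    return reads_per_pass / interval_secs


# Hypothetical host with 1000 processes:
every_second = proc_reads_per_second(1000, 1)       # 2001 accesses per second
every_five_seconds = proc_reads_per_second(1000, 5)  # roughly 400 per second
```

The load scales inversely with the interval, so going from 1 to 5
seconds cuts it to a fifth without changing what is collected.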

On Tue Feb 17 2015 at 7:34:24 AM Niklas Nielsen <[email protected]>
wrote:

> Hi James,
>
> This could be related to https://issues.apache.org/jira/browse/MESOS-2254
>
> Niklas
>
> On 15 February 2015 at 11:45, James DeFelice <[email protected]>
> wrote:
>
> > I figured out the source of the defunct procs:
> >
> > https://github.com/mesosphere/kubernetes-mesos/issues/151
> >
> > .. but I wondered if anyone else on this list has had similar experiences
> > with mesos/stout's os::pstree? (additional debug info at the link above)
> >
> > -James
> >
>
