I'd like to share some findings, if you don't mind. Digging through the code shows that this behavior is expected. Each time we collect statistics (the default is every 1 second) we:

* Call pids(), which lists the /proc directory.
* For each pid, call status(), which reads /proc/<pid>/status and /proc/<pid>/cmdline.

At first I thought that calling status() lazily could help, but in the described situation, where almost all processes in the system are ours (i.e. children of our executor), we'll end up calling status() for almost every pid, so there is no gain from laziness.
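To make the cost concrete, here is a rough Python sketch of what one sampling pass amounts to. It is not the actual stout/Mesos code (that is C++); the function names list_pids, read_process and collect_once and the proc_root parameter are made up for illustration, and the only thing taken from the real code is the access pattern: list /proc, then read status and cmdline for each pid.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as ints."""
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdigit())


def read_process(pid, proc_root="/proc"):
    """Read status and cmdline for one pid; return None if it has exited."""
    base = os.path.join(proc_root, str(pid))
    try:
        with open(os.path.join(base, "status")) as f:
            status = f.read()
        with open(os.path.join(base, "cmdline"), "rb") as f:
            # cmdline arguments are NUL-separated
            raw = f.read()
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    except OSError:
        # The process may disappear between the listing and the reads.
        return None
    return status, cmdline


def collect_once(proc_root="/proc"):
    """One sampling pass: returns (processes, attempted file reads)."""
    processes = {}
    reads = 0
    for pid in list_pids(proc_root):
        info = read_process(pid, proc_root)
        reads += 2  # one attempt each for status and cmdline
        if info is not None:
            processes[pid] = info
    return processes, reads
```

With N pids and a sampling interval of T seconds this works out to roughly 2N/T file reads per second, plus one directory listing per pass, so going from 1 second to 5 seconds would cut the read rate by about a factor of five.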
Maybe we could increase resource_monitoring_interval to at least 5 seconds?

On Tue Feb 17 2015 at 7:34:24 AM Niklas Nielsen <[email protected]> wrote:

> Hi James,
>
> This could be related to https://issues.apache.org/jira/browse/MESOS-2254
>
> NIklas
>
> On 15 February 2015 at 11:45, James DeFelice <[email protected]>
> wrote:
>
> > I figured out the source of the defunct procs:
> >
> > https://github.com/mesosphere/kubernetes-mesos/issues/151
> >
> > .. but I wondered if anyone else on this list has had similar experiences
> > with mesos/stout's os:pstree? (additional debug info at the link above)
> >
> > -James
> >
>
