2012-05-30 0:00, Milan Jurik wrote:
Jim,

Jim Klimov wrote on Tue, 29 May 2012 at 04:16 +0400:
2012-05-29 3:39, Jim Klimov wrote:
Hello all,

On a test box running OI oi_151a3 I found that it cannot restart
services because svc.startd has grown to 2500M in VM size. As a
consequence, fork() fails due to insufficient free VM (swap space)
and services cannot start.

I guess there is a leak somewhere. Was anything like that fixed in
the past month or so?

A bit more research points to an old problem, marked as fixed:
http://defect.opensolaris.org/bz/show_bug.cgi?id=15761

This box did have a pkg/server (startd/duration=child) instance
which did not start well and was left unattended. From what I see
now, a daemon is started and in particular grabs the TCP port,
but SMF thinks the service has failed and restarts it. Further
invocations fail because the port is busy, but the service does not
go into maintenance. I see svc.startd occupying more RAM at a rate
of about 1 MB every 1-2 minutes (glancing at top).
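
As a rough sanity check on those numbers, the arithmetic below estimates
how long such a leak would need to reach the observed size. It is only a
sketch: it assumes the growth rate stays constant and that svc.startd
started from a negligible size, neither of which was measured here.

```python
# Figures taken from the report above; everything else is an assumption.
OBSERVED_SIZE_MB = 2500      # svc.startd VM size when fork() began to fail
LEAK_MB = 1                  # roughly 1 MB gained ...
INTERVAL_MIN = (1, 2)        # ... every 1 to 2 minutes


def hours_to_reach(size_mb, leak_mb, interval_min):
    """Hours needed to grow to size_mb at leak_mb per interval_min minutes."""
    return size_mb / leak_mb * interval_min / 60


fastest = hours_to_reach(OBSERVED_SIZE_MB, LEAK_MB, INTERVAL_MIN[0])
slowest = hours_to_reach(OBSERVED_SIZE_MB, LEAK_MB, INTERVAL_MIN[1])
print(f"{fastest:.0f} to {slowest:.0f} hours")  # prints "42 to 83 hours"
```

Under these assumptions the restart loop would have been running for
roughly two to three and a half days before services stopped starting.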

The symptomatic part can be fixed by simply changing the base
pkg/server startd/duration attribute from "child" to "contract",
but the core problem - svc.startd leaking memory in case of such
unlimited restarts - is still in place. Also, for the past ten
minutes or so since I fixed the pkg/server, the restarter hasn't
released a byte ;)


Is https://www.illumos.org/issues/2801 your problem? :-)

Well, I did not trace that yet, but roughly agree that the
description might fit ;)

I believe the core of the problem is not pkg/server itself;
it just exposes it. I think the problem is with some
likely infinite loop used to restart failed "child" services,
since those restarts do not cause maintenance (by def?) and
instead keep restarting. Maybe some recursion happens instead
of a cycle? That could eat up RAM at least... i.e.:
* initial start
** detected an error condition
*** start again
**** detected an error condition
***** start again
****** detected an error condition
******* start again
...

instead of
* initial start
** detected an error condition
* start again
** detected an error condition
* start again
** detected an error condition
* start again
...
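
The difference between the two patterns above can be sketched in a few
lines. This is a toy model, not svc.startd code: always_fails(),
restart_recursive() and restart_loop() are hypothetical names, and the
"depth" value merely stands in for whatever state each nested restart
would keep alive.

```python
def always_fails():
    # Hypothetical start method that fails every time,
    # like a daemon that finds its TCP port already taken.
    return False


def restart_recursive(start, attempts_left, depth=1):
    """Restart from inside the failed attempt: nesting grows per failure."""
    if start() or attempts_left == 1:
        return depth  # deepest nesting level reached
    return restart_recursive(start, attempts_left - 1, depth + 1)


def restart_loop(start, attempts):
    """Return to the same level before every retry: nesting stays flat."""
    depth = 1
    for _ in range(attempts):
        if start():
            break
    return depth


print(restart_recursive(always_fails, 50))  # prints 50
print(restart_loop(always_fails, 50))       # prints 1
```

If the restarter behaved like the first function, every failed start
would leave something behind that is never released, which would match
the steady growth seen in top. Whether that is what actually happens
would still need to be traced.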

//Jim

_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss
