Re: [zones-discuss] Zones and defunct processes bypassing the LWP rctl

2007-11-09 Thread Jerry Jelinek
Gael wrote:
> On 11/9/07, Jerry Jelinek <[EMAIL PROTECTED]> wrote:
>> Gael wrote:
>>> I had the bad surprise of finding a production zone impacting a whole
>>> frame this morning... apparently the third-party application running in
>>> it as root (no comment) generated so many processes that the whole frame
>>> was generating a lot of "cannot fork" errors... impacting the other zones
>>> and the GZ ... The LWPs are limited to 500 in that zone (the frame is an
>>> E2900 with 12 CPUs), but troubleshooting showed that the defunct
>>> processes didn't get attached to LWPs and therefore didn't hit the wall
>>> ... Is there any plan to allow some kind of process limit set through the
>>> GZ (the application runs as root, so I do not know how to assign it to a
>>> project)?
>>>
>>> This is a scary issue ...
>> Do you have FSS configured as the default scheduling class on
>> the system?  By itself the max-lwps rctl cannot control this
>> situation but when used in conjunction with FSS things should
>> be fine.  Except for rare cases I would say that you should always
>> be using FSS when you are using zones and sharing all of the
>> system resources amongst the zones.  If you are using pools
>> instead then that recommendation wouldn't apply.
>>
>> Jerry
>>
> 
> I'm using FSS as much as possible, and definitely in this case. Pools are
> only used when we need to hide the real number of CPUs from applications
> that use commands such as psrinfo to determine how many CPUs are present
> in the system.

Since you are using FSS it sounds like you might be hitting the issue
discussed on the following thread:

http://www.opensolaris.org/jive/thread.jspa?threadID=41108&tstart=60

Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zones and defunct processes bypassing the LWP rctl

2007-11-09 Thread Gael
On 11/9/07, Jerry Jelinek <[EMAIL PROTECTED]> wrote:
>
> Gael wrote:
> > I had the bad surprise of finding a production zone impacting a whole
> > frame this morning... apparently the third-party application running in
> > it as root (no comment) generated so many processes that the whole frame
> > was generating a lot of "cannot fork" errors... impacting the other zones
> > and the GZ ... The LWPs are limited to 500 in that zone (the frame is an
> > E2900 with 12 CPUs), but troubleshooting showed that the defunct
> > processes didn't get attached to LWPs and therefore didn't hit the wall
> > ... Is there any plan to allow some kind of process limit set through the
> > GZ (the application runs as root, so I do not know how to assign it to a
> > project)?
> >
> > This is a scary issue ...
>
> Do you have FSS configured as the default scheduling class on
> the system?  By itself the max-lwps rctl cannot control this
> situation but when used in conjunction with FSS things should
> be fine.  Except for rare cases I would say that you should always
> be using FSS when you are using zones and sharing all of the
> system resources amongst the zones.  If you are using pools
> instead then that recommendation wouldn't apply.
>
> Jerry
>

I'm using FSS as much as possible, and definitely in this case. Pools are
only used when we need to hide the real number of CPUs from applications
that use commands such as psrinfo to determine how many CPUs are present
in the system.

Regards

-- 
Gael Martinez

Re: [zones-discuss] Zones and defunct processes bypassing the LWP rctl

2007-11-09 Thread Jerry Jelinek
Gael wrote:
> I had the bad surprise of finding a production zone impacting a whole frame
> this morning... apparently the third-party application running in it as root
> (no comment) generated so many processes that the whole frame was
> generating a lot of "cannot fork" errors... impacting the other zones and
> the GZ ... The LWPs are limited to 500 in that zone (the frame is an E2900
> with 12 CPUs), but troubleshooting showed that the defunct processes didn't
> get attached to LWPs and therefore didn't hit the wall ... Is there any plan
> to allow some kind of process limit set through the GZ (the application runs
> as root, so I do not know how to assign it to a project)?
>
> This is a scary issue ...

Do you have FSS configured as the default scheduling class on
the system?  By itself the max-lwps rctl cannot control this
situation but when used in conjunction with FSS things should
be fine.  Except for rare cases I would say that you should always
be using FSS when you are using zones and sharing all of the
system resources amongst the zones.  If you are using pools
instead then that recommendation wouldn't apply.

Jerry
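
For anyone trying to confirm the same symptom, here is a minimal sketch (not
from the thread) that tallies defunct processes per zone. It assumes the input
is two-column text of the kind that `ps -e -o zone,s` should print when run in
the global zone, with `Z` in the state column marking a zombie. The function
name and the sample data are hypothetical and only there for illustration.

```python
from collections import Counter


def count_defunct_by_zone(ps_output: str) -> Counter:
    """Count zombie (state 'Z') processes per zone.

    Assumed input format: a header line followed by lines holding two
    whitespace-separated columns, zone name and process state.
    """
    counts = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore lines that do not match the assumed format
        zone, state = fields
        if state == "Z":
            counts[zone] += 1
    return counts


# Illustrative sample only, not captured from a real system.
SAMPLE = """\
    ZONE S
  global S
    web1 Z
    web1 Z
    web1 S
     db1 O
"""

if __name__ == "__main__":
    for zone, n in count_defunct_by_zone(SAMPLE).most_common():
        print(f"{zone}: {n} defunct")
```

On a live system the text would come from running ps rather than from a
hard-coded sample. A zone whose defunct count keeps growing while its LWP
count stays under the max-lwps limit would match the behaviour described
above.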