Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-16 Thread Reuti

> On 16.07.2019 at 02:33, Derrick Lin wrote:
> 
> Thanks guys,
> 
> >> Correct. The limits in place when sgeexecd is started are used (i.e. those
> >> of the root user).
> I tried simply restarting sgeexecd, but it did not change anything.
> 
> In my /etc/security/limits.conf I have:
> * soft nofile 18000
> * hard nofile 2
> 
> That should apply to every account? The SGE daemons run under user "sge".

They appear to run under sge, but the real user is root (and they should be
started by root):

$ ps -e f -o user,ruser,command
…
sgeadmin root /usr/sge/bin/lx24-em64t/sge_qmaster


> >> Several ulimits can be set in the queue configuration, and so can be
> >> different for each queue or exechost.
> 
> We don't have any ulimit settings in the queue configuration or anywhere
> else in SGE; limits.conf is the only place where they are configured.
> 
> It is so weird that most of the Compute Nodes pick up the settings correctly,
> and only a few fail to pick them up.

Do you log in to the node by SSH? Then you have to restart the SSH daemon
too, as the login process inherits the values the SSH daemon got.

The change to the "nofile" setting should also be visible in the shell when
you log in.

-- Reuti


> Currently, my only workaround is to rebuild the Compute Node (reinstall OS 
> etc) so that it corrects this issue.
> 
> >> Can you check the limits that are set in the sge_execd and sge_shepherd
> processes (/proc//limits)?
> 
> I tried to look it up, but I could not find the  directory that
> corresponds to sgeexecd.
> 
> Cheers,
> Derrick 
> 
> 
> On Thu, Jul 4, 2019 at 12:09 AM Skylar Thompson  wrote:
> Can you check the limits that are set in the sge_execd and sge_shepherd
> processes (/proc//limits)? It's possible that the user who ran the
> execd init script had limits applied, which would carry over to the execd
> process.
> 
> On Wed, Jul 03, 2019 at 12:36:00PM +1000, Derrick Lin wrote:
> > Hi guys,
> > 
> > We have custom settings for user open files in /etc/security/limits.conf on
> > all Compute Nodes. When we check whether the configuration is effective by
> > running "ulimit -a" over SSH on each node, it reflects the correct settings.
> > 
> > But when we ran the same command through SGE (both qsub and qrsh), we found
> > that some Compute Nodes do not reflect the correct settings, while the rest
> > are fine.
> > 
> > I am wondering if this is SGE related? Any idea is welcome.
> > 
> > Cheers,
> > Derrick
> 
> > ___
> > users mailing list
> > users@gridengine.org
> > https://gridengine.org/mailman/listinfo/users
> 
> 
> -- 
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine




Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-15 Thread Derrick Lin
Thanks guys,

>> Correct. The limits in place when sgeexecd is started are used (i.e. those
of the root user).
I tried simply restarting sgeexecd, but it did not change anything.

In my /etc/security/limits.conf I have:
* soft nofile 18000
* hard nofile 2

That should apply to every account? The SGE daemons run under user
"sge".

>> Several ulimits can be set in the queue configuration, and so can be
different for each queue or exechost.

We don't have any ulimit settings in the queue configuration or anywhere
else in SGE; limits.conf is the only place where they are configured.

It is so weird that most of the Compute Nodes pick up the settings
correctly, and only a few fail to pick them up.

Currently, my only workaround is to rebuild the Compute Node (reinstall OS
etc) so that it corrects this issue.

>> Can you check the limits that are set in the sge_execd and sge_shepherd
processes (/proc//limits)?

I tried to look it up, but I could not find the  directory that
corresponds to sgeexecd.

Cheers,
Derrick


On Thu, Jul 4, 2019 at 12:09 AM Skylar Thompson  wrote:

> Can you check the limits that are set in the sge_execd and sge_shepherd
> processes (/proc//limits)? It's possible that the user who ran the
> execd init script had limits applied, which would carry over to the execd
> process.
>
> On Wed, Jul 03, 2019 at 12:36:00PM +1000, Derrick Lin wrote:
> > Hi guys,
> >
> > We have custom settings for user open files in /etc/security/limits.conf on
> > all Compute Nodes. When we check whether the configuration is effective by
> > running "ulimit -a" over SSH on each node, it reflects the correct settings.
> >
> > But when we ran the same command through SGE (both qsub and qrsh), we found
> > that some Compute Nodes do not reflect the correct settings, while the rest
> > are fine.
> >
> > I am wondering if this is SGE related? Any idea is welcome.
> >
> > Cheers,
> > Derrick
>
>
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
>


Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-03 Thread Skylar Thompson
Can you check the limits that are set in the sge_execd and sge_shepherd
processes (/proc//limits)? It's possible that the user who ran the
execd init script had limits applied, which would carry over to the execd
process.
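
If finding the right PID is the sticking point, the check above could be sketched roughly like this. It is only a sketch: it assumes a Linux node with /proc and pgrep available, and that the daemons show up under the process names sge_execd and sge_shepherd. The function names are made up for illustration, and reading another user's /proc entry may need root.

```shell
#!/bin/sh
# Print the "Max open files" row from /proc/<pid>/limits for one PID.
nofile_limit_of_pid() {
    grep '^Max open files' "/proc/$1/limits"
}

# Print that row for every running process whose name matches exactly.
# The names passed in below are assumptions; check `ps -e` on your node.
nofile_limits_by_name() {
    for pid in $(pgrep -x "$1"); do
        printf '%s (pid %s): ' "$1" "$pid"
        nofile_limit_of_pid "$pid"
    done
}

nofile_limits_by_name sge_execd
nofile_limits_by_name sge_shepherd
```

A sge_shepherd only exists while a job is running on the node, so the second call prints nothing on an idle host.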

On Wed, Jul 03, 2019 at 12:36:00PM +1000, Derrick Lin wrote:
> Hi guys,
> 
> We have custom settings for user open files in /etc/security/limits.conf on
> all Compute Nodes. When we check whether the configuration is effective by
> running "ulimit -a" over SSH on each node, it reflects the correct settings.
> 
> But when we ran the same command through SGE (both qsub and qrsh), we found
> that some Compute Nodes do not reflect the correct settings, while the rest
> are fine.
> 
> I am wondering if this is SGE related? Any idea is welcome.
> 
> Cheers,
> Derrick



-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-02 Thread Reuti
Hi,

> On 03.07.2019 at 04:39, Daniel Povey wrote:
> 
> Could it relate to when the daemons were started on those nodes?  I'm not 
> sure exactly at what point those limits are applied, and how they are 
> inherited by child processes.

Correct. The limits in place when sgeexecd is started are used (i.e. those of
the root user).
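
The inheritance itself can be seen without SGE at all. A minimal sketch, assuming a POSIX shell; child_limit_after_lowering is a hypothetical helper name, and 200 is an arbitrary value assumed to be below the current limit:

```shell
#!/bin/sh
# Lower the soft open-files limit inside a subshell, then start a child
# process there. The child reports the lowered value it inherited.
child_limit_after_lowering() {
    (
        ulimit -Sn "$1"      # affects only this subshell and its children
        sh -c 'ulimit -Sn'   # the child inherits the limit when it is started
    )
}

echo "parent soft limit: $(ulimit -Sn)"
echo "child soft limit:  $(child_limit_after_lowering 200)"
echo "parent afterwards: $(ulimit -Sn)"
```

A process keeps the limits it inherited when it was started, so a later edit to limits.conf does not change a daemon that is already running.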


>  If you changed those files recently it might not have taken effect.
> 
> On Tue, Jul 2, 2019 at 10:36 PM Derrick Lin  wrote:
> Hi guys,
> 
> We have custom settings for user open files in /etc/security/limits.conf on 
> all Compute Nodes. When we check whether the configuration is effective by 
> running "ulimit -a" over SSH on each node, it reflects the correct settings.
> 
> But when we ran the same command through SGE (both qsub and qrsh), we found that 
> some Compute Nodes do not reflect the correct settings, while the rest are fine.

Several ulimits can be set in the queue configuration, and so can be different
for each queue or exechost.
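
To see whether a queue overrides anything, the queue definition could be filtered for limit attributes that are not INFINITY. A rough sketch, assuming the s_*/h_* attribute names from queue_conf(5) and a queue called all.q (adjust to your own queue names); non_default_queue_limits is a hypothetical helper:

```shell
#!/bin/sh
# Read `qconf -sq <queue>` output on stdin and print only the resource
# limit attributes whose value is something other than plain INFINITY.
non_default_queue_limits() {
    awk '$1 ~ /^[sh]_(rt|cpu|fsize|data|stack|core|rss|vmem)$/ && $2 != "INFINITY"'
}

# "all.q" is an assumed queue name; skip quietly if qconf is not installed.
if command -v qconf >/dev/null 2>&1; then
    qconf -sq all.q | non_default_queue_limits
fi
```

Per-host overrides appear in the same field, in the form INFINITY,[hostname=value], so they are printed too. No output means the queue leaves all of these limits at INFINITY.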

-- Reuti


> I am wondering if this is SGE related? Any idea is welcome.
> 
> Cheers,
> Derrick




Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-02 Thread Daniel Povey
Could it relate to when the daemons were started on those nodes?  I'm not
sure exactly at what point those limits are applied, and how they are
inherited by child processes.  If you changed those files recently it might
not have taken effect.

On Tue, Jul 2, 2019 at 10:36 PM Derrick Lin  wrote:

> Hi guys,
>
> We have custom settings for user open files in /etc/security/limits.conf
> on all Compute Nodes. When we check whether the configuration is effective by
> running "ulimit -a" over SSH on each node, it reflects the correct settings.
>
> But when we ran the same command through SGE (both qsub and qrsh), we found
> that some Compute Nodes do not reflect the correct settings, while the rest
> are fine.
>
> I am wondering if this is SGE related? Any idea is welcome.
>
> Cheers,
> Derrick
>