Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf
> On 16.07.2019 at 02:33, Derrick Lin wrote:
>
> Thanks guys,
>
> >> Correct. The limits in place when sgeexecd is started are used (i.e. those
> >> of the root user).
>
> I tried to simply restart the sgeexecd but it does not change anything.
>
> In my /etc/security/limits.conf I have:
>
> * soft nofile 18000
> * hard nofile 2
>
> That should apply to every account? The SGE daemons are run under user "sge".

They appear to run under sge, but they actually run under the root account (and should be started by root):

$ ps -e f -o user,ruser,command
…
sgeadmin root /usr/sge/bin/lx24-em64t/sge_qmaster

> >> Several ulimits can be set in the queue configuration, and so can be
> >> different for each queue or exechost.
>
> We don't have any ulimit settings inside the queue configuration or other SGE
> parts; limits.conf is the only place where they are configured.
>
> It is so weird that most of the Compute Nodes pick up the settings correctly;
> only a few fail to pick them up.

Do you log in by SSH to the node? Then you have to restart the SSH daemon too, as the login process inherits the values the SSH daemon got. The changes to the "nofile" setting should also be visible in the shell when you log in.

-- Reuti

> Currently, my only workaround is to rebuild the Compute Node (reinstall OS
> etc.) so that it corrects this issue.
>
> >> Can you check the limits that are set in the sge_execd and sge_shepherd
> >> processes (/proc//limits)?
>
> I tried to look it up, but I could not find the directory corresponding to
> the sgeexecd.
>
> Cheers,
> Derrick
Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf
Thanks guys,

>> Correct. The limits in place when sgeexecd is started are used (i.e. those of the root user).

I tried to simply restart the sgeexecd but it does not change anything.

In my /etc/security/limits.conf I have:

* soft nofile 18000
* hard nofile 2

That should apply to every account? The SGE daemons are run under user "sge".

>> Several ulimits can be set in the queue configuration, and so can be different for each queue or exechost.

We don't have any ulimit settings inside the queue configuration or other SGE parts; limits.conf is the only place where they are configured.

It is so weird that most of the Compute Nodes pick up the settings correctly; only a few fail to pick them up.

Currently, my only workaround is to rebuild the Compute Node (reinstall OS etc.) so that it corrects this issue.

>> Can you check the limits that are set in the sge_execd and sge_shepherd processes (/proc//limits)?

I tried to look it up, but I could not find the directory corresponding to the sgeexecd.

Cheers,
Derrick
Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf
Can you check the limits that are set in the sge_execd and sge_shepherd processes (/proc//limits)? It's possible that the user who ran the execd init script had limits applied, which would carry over to the execd process.

On Wed, Jul 03, 2019 at 12:36:00PM +1000, Derrick Lin wrote:
> Hi guys,
>
> We have custom settings for user open files in /etc/security/limits.conf on
> all Compute Nodes. When checking whether the configuration is effective with
> "ulimit -a" by SSH to each node, it reflects the correct settings.
>
> But when we ran the same command through SGE (both qsub and qrsh), we found
> that some Compute Nodes do not reflect the correct settings while the rest
> are fine.
>
> I am wondering if this is SGE related? Any idea is welcome.
>
> Cheers,
> Derrick

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
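For readers who, like Derrick, cannot find the right directory: on Linux each process exposes its limits as /proc/<pid>/limits, where <pid> is the numeric process ID. The following is a minimal sketch, not a tested procedure for any particular SGE installation. It assumes the daemon's process name is sge_execd (as used in this thread) and that pgrep is installed; nofile_limits and show_execd_limits are hypothetical helper names.

```shell
#!/bin/sh
# Print the soft and hard "Max open files" values from a limits file
# in the format of /proc/<pid>/limits.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Assumption: the execution daemon shows up under the process name "sge_execd".
# Prints one "<pid>: <soft> <hard>" line per matching process.
show_execd_limits() {
    for pid in $(pgrep -x sge_execd); do
        printf '%s: ' "$pid"
        nofile_limits "/proc/$pid/limits"
    done
}
```

Running show_execd_limits as root on a node that behaves and on one that does not should show whether the daemon itself started with different limits. Running nofile_limits /proc/self/limits inside a job reads the limits of the awk process started by the job, which inherits them from the job's shell.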
Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf
Hi,

> On 03.07.2019 at 04:39, Daniel Povey wrote:
>
> Could it relate to when the daemons were started on those nodes? I'm not
> sure exactly at what point those limits are applied, and how they are
> inherited by child processes.

Correct. The limits in place when sgeexecd is started are used (i.e. those of the root user).

> If you changed those files recently it might not have taken effect.
>
> On Tue, Jul 2, 2019 at 10:36 PM Derrick Lin wrote:
> Hi guys,
>
> We have custom settings for user open files in /etc/security/limits.conf on
> all Compute Nodes. When checking whether the configuration is effective with
> "ulimit -a" by SSH to each node, it reflects the correct settings.
>
> But when we ran the same command through SGE (both qsub and qrsh), we found that
> some Compute Nodes do not reflect the correct settings while the rest are fine.

Several ulimits can be set in the queue configuration, and so can be different for each queue or exechost.

-- Reuti

> I am wondering if this is SGE related? Any idea is welcome.
>
> Cheers,
> Derrick
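To check whether a queue sets any of the limits Reuti mentions, one can filter the queue definition for the soft/hard limit attributes. The sketch below is an illustration only: it assumes the attribute names documented in queue_conf(5) (s_rt, h_rt, s_cpu, h_cpu, s_fsize, h_fsize, s_data, h_data, s_stack, h_stack, s_core, h_core, s_rss, h_rss, s_vmem, h_vmem) and that unset limits are shown as INFINITY. queue_limits is a hypothetical helper name and all.q is only an example queue name.

```shell
#!/bin/sh
# Read a queue definition (the output of "qconf -sq <queue>") on stdin and
# print only the limit attributes whose value is not plain INFINITY.
queue_limits() {
    awk '$1 ~ /^[sh]_(rt|cpu|fsize|data|stack|core|rss|vmem)$/ && $2 != "INFINITY"'
}

# Example usage (assumes a queue named all.q exists):
#   qconf -sq all.q | queue_limits
```

An empty result suggests that the queue imposes none of these limits, in which case the values seen by jobs would come from the environment in which the execution daemon was started.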
Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf
Could it relate to when the daemons were started on those nodes? I'm not sure exactly at what point those limits are applied, and how they are inherited by child processes. If you changed those files recently it might not have taken effect.

On Tue, Jul 2, 2019 at 10:36 PM Derrick Lin wrote:
> Hi guys,
>
> We have custom settings for user open files in /etc/security/limits.conf
> on all Compute Nodes. When checking whether the configuration is effective with
> "ulimit -a" by SSH to each node, it reflects the correct settings.
>
> But when we ran the same command through SGE (both qsub and qrsh), we found
> that some Compute Nodes do not reflect the correct settings while the rest
> are fine.
>
> I am wondering if this is SGE related? Any idea is welcome.
>
> Cheers,
> Derrick
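On the inheritance question: on Linux, resource limits belong to a process, are copied to its children at fork, and are kept across exec. A daemon therefore keeps the limits it had when it was started, and so does everything it launches. The following is a small sketch of that behaviour, assuming a POSIX shell whose ulimit builtin supports -S and -n (bash and dash do); child_nofile is a hypothetical helper name.

```shell
#!/bin/sh
# Lower the soft open-files limit inside a subshell, then start a new child
# process from it. The child reports the lowered value, which shows that it
# inherited the limit from its parent instead of re-reading limits.conf.
child_nofile() {
    (
        ulimit -S -n "$1"       # affects only this subshell and its children
        sh -c 'ulimit -S -n'    # a separate child process, like a job under a daemon
    )
}

# Prints 256, provided the current hard limit is at least 256.
child_nofile 256
```

The calling shell's own limit is unchanged afterwards, because the change was made in a subshell.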