On 03.06.2011 at 15:46, John Young wrote:

> One of the engineers here is having problems with any job
> that tries to use more than 1024 cores. His csh script is
> getting a 'Too many open files' error, so I tried raising
> the descriptors limit in the shell from 1024 to 65535.
> That seems to have worked for interactive logins, but not
> for gridengine jobs.
>
> If I ssh to one of the client nodes and issue a 'limit'
> command, I get:
>
> % ssh compute-1-6 limit
> cputime      unlimited
> filesize     unlimited
> datasize     unlimited
> stacksize    10240 kbytes
> coredumpsize 0 kbytes
> memoryuse    unlimited
> vmemoryuse   unlimited
> descriptors  65535
> memorylocked 32 kbytes
> maxproc      131072
>
> but if I submit a script that contains:
>
> #
> limit
> #
> echo 'cat /proc/sys/fs/file-max'
> cat /proc/sys/fs/file-max
> #
>
> I get (from the same client as above) in the logfile:
>
> cputime      unlimited
> filesize     unlimited
> datasize     unlimited
> stacksize    unlimited
> coredumpsize 0 kbytes
> memoryuse    unlimited
> vmemoryuse   unlimited
> descriptors  1024
> memorylocked 32 kbytes
> maxproc      524288
> cat /proc/sys/fs/file-max
> 6448170
>
> Please note that 'descriptors' is still showing 1024 instead
> of 65535. Any idea where that is coming from? Why is gridengine
> using a different value than the one that I get when I just ssh
> into a node?

It's the value that was in effect when the execd was started; jobs
inherit their limits from the execd, not from a login shell. But you
can override it in SGE's configuration. Please have a look at
`man sge_conf`, section "execd_params", where you can define
"S_DESCRIPTORS" and "H_DESCRIPTORS".

-- Reuti

> Any suggestions?
>
> JY

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
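
As a rough sketch of the suggestion above: the parameter names S_DESCRIPTORS and H_DESCRIPTORS come from the reply, but the value 65535 (taken from the interactive limit in the question), the use of `qconf -mconf` to edit the configuration, and the comma as list separator are assumptions that should be checked against `man sge_conf` and `man qconf` for your version. The runnable part is a small sh job script that reports the soft and hard descriptor limits a job actually receives, so the effect of the change can be verified from the job's logfile.

```shell
#!/bin/sh
# Assumed configuration entry, added via `qconf -mconf` (example values,
# separator syntax not verified here; see `man sge_conf`):
#
#   execd_params   S_DESCRIPTORS=65535,H_DESCRIPTORS=65535
#
# Job script: print the descriptor limits seen inside the job.
soft=$(ulimit -Sn)   # soft limit on open file descriptors
hard=$(ulimit -Hn)   # hard limit on open file descriptors
echo "descriptors soft=$soft hard=$hard"
```

In a csh job script, the equivalent checks would be `limit descriptors` and `limit -h descriptors`.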
