Ah, this time the typo is too confusing - let me correct it:

-> Normally _we_ set the job environment's limit in the queue config.

I originally wrote "we", then changed it to "Grid Engine admin", and
meant to change it back to "we" again, but I got distracted by
something else :-D

Rayson

On Tue, May 29, 2012 at 10:13 AM, Rayson Ho <[email protected]> wrote:
> Hi Robert,
>
> Normally set the job environment's limit in the queue config, however
> the file descriptor limit is not part of the queue limit. I believe
> 4096 comes from the execd's environment, and it gets inherited by the
> shepherd, and then the job.
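
A minimal sketch of the inheritance described above, runnable on any
machine without Grid Engine. It lowers the soft open-files limit in a
subshell and lets a child process report what it inherited. The function
name inherited_soft_limit and the value 64 are hypothetical, chosen only
for illustration.

```shell
#!/bin/bash
# Show that a limit set in a parent process is inherited by its children,
# in the same way a job inherits limits via execd and the shepherd.
inherited_soft_limit() {
    # Run in a subshell so the caller's own limit is left untouched.
    (
        ulimit -S -n "$1"          # lower the soft open-files limit here only
        bash -c 'ulimit -S -n'     # a child process reports the limit it inherited
    )
}

inherited_soft_limit 64
```

By the same mechanism, a "ulimit -n" call placed before the line that
starts execd (in whatever shell or init script launches it on your
system) should carry through to the jobs. Raising the hard limit needs
root.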
>
> You have 2 options:
>
> 1) Change the environment of the execd - i.e. when execd starts, make
> sure the descriptor limit is high enough in the shell (or the init
> environment).
>
> 2) You can set S_DESCRIPTORS, H_DESCRIPTORS with the "execd_params"
> option in sge_conf:
>
> http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_conf.html
>
> Example: H_DESCRIPTORS=5000
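
As a sketch of option 2, the snippet below only builds the line that
would go into the configuration; it does not touch Grid Engine. The
values 16384 and 65536 are assumptions borrowed from the limits.conf
entries in the original question, the comma-separated form is an
assumption about how the two parameters are combined, and the helper
name execd_params_line is hypothetical.

```shell
#!/bin/bash
# Build an execd_params entry that sets soft and hard descriptor limits.
execd_params_line() {
    # $1 = soft limit, $2 = hard limit (example values, adjust to your site)
    printf 'execd_params S_DESCRIPTORS=%s,H_DESCRIPTORS=%s\n' "$1" "$2"
}

execd_params_line 16384 65536
```

The printed line would then be pasted into the global configuration,
which "qconf -mconf" opens in an editor; "qconf -sconf" shows the
result afterwards.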
>
> Rayson
>
>
>
> On Tue, May 29, 2012 at 9:34 AM, Robert Hutton
> <[email protected]> wrote:
>> Hi All,
>>
>> We have some jobs that we'd like to run that need to open about 5000
>> files simultaneously.  When they're run outside the grid they run fine
>> but when run with qsub they fail.  I've run ulimit -H -n inside and
>> outside to see why this might be.  On our head node (foxtrot):
>>
>> $ cat test.sh
>> #!/bin/bash
>> ulimit -H -n
>> $ ./test.sh
>> 65536
>> $ qsub -q longrun.q@foxtrot -cwd -S /bin/bash ./test.sh
>> Your job 215227 ("test.sh") has been submitted
>> $ cat test.sh.o215227
>> 4096
>> $ grep nofile /etc/security/limits.conf
>> #        - nofile - max number of open files
>> *               soft    nofile          16384
>> *               hard    nofile          65536
>>
>> So the above shows that there is a limit of 4096 being set by GridEngine
>> on the max number of open files, but I haven't been able to find where
>> this is being set in order to change it.  Can anyone point me in the
>> right direction?  I'm running Ubuntu 12.04 with 6.2u5-4 from the Ubuntu
>> repos and the Open Grid Scheduler hwloc drop-in upgrade[1].
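
A slightly extended variant of test.sh might help when checking a fix,
since it reports both the soft and the hard limit and compares them
with what the job needs. This is only a sketch: the function names
limit_ok and check_nofile are hypothetical, and 5000 is the file count
mentioned in the question.

```shell
#!/bin/bash
# Report soft/hard open-file limits and compare them with a required count.
limit_ok() {
    # $1 = limit as printed by ulimit (a number or "unlimited")
    # $2 = required number of descriptors
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

check_nofile() {
    local required=$1
    local soft hard
    soft=$(ulimit -S -n)
    hard=$(ulimit -H -n)
    echo "soft=$soft hard=$hard required=$required"
    if limit_ok "$soft" "$required"; then
        echo "OK: soft limit is sufficient"
    elif limit_ok "$hard" "$required"; then
        echo "RAISE: soft limit too low, but the job may run 'ulimit -S -n $required'"
    else
        echo "FAIL: hard limit is below $required"
    fi
}

check_nofile 5000
```

Submitted with the same qsub line as test.sh, a hard limit of 4096
inside the job would give the FAIL branch.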
>>
>> Thanks,
>>
>> Rob
>>
>> [1] http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>
>> --
>> Robert Hutton
>> Senior Systems and Database Administrator
>> Centre for Genomics and Global Health <http://cggh.org>
>> The Wellcome Trust Centre for Human Genetics
>> Roosevelt Drive
>> Oxford
>> OX3 7BN
>> United Kingdom
>> Tel: +44 (0)1865 287721
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
