Perhaps this will help:
http://slurm.schedmd.com/faq.html#rlimit

Quoting Всеволод Никоноров <[email protected]>:

Hi,

I see the mentioned warning now and then. As I discovered, slurm.conf has a configuration parameter, PropagateResourceLimitsExcept, which determines whether the system rlimit_memlock value is propagated to a submitted job. I have this parameter set to "NOFILE", which means every limit except NOFILE, including rlimit_memlock, should be propagated. My /etc/security/limits.conf contains the lines "* soft memlock unlimited" and "* hard memlock unlimited", which, as I understand it, set rlimit_memlock to unlimited.

Nevertheless, my users sometimes complain that the warning shows up in their jobs' output files. I use the following command to check whether the node where a user's job failed is configured properly:

srun -w <node name> bash -c "ulimit -l"

Sometimes I get "32" as the result. Restarting slurmd on the node fixes the issue. Is this problem known for slurm-14.03.7?
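
The per-node check above could be wrapped in a small script to run it across several nodes at once. This is only a minimal sketch: the function names classify_memlock and check_nodes are hypothetical helpers, not part of Slurm, and it assumes bash, where "ulimit -l" prints either "unlimited" or a number in 1024-byte units.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Slurm): interpret the output of `ulimit -l`.
# bash prints either the word "unlimited" or a number of 1024-byte units.
classify_memlock() {
    local value="$1"
    case "$value" in
        unlimited)   echo "ok" ;;                        # limit was propagated as expected
        ''|*[!0-9]*) echo "unknown" ;;                   # empty or non-numeric output
        *)           echo "limited (${value} KiB)" ;;    # e.g. the "32" seen above
    esac
}

# Hypothetical helper: run the same srun check as above on each node given
# as an argument and print one line per node.
check_nodes() {
    local node value
    for node in "$@"; do
        # If srun fails, fall back to an empty value, reported as "unknown".
        value=$(srun -w "$node" bash -c "ulimit -l" 2>/dev/null) || value=""
        printf '%s: %s\n' "$node" "$(classify_memlock "$value")"
    done
}
```

After sourcing the file, something like "check_nodes node01 node02" (placeholder node names) would print one line per node, so nodes reporting a numeric limit such as 32 stand out from those reporting "ok".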

Thanks in advance!


--
Morris "Moe" Jette
CTO, SchedMD LLC
