"Can't propagate RLIMIT_...". I don't see such messages in my slurm.log. Thanks anyway.
Maybe there is an error in that article:

---citation---
When the srun command executes, it captures the resource limits in effect at submit time. These limits are propagated to the allocated _nodes_ before initiating the user's job. The SLURM daemon running on _that node_ then tries to establish identical resource limits for the job being initiated.
---end of citation---

The second sentence mentions _nodes_, plural; the third sentence refers to them as "that node".

13.10.2014, 23:48, "[email protected]" <[email protected]>:
> perhaps this will help:
> http://slurm.schedmd.com/faq.html#rlimit
>
> Quoting Vsevolod Nikonorov
> <[email protected]>:
>> Hi,
>>
>> I am seeing the mentioned warning now and then. As I discovered, there
>> is a configuration parameter in slurm.conf,
>> PropagateResourceLimitsExcept, which determines whether the system
>> rlimit_memlock value is propagated to a submitted job or not. I have
>> this parameter set to "NOFILE", meaning that it should propagate the
>> system value of rlimit_memlock (there are lines "* soft memlock
>> unlimited" and "* hard memlock unlimited" in my
>> /etc/security/limits.conf, which, as I believe, set the
>> rlimit_memlock value to unlimited). Nevertheless, my users sometimes
>> complain about the mentioned warning appearing in their jobs' output
>> files. I use the following command to check whether the node where a
>> user's job failed is configured properly:
>>
>> srun -w <node name> bash -c "ulimit -l"
>>
>> Sometimes I get "32" as the result. Restarting slurmd on the node
>> fixes the issue. Is this problem known for slurm-14.03.7?
>>
>> Thanks in advance!
>
> --
> Morris "Moe" Jette
> CTO, SchedMD LLC

--
Vsevolod Nikonorov, JSC NIKIET
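The per-node check described in the thread (srun -w <node> bash -c "ulimit -l") can be wrapped in a small helper. A minimal sketch, assuming the expected memlock value is "unlimited"; the function name check_memlock is mine, not something from Slurm:

```shell
#!/bin/sh
# Hypothetical helper: compare the memlock limit reported by a node
# against the expected value. In practice the reported value would come
# from: srun -w <node> bash -c 'ulimit -l'
check_memlock() {
  reported=$1
  if [ "$reported" = "unlimited" ]; then
    echo "ok"
  else
    # Per the thread, restarting slurmd on the node restores the limit.
    echo "bad limit '$reported' - restart slurmd on that node"
  fi
}

# Example: a healthy node reports "unlimited", a broken one reports "32".
check_memlock unlimited
check_memlock 32
```

Looping this over the output of `sinfo -h -o %n` would give a quick cluster-wide audit, restarting slurmd only where the reported limit is wrong.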
