Specifying --mem to Slurm only tells it to find a node that has that much memory; as far as I know, it does not enforce a limit by itself. That node has that much, so Slurm finds it. You probably want to enable UsePAM, set up the pam.d slurm files, and configure /etc/security/limits.conf to keep users under the 64000MB of physical memory that the node has (minus some padding for the OS, etc.). Is UsePAM enabled in your slurm.conf? Maybe that's what is changing the limits.
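For reference, the pieces described above might look something like the following. This is only a sketch: the file contents and the limit value are illustrative, not taken from any particular site's configuration.

```
# slurm.conf -- enable PAM so slurmd applies PAM session limits to job steps
UsePAM=1

# /etc/pam.d/slurm -- have Slurm job sessions pick up limits.conf
session  required  pam_limits.so

# /etc/security/limits.conf -- cap users below the node's 64 GB of RAM,
# leaving some headroom for the OS (the 62 GB figure is illustrative)
*  hard  rss  62000000
```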
Best,
Bill.

--
Bill Barth, Ph.D., Director, HPC
bba...@tacc.utexas.edu | Phone: (512) 232-7069
Office: ROC 1.435 | Fax: (512) 475-9445

On 4/15/18, 2:28 PM, "slurm-users on behalf of Mahmood Naderan" <slurm-users-boun...@lists.schedmd.com on behalf of mahmood...@gmail.com> wrote:

    Bill,
    The thing is that both the user and root see unlimited virtual memory when they ssh directly to the node. However, when a job is submitted, the user's limits change. That means Slurm modifies something. The script is

        #SBATCH --job-name=hvacSteadyFoam
        #SBATCH --output=hvacSteadyFoam.log
        #SBATCH --ntasks=32
        #SBATCH --time=100:00:00
        #SBATCH --mem=64000M
        ulimit -a
        mpirun hvacSteadyFoam -parallel

    The physical memory on the node is 64GB, therefore I specified 64000M for --mem. Is that correct? The only thing I can guess is that --mem also modifies the virtual memory limit, though I am not sure.

    Regards,
    Mahmood

    On Sun, Apr 15, 2018 at 11:32 PM, Bill Barth <bba...@tacc.utexas.edu> wrote:
    > Mahmood, sorry to presume. I meant to address the root user and your ssh to the node in your example.
    >
    > At our site, we use UsePAM=1 in our slurm.conf, and our /etc/pam.d/slurm and slurm.pam files both contain pam_limits.so, so it could be that way for you, too. I.e., Slurm could be setting the limits for your users' job scripts, while for root SSHes the limits are being set by PAM through another config file. Also, root's limits may be set differently by PAM (in /etc/security/limits.conf) or by the kernel at boot time.
    >
    > Finally, users should be careful using ulimit in their job scripts because it only changes the limits for that shell process, not across nodes. This jobscript appears to apply to only one node, but if they want different limits for jobs that span nodes, they may need to use other features of Slurm to apply them across all the nodes their job uses (cgroups, perhaps?).
    >
    > Best,
    > Bill.
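To see which limits Slurm actually hands to a batch shell, a minimal diagnostic jobscript along these lines can be submitted and its output compared against "ulimit -a" from a plain ssh login on the same node. This is a sketch, not Mahmood's actual script; the job name and memory request are placeholders.

```shell
#!/bin/bash
#SBATCH --job-name=limitcheck   # hypothetical job name
#SBATCH --ntasks=1
#SBATCH --mem=1000M             # small placeholder request

# Print the limits the batch shell actually received; compare these
# against "ulimit -a" in an interactive ssh session on the same node.
echo "virtual memory (kbytes): $(ulimit -v)"
echo "max memory size (kbytes): $(ulimit -m)"
echo "max locked memory (kbytes): $(ulimit -l)"
```

If the virtual-memory line shows a finite number under Slurm but "unlimited" over ssh, that points at something in the Slurm/PAM path setting the limit rather than the user's login environment.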