Riccardo,

Your configuration is very close to ours, and we are facing this issue too.
We also use VSizeFactor=101, and we may set it back to "0" because of issues
with how SLURM treats memory.  We are on 14.03.6, and so far SLURM does not
seem to handle memory (for scheduling, preemption, etc.) as well as it handles
CPUs.  For example, we use swap on our compute nodes to handle jobs being
preempted via SUSPEND, but SLURM only looks at the node's available memory
and ignores swap.  We've had to hack the code (still cleaning it up for a
proper pull request) to add an "assume_swap" option to the scheduling
parameters.  This allows preemption via SUSPEND (using partition preemption)
even if a node has all of its memory allocated.

Below is our config.

You might try proctrack/cgroup as your ProctrackType.  Are you using
ConstrainRAMSpace in cgroup.conf?  I still plan to experiment with
ConstrainSwapSpace=yes, reusing our previous VSizeFactor value as
MaxSwapPercent.
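
As a sanity check on the numbers in your log line, here is a small Python
sketch (not SLURM code).  It assumes the two values in the message are in KB,
that the virtual memory limit is the job's real memory limit scaled by
VSizeFactor/100 as slurm.conf describes, and that the job had a 200 GiB memory
limit.  That last figure is inferred from the arithmetic, not something you
stated, and the function and variable names are made up for illustration.

```python
KB_PER_GIB = 1024 * 1024


def vsize_limit_kb(real_limit_kb, vsize_factor):
    """Virtual memory limit as a percentage of the real memory limit."""
    return real_limit_kb * vsize_factor // 100


# Values copied from the slurmstepd error message (assumed to be KB).
reported_vsize_kb = 416329820
reported_limit_kb = 211812352

# Assumption: the job's real memory limit was 200 GiB.
assumed_real_limit_kb = 200 * KB_PER_GIB
computed_limit_kb = vsize_limit_kb(assumed_real_limit_kb, 101)

# If the shared segment is charged once per process, dividing the reported
# virtual size by the 4 processes should give roughly the segment size.
num_processes = 4
per_process_gib = reported_vsize_kb / num_processes / KB_PER_GIB

print("limit matches 200 GiB * 1.01:", computed_limit_kb == reported_limit_kb)
print("virtual size per process (GiB): %.2f" % per_process_gib)
```

If those assumptions hold, the limit matches 200 GiB x 1.01 exactly, and the
reported usage is roughly four times a segment just under 100 GiB, which fits
your reading that the segment is counted once per process rather than once
per job.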

- Trey

# slurm.conf
JobAcctGatherType=jobacct_gather/linux
JobCompType=jobcomp/none
MpiDefault=none
ProctrackType=proctrack/cgroup
PropagateResourceLimits=NONE
SchedulerParameters=assume_swap # our local hack
SchedulerTimeSlice=30
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory,CR_CORE_DEFAULT_DIST_BLOCK
TaskPlugin=task/cgroup
TaskPluginParam=Sched
VSizeFactor=101

# cgroup.conf
CgroupMountpoint=/cgroup
CgroupAutomount=yes
CgroupReleaseAgentDir="/home/slurm/cgroup"

ConstrainCores=yes
TaskAffinity=yes
AllowedRAMSpace=100
AllowedSwapSpace=0
ConstrainRAMSpace=yes
ConstrainSwapSpace=no
MaxRAMPercent=100
MaxSwapPercent=100
MinRAMSpace=30
ConstrainDevices=no
AllowedDevicesFile=/home/slurm/conf/cgroup_allowed_devices_file.conf



=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email: [email protected] 
Jabber: [email protected]

----- Original Message -----
> From: "Riccardo Murri" <[email protected]>
> To: "slurm-dev" <[email protected]>
> Sent: Friday, September 19, 2014 11:02:15 AM
> Subject: [slurm-dev] overcounting of SysV shared memory segments?
> 
> 
> Hello,
> 
> we are having an issue with SLURM killing jobs because of virtual
> memory limits::
> 
>     slurmstepd[46530]: error: Job 784 exceeded virtual memory limit (416329820 > 211812352), being killed
> 
> The problem is that the job above has actually negligible heap use,
> *but* it allocates a SysV shared memory segment of about 100GB.  It
> seems that the size of this shared memory segment is counted towards
> *all* 4 processes in the job, instead of being counted just once.
> 
> Is this expected, or did we misconfigure something?
> 
> We are running 14.03.2. Possibly relevant configuration items::
> 
>     # slurm.conf
>     JobAcctGatherType=jobacct_gather/linux
>     JobCompType=jobcomp/none
>     MpiDefault=none
>     ProctrackType=proctrack/pgid
>     PropagateResourceLimitsExcept=CPU
>     SelectType=select/cons_res
>     SelectTypeParameters=CR_Core_Memory
>     TaskPlugin=task/cgroup
>     VSizeFactor=101
> 
>     # cgroup.conf
>     ConstrainCores=yes
> 
> Thanks for any suggestion!
> 
> Kind regards,
> Riccardo
> 
> --
> Riccardo Murri
> http://www.s3it.uzh.ch/about/team/
> 
> S3IT: Services and Support for Science IT
> University of Zurich
> Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
> Tel: +41 44 635 4222
> Fax: +41 44 635 6888
