I don't see any problems with your configuration. We use valgrind to test for memory leaks across a variety of SLURM configurations, although it is not possible to test them all. It would be great if you could run slurmctld under valgrind and check for leaks:
1. Run configure with --enable-memory-leak-debug.
2. Start slurmctld under valgrind:
   valgrind --tool=memcheck --leak-check=yes --num-callers=6 --leak-resolution=med slurmctld -D >val.out 2>&1
3. After a while, shut it down:
   scontrol shutdown
4. Restart the Slurm daemons normally.
5. Check the end of val.out for a memory leak report.

________________________________________
From: [email protected] [[email protected]] On Behalf Of Phil Sharfstein [[email protected]]
Sent: Wednesday, March 16, 2011 1:26 PM
To: [email protected]
Subject: [slurm-dev] slurmctld high memory utilization

The slurmctld process on my primary control machine is using over 90% of the available memory (16GB). After restarting slurmctld, its memory utilization is only a few percent, but within 24 hours it is again consuming over 90% of the memory. Our Slurm version is 2.2.0 running on RHEL 5.6.

We are using backfill scheduling and cons_res select. Our jobs are all submitted with unlimited time limits and primarily use generic resources and licenses for resource allocation. We have one long-running process using the master resource on each of the nodes that launches a number of parallel slave processes that are scheduled one per node. We will generally have 40 running master processes, 50-100 pending master processes, 40 running slave processes, and 500+ pending slave processes. Slave processes are prioritized (nice value) to ensure that those scheduled by the first-launched master processes jump to the front of the queue, so master jobs finish in the order they were launched in the shortest amount of time. A master process runs for 1+ hours (some finish 24+ hours after launch, waiting for resources to complete their slave jobs), while a single slave process generally completes in 5-20 minutes.

I'm pretty sure that we are doing something wrong with our configuration or conops that is causing the excess memory consumption, but I have not been able to track it down.
Thanks,
-Phil

Our slurm.conf (excuse any typos - this was transcribed from a printout):

ControlMachine=blade0204
ControlAddr=10.1.53.49
BackupController=blade0201
BackupAddr=10.1.53.146
AuthType=auth/munge
CacheGroups=1
CryptoType=crypto/munge
GresTypes=master,slave
Licenses=fcx*3,obc*6
MaxJobCount=3000
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/tmp/slurmd
SlurmUser=bin
StateSaveLocation=/gpfs/fs0/slurm
SwitchType=switch/none
TaskPlugin=task/none
HealthCheckInterval=60
HealthCheckProgram=/etc/slurm/healthcheck.sh
InactiveLimit=0
KillWait=30
MessageTimeout=90
MinJobAge=10
SlurmctldTimeout=90
SlurmdTimeout=300
Waittime=0
FastSchedule=1
SchedulerType=sched/backfill
SchedulerParameters=max_job_bf=1000
SchedulerPort=7321
SelectType=select/cons_res
AccountingStorageType=accounting_storage/none
ClusterName=cluster
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmdDebug=3
NodeName=blade02[01-16] NodeAddr=10.1.153.[146-161] Procs=8 RealMemory=1600 Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN Gres=master:1,slave:1
NodeName=blade03[01-16] NodeAddr=10.1.153.[162-177] Procs=8 RealMemory=1600 Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN Gres=master:1,slave:1
NodeName=blade04[01-16] NodeAddr=10.1.153.[178-193] Procs=8 RealMemory=1600 Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN Gres=master:1,slave:1
PartitionName=clust Nodes=blade02[09-16],blade03[01-16],blade04[01-16] Default=YES MaxTime=INFINITE State=UP
PartitionName=clusttest Nodes=blade02[01-09] Default=NO MaxTime=INFINITE State=UP
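[Editor's note] Once the valgrind run above has produced val.out, the leak totals appear in memcheck's standard "LEAK SUMMARY" block at the end of the file. As a rough illustration of how to pull a single number out of that report, here is a small parser sketch; the sample report text below is fabricated for illustration and is not actual slurmctld output:

```python
import re

# Match the three LEAK SUMMARY categories that indicate real leaks;
# "still reachable" and "suppressed" are deliberately excluded.
LEAK_RE = re.compile(
    r"==\d+==\s+(definitely|indirectly|possibly) lost: ([\d,]+) bytes"
)

def leaked_bytes(report: str) -> int:
    """Sum the bytes valgrind reports as definitely/indirectly/possibly lost."""
    return sum(int(m.group(2).replace(",", "")) for m in LEAK_RE.finditer(report))

# Fabricated sample in valgrind's LEAK SUMMARY format (assumption: not real output).
SAMPLE = """\
==1234== LEAK SUMMARY:
==1234==    definitely lost: 1,024 bytes in 2 blocks
==1234==    indirectly lost: 512 bytes in 1 blocks
==1234==      possibly lost: 0 bytes in 0 blocks
==1234==    still reachable: 2,048 bytes in 4 blocks
"""

if __name__ == "__main__":
    print(leaked_bytes(SAMPLE))  # 1536
```

Tracking this total across successive valgrind runs (e.g. after each 24-hour window) makes it easy to see whether the reported leaks grow in step with the resident memory growth described above.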
