I have seen slurmctld use more than 20GB of virtual memory while its RSS stays below 1GB. I am not sure whether this is OK or a sign of a memory leak.
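For anyone who wants to compare the two figures on their own controller, here is a minimal sketch that reads the `VmSize` (virtual) and `VmRSS` (resident) fields from `/proc/<pid>/status` on Linux. The helper names `parse_proc_status` and `read_memory_kb` are mine, and the sample values are made up to roughly match the numbers above; they are not real measurements.

```python
import re


def parse_proc_status(text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def read_memory_kb(pid):
    """Read the live values for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())


# Illustrative sample only: about 20GB virtual, about 900MB resident.
sample = "Name:\tslurmctld\nVmSize:\t20971520 kB\nVmRSS:\t  921600 kB\n"
usage = parse_proc_status(sample)
print(usage)
```

On a live system, `read_memory_kb(pid)` with the slurmctld PID should return the same two fields. VmRSS growing steadily over time would be more worrying than a large but stable VmSize, since virtual size also counts address space that has been reserved but never touched.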
On Tue, 2013-06-25 at 11:56 -0700, Mario Kadastik wrote:
> Hi,
>
> is it normal for slurmctld to consume in excess of 10GB of ram? I had
> original slurm controller VM created with 2GB of ram, that caused at times
> slurm to die due to OOM killer. I increased it to 6GB and we could live a
> little longer until now I had to increase it to 10GB because crashes still
> occurred and I now just witnessed the first 10GB OOM kill of slurm
> controller.
>
> We're running 2.5.3. The last 1000 lines of log before crash are here:
> http://cms.hep.kbfi.ee/~mario/dbg/slurmctld-preOOM.log
>
> The OOM kill:
> Jun 25 18:21:32 slurm-1 kernel: [5463683.553994] OOM killed process 5070
> (slurmdbd) vm:269284kB, rss:10312kB, swap:628kB
> Jun 25 18:21:32 slurm-1 kernel: [5463683.909668] OOM killed process 802
> (slurmctld) vm:11688184kB, rss:10409300kB, swap:241096kB
>
> The config file: http://cms.hep.kbfi.ee/~mario/dbg/slurm.conf
>
> We have 167 workernodes with a total of ~5000 compute cores. Do we really
> need to give slurm far more RAM or is that amount unreasonable and points
> more likely to a memory leak?
>
> Mario Kadastik, PhD
> Researcher
>
> ---
> "Physics is like sex, sure it may have practical reasons, but that's not
> why we do it"
> -- Richard P. Feynman
