We use the valgrind tool to test all Slurm daemons for memory leaks
with a variety of configurations. See if you can identify the source
of the leaks. Instructions are in src/slurmctld/controller.c:

/**************************************************************************\
  * To test for memory leaks, set MEMORY_LEAK_DEBUG to 1 using
  * "configure --enable-memory-leak-debug" then execute
  * $ valgrind --tool=memcheck --leak-check=yes --num-callers=8 \
  *   --leak-resolution=med ./slurmctld -Dc >valg.ctld.out 2>&1
  *
  * Then exercise the slurmctld functionality before executing
  * > scontrol shutdown
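
While valgrind is running, or on a production daemon where valgrind is too slow, it can also help to log virtual size and RSS over time to see which one is actually growing. The following is only a minimal sketch and is not part of Slurm: the function names parse_status and read_memory are made up for illustration, and it assumes a Linux /proc filesystem where /proc/<pid>/status reports VmSize and VmRSS in kB.

```python
import re


def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status.

    Returns a dict such as {"VmSize": 11688184, "VmRSS": 10409300}.
    A field missing from the input is simply absent from the result.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_memory(pid):
    """Read the current VmSize/VmRSS of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_status(status_file.read())
```

Calling read_memory() with the slurmctld PID at intervals and logging the result shows whether RSS climbs steadily (which would suggest a real leak) or whether only the virtual size is large.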




Quoting Mario Kadastik <[email protected]>:

>
>> I have encountered that slurmctld uses more than 20GB of virtual memory.
>> But the RSS is less than 1GB. I am not sure whether this is OK or there
>> is some leakage.
>>
>> On Tue, 2013-06-25 at 11:56 -0700, Mario Kadastik wrote:
>>> The OOM kill:
>>> Jun 25 18:21:32 slurm-1 kernel: [5463683.553994] OOM killed  
>>> process 5070 (slurmdbd) vm:269284kB, rss:10312kB, swap:628kB
>>> Jun 25 18:21:32 slurm-1 kernel: [5463683.909668] OOM killed  
>>> process 802 (slurmctld) vm:11688184kB, rss:10409300kB, swap:241096kB
>
>
> As you can see, in my case the RSS was 10GB and that was the cause
> for the kill. The VM was 11GB. Maybe I should increase the
> virtual machine's VM size, which might keep the RSS down a bit, but
> I doubt it. If there have been memory leak
> improvements between 2.5.3 and the current release, then I could
> upgrade, but I'd really like to know whether this is a known effect, as
> this is a production system.
>
> Thanks,
>
> Mario Kadastik, PhD
> Researcher
>
> ---
>   "Physics is like sex, sure it may have practical reasons, but  
> that's not why we do it"
>      -- Richard P. Feynman
>
