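The attached patch should fix the memory leak. The leak occurs only when
select/cons_res, sched/backfill, and generic resources (gres) are all
configured. The old code destroyed the gres_list of only the first
node_usage record; the patch destroys it for every node.
The fix will be in SLURM version 2.2.5. Please verify that it resolves
your memory leak.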
Index: src/plugins/select/cons_res/select_cons_res.c
===================================================================
--- src/plugins/select/cons_res/select_cons_res.c (revision 23019)
+++ src/plugins/select/cons_res/select_cons_res.c (working copy)
@@ -493,10 +493,15 @@
 static void _destroy_node_data(struct node_use_record *node_usage,
 			       struct node_res_record *node_data)
 {
+	int i;
+
 	xfree(node_data);
 	if (node_usage) {
-		if (node_usage->gres_list)
-			list_destroy(node_usage->gres_list);
+		for (i = 0; i < select_node_cnt; i++) {
+			if (node_usage[i].gres_list) {
+				list_destroy(node_usage[i].gres_list);
+			}
+		}
 		xfree(node_usage);
 	}
 }
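
For anyone following the thread, here is a minimal standalone sketch of
the pattern the patch corrects. It is not SLURM code: struct usage_rec,
make_usage, destroy_first_only, destroy_all, and the live_allocs counter
are hypothetical stand-ins, and a plain malloc'd int takes the place of
gres_list. The point it illustrates is that node_usage->gres_list is the
same thing as node_usage[0].gres_list, so the old code released one
list out of select_node_cnt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-node usage record (not SLURM's struct). */
struct usage_rec {
	int *gres;	/* stands in for gres_list: one heap allocation per node */
};

/* Number of per-node allocations that have not been freed yet. */
static int live_allocs = 0;

/* Allocate an array of n records, each owning its own allocation. */
static struct usage_rec *make_usage(int n)
{
	struct usage_rec *u;
	int i;

	assert(n > 0);
	u = calloc((size_t)n, sizeof(*u));
	if (!u)
		return NULL;
	for (i = 0; i < n; i++) {
		u[i].gres = malloc(sizeof(int));
		if (u[i].gres)
			live_allocs++;
	}
	return u;
}

/* Pre-patch pattern: u->gres is u[0].gres, so only element 0 is freed. */
static void destroy_first_only(struct usage_rec *u)
{
	if (u) {
		if (u->gres) {
			free(u->gres);
			live_allocs--;
		}
		free(u);
	}
}

/* Patched pattern: walk all n elements before freeing the array itself. */
static void destroy_all(struct usage_rec *u, int n)
{
	int i;

	if (u) {
		for (i = 0; i < n; i++) {
			if (u[i].gres) {
				free(u[i].gres);
				live_allocs--;
			}
		}
		free(u);
	}
}
```

With an array of n records, destroy_first_only leaves n - 1 allocations
outstanding on every call, while destroy_all leaves none. That per-call
loss is the kind of thing valgrind reports as "definitely lost".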
________________________________________
From: [email protected] [[email protected]] On Behalf
Of Tommi T [[email protected]]
Sent: Tuesday, April 05, 2011 12:33 AM
To: [email protected]
Subject: [slurm-dev] Gres-plugin leaks memory?
Hello,
After configuring the gres plugin, slurmctld started leaking a lot of memory:
==25614== 29,259,064 (12,594,792 direct, 16,664,272 indirect) bytes in 1,229 blocks are definitely lost in loss record 369 of 369
==25614==    at 0x4C20E1C: malloc (vg_replace_malloc.c:195)
==25614==    by 0x478ADC: slurm_xmalloc (xmalloc.c:94)
==25614==    by 0x47C25E: list_alloc_aux (list.c:964)
==25614==    by 0x47C951: list_create (list.c:905)
==25614==    by 0x4FBC3C: gres_plugin_node_state_dup (gres.c:1709)
==25614==    by 0x65B950A: _dup_node_usage (select_cons_res.c:401)
Here are the config files and the full valgrind log:
http://www.puuppa.org/~teve/slurm-current.conf
http://www.puuppa.org/~teve/val2.out.bz2
http://www.puuppa.org/~teve/gres.conf
TIA,
Tommi