I have a support request from a customer asking for a way to preserve
dynamically updated partition state information across an "scontrol
reconfigure" command. Currently, the state of nodes is preserved across
an "scontrol reconfig", but partitions are completely rebuilt from the
slurm.conf information.

The dynamic partition state does persist across a SIGHUP to slurmctld, but
sending one with "kill -s SIGHUP <slurmctld pid>" is awkward and happens
outside of SLURM. In addition, if a "reconfig" is requested for some
other reason, such as changing the log file location or updating node
information, the partitions are reconstructed and their current state
is lost.

I propose a new option in the "slurm.conf" file, for example
"ReconfKeepPartState", to satisfy this request. The option would default
to "NO" (stored internally as 0) to preserve the old behavior, and could
be set to "YES" to request the new behavior of keeping/merging the
partition information on an "scontrol reconfig".

Here is the proposed addition to the "slurm.conf" man page to describe
this option:

ReconfKeepPartState
    If set to YES, an "scontrol reconfig" command will maintain the
    in-memory state of partitions that may have been dynamically
    updated by "scontrol update". Partition information in the
    slurm.conf file will be merged with in-memory data. The default
    is NO, which will rebuild the partition information using only the
    definitions in the slurm.conf file, whenever an "scontrol
    reconfig" is done.

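As an illustration of how the option might be used, here is a small
slurm.conf fragment. The partition name "debug" and the node names are
made up for the example; only ReconfKeepPartState is the proposed option.

```
# slurm.conf (fragment; partition and node names are hypothetical)
ReconfKeepPartState=YES
PartitionName=debug Nodes=tux[0-3] Default=YES State=UP
```

With this in place, I would expect a partition changed at run time, for
example with "scontrol update PartitionName=debug State=DOWN", to stay
DOWN after a later "scontrol reconfig". With the default of NO it would
go back to the State=UP given in the file.
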
Essentially, the new option causes slurmctld to handle partitions on an
"scontrol reconfig" as if a SIGHUP had been received, instead of
completely rebuilding the partition information. Since this is an
enhancement, the proposed patch set below, which implements the change,
is against the SLURM 2.4.0 version.

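The decision the patch makes can be sketched as a small standalone
function. This is only a simplified model of the changed test in
slurmctld's read_config.c, not the real code: the function name
should_restore_part_state is made up, and I am assuming that the SIGHUP
path is the one that reaches this test with recover > 1.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the patched test (hypothetical helper, not part of
 * SLURM). The old partition state is restored when there is a saved
 * partition list and either
 *   - recover > 1 (assumed here to be the existing SIGHUP behavior), or
 *   - the proposed ReconfKeepPartState option is set.
 */
static bool should_restore_part_state(bool have_old_part_list, int recover,
				      uint16_t reconf_keep_part_state)
{
	return have_old_part_list &&
	       ((recover > 1) || (reconf_keep_part_state != 0));
}
```

With the option left at its default, a call with a lower recover value
still rebuilds the partitions from slurm.conf, so the old behavior is
unchanged.
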
-Don Albert-
Index: s240rc3/slurm/contribs/perlapi/libslurm/perl/conf.c
===================================================================
RCS file: /cvsroot/slurm/slurm/contribs/perlapi/libslurm/perl/conf.c,v
retrieving revision 1.1.1.6
diff -u -r1.1.1.6 conf.c
--- s240rc3/slurm/contribs/perlapi/libslurm/perl/conf.c 12 Nov 2010 17:18:19 -0000 1.1.1.6
+++ s240rc3/slurm/contribs/perlapi/libslurm/perl/conf.c 16 Nov 2011 23:36:44 -0000
@@ -149,6 +149,7 @@
STORE_FIELD(hv, conf, propagate_rlimits, charp);
if(conf->propagate_rlimits_except)
STORE_FIELD(hv, conf, propagate_rlimits_except, charp);
+ STORE_FIELD(hv, conf, reconf_keep_part_state, uint16_t);
if(conf->resume_program)
STORE_FIELD(hv, conf, resume_program, charp);
STORE_FIELD(hv, conf, resume_rate, uint16_t);
@@ -340,6 +341,7 @@
FETCH_FIELD(hv, conf, propagate_prio_process, uint16_t, TRUE);
FETCH_FIELD(hv, conf, propagate_rlimits, charp, FALSE);
FETCH_FIELD(hv, conf, propagate_rlimits_except, charp, FALSE);
+ FETCH_FIELD(hv, conf, reconf_keep_part_state, uint16_t, TRUE);
FETCH_FIELD(hv, conf, resume_program, charp, FALSE);
FETCH_FIELD(hv, conf, resume_rate, uint16_t, TRUE);
FETCH_FIELD(hv, conf, resume_timeout, uint16_t, TRUE);
Index: s240rc3/slurm/doc/man/man5/slurm.conf.5
===================================================================
RCS file: /cvsroot/slurm/slurm/doc/man/man5/slurm.conf.5,v
retrieving revision 1.1.1.59.6.1
diff -u -r1.1.1.59.6.1 slurm.conf.5
--- s240rc3/slurm/doc/man/man5/slurm.conf.5 25 Oct 2011 17:57:25 -0000 1.1.1.59.6.1
+++ s240rc3/slurm/doc/man/man5/slurm.conf.5 16 Nov 2011 23:37:41 -0000
@@ -1242,6 +1242,15 @@
an authorized user. After being rebooting, the node is returned to normal use.
.TP
+\fBReconfKeepPartState\fR
+If set to YES, an "scontrol reconfig" command will maintain the
+in-memory state of partitions that may have been dynamically updated
+by "scontrol update". Partition information in the slurm.conf file
+will be merged with in-memory data. The default is NO, which will
+rebuild the partition information using only the definitions in the
+slurm.conf file, whenever an "scontrol reconfig" is done.
+
+.TP
\fBResumeProgram\fR
SLURM supports a mechanism to reduce power consumption on nodes that
remain idle for an extended period of time.
Index: s240rc3/slurm/slurm/slurm.h.in
===================================================================
RCS file: /cvsroot/slurm/slurm/slurm/slurm.h.in,v
retrieving revision 1.1.1.53.6.2
diff -u -r1.1.1.53.6.2 slurm.h.in
--- s240rc3/slurm/slurm/slurm.h.in 27 Oct 2011 18:52:22 -0000 1.1.1.53.6.2
+++ s240rc3/slurm/slurm/slurm.h.in 16 Nov 2011 23:38:08 -0000
@@ -1887,6 +1887,7 @@
char *propagate_rlimits;/* Propagate (all/specific) resource limits */
char *propagate_rlimits_except;/* Propagate all rlimits except these */
char *reboot_program; /* program to reboot the node */
+ uint16_t reconf_keep_part_state; /* keep partition state on scontrol reconfig */
char *resume_program; /* program to make nodes full power */
uint16_t resume_rate; /* nodes to make full power, per minute */
uint16_t resume_timeout;/* time required in order to perform a node
Index: s240rc3/slurm/src/api/config_info.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/api/config_info.c,v
retrieving revision 1.1.1.40
diff -u -r1.1.1.40 config_info.c
--- s240rc3/slurm/src/api/config_info.c 18 Oct 2011 16:09:22 -0000 1.1.1.40
+++ s240rc3/slurm/src/api/config_info.c 16 Nov 2011 23:38:32 -0000
@@ -742,6 +742,14 @@
key_pair->value = xstrdup(slurm_ctl_conf_ptr->reboot_program);
list_append(ret_list, key_pair);
+ key_pair = xmalloc(sizeof(config_key_pair_t));
+ key_pair->name = xstrdup("ReconfKeepPartState");
+ if(slurm_ctl_conf_ptr->reconf_keep_part_state)
+ key_pair->value = xstrdup("YES");
+ else
+ key_pair->value = xstrdup("NO");
+ list_append(ret_list, key_pair);
+
key_pair = xmalloc(sizeof(config_key_pair_t));
key_pair->name = xstrdup("ResumeProgram");
key_pair->value = xstrdup(slurm_ctl_conf_ptr->resume_program);
Index: s240rc3/slurm/src/common/read_config.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/common/read_config.c,v
retrieving revision 1.1.1.55
diff -u -r1.1.1.55 read_config.c
--- s240rc3/slurm/src/common/read_config.c 18 Oct 2011 16:08:40 -0000 1.1.1.55
+++ s240rc3/slurm/src/common/read_config.c 16 Nov 2011 23:38:53 -0000
@@ -240,6 +240,7 @@
{"PropagateResourceLimitsExcept", S_P_STRING},
{"PropagateResourceLimits", S_P_STRING},
{"RebootProgram", S_P_STRING},
+ {"ReconfKeepPartState", S_P_BOOLEAN},
{"ResumeProgram", S_P_STRING},
{"ResumeRate", S_P_UINT16},
{"ResumeTimeout", S_P_UINT16},
@@ -1961,6 +1962,7 @@
xfree (ctl_conf_ptr->propagate_rlimits);
xfree (ctl_conf_ptr->propagate_rlimits_except);
xfree (ctl_conf_ptr->reboot_program);
+ ctl_conf_ptr->reconf_keep_part_state = 0; /* must start at 0: set via (bool *) cast */
ctl_conf_ptr->resume_timeout = 0;
xfree (ctl_conf_ptr->resume_program);
ctl_conf_ptr->resume_rate = (uint16_t) NO_VAL;
@@ -2933,6 +2935,10 @@
conf->propagate_rlimits);
}
+ if (!s_p_get_boolean((bool *) &conf->reconf_keep_part_state,
+ "ReconfKeepPartState", hashtbl))
+ conf->reconf_keep_part_state = DEFAULT_RECONF_KEEP_PART_STATE;
+
if (!s_p_get_uint16(&conf->ret2service, "ReturnToService", hashtbl))
conf->ret2service = DEFAULT_RETURN_TO_SERVICE;
#ifdef HAVE_CRAY
Index: s240rc3/slurm/src/common/read_config.h
===================================================================
RCS file: /cvsroot/slurm/slurm/src/common/read_config.h,v
retrieving revision 1.1.1.43
diff -u -r1.1.1.43 read_config.h
--- s240rc3/slurm/src/common/read_config.h 18 Oct 2011 16:08:36 -0000 1.1.1.43
+++ s240rc3/slurm/src/common/read_config.h 16 Nov 2011 23:41:07 -0000
@@ -109,6 +109,7 @@
#define DEFAULT_PRIORITY_DECAY 604800 /* 7 days */
#define DEFAULT_PRIORITY_CALC_PERIOD 300 /* in seconds */
#define DEFAULT_PRIORITY_TYPE "priority/basic"
+#define DEFAULT_RECONF_KEEP_PART_STATE 0
#define DEFAULT_RETURN_TO_SERVICE 0
#define DEFAULT_RESUME_RATE 300
#define DEFAULT_RESUME_TIMEOUT 60
Index: s240rc3/slurm/src/common/slurm_protocol_pack.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/common/slurm_protocol_pack.c,v
retrieving revision 1.1.1.50
diff -u -r1.1.1.50 slurm_protocol_pack.c
--- s240rc3/slurm/src/common/slurm_protocol_pack.c 18 Oct 2011 16:08:30 -0000 1.1.1.50
+++ s240rc3/slurm/src/common/slurm_protocol_pack.c 16 Nov 2011 23:41:27 -0000
@@ -4539,6 +4539,7 @@
packstr(build_ptr->propagate_rlimits_except, buffer);
packstr(build_ptr->reboot_program, buffer);
+ pack16(build_ptr->reconf_keep_part_state, buffer);
packstr(build_ptr->resume_program, buffer);
pack16(build_ptr->resume_rate, buffer);
pack16(build_ptr->resume_timeout, buffer);
@@ -5385,6 +5386,7 @@
safe_unpackstr_xmalloc(&build_ptr->reboot_program, &uint32_tmp,
buffer);
+ safe_unpack16(&build_ptr->reconf_keep_part_state, buffer);
safe_unpackstr_xmalloc(&build_ptr->resume_program,
&uint32_tmp, buffer);
safe_unpack16(&build_ptr->resume_rate, buffer);
Index: s240rc3/slurm/src/slurmctld/proc_req.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/slurmctld/proc_req.c,v
retrieving revision 1.1.1.62
diff -u -r1.1.1.62 proc_req.c
--- s240rc3/slurm/src/slurmctld/proc_req.c 25 Oct 2011 14:35:51 -0000 1.1.1.62
+++ s240rc3/slurm/src/slurmctld/proc_req.c 16 Nov 2011 23:41:51 -0000
@@ -572,6 +572,7 @@
propagate_rlimits_except);
conf_ptr->reboot_program = xstrdup(conf->reboot_program);
+ conf_ptr->reconf_keep_part_state = conf->reconf_keep_part_state;
conf_ptr->resume_program = xstrdup(conf->resume_program);
conf_ptr->resume_rate = conf->resume_rate;
conf_ptr->resume_timeout = conf->resume_timeout;
Index: s240rc3/slurm/src/slurmctld/read_config.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/slurmctld/read_config.c,v
retrieving revision 1.1.1.58
diff -u -r1.1.1.58 read_config.c
--- s240rc3/slurm/src/slurmctld/read_config.c 18 Oct 2011 16:09:34 -0000 1.1.1.58
+++ s240rc3/slurm/src/slurmctld/read_config.c 16 Nov 2011 23:42:14 -0000
@@ -778,7 +778,7 @@
old_node_record_count);
error_code = MAX(error_code, rc); /* not fatal */
}
- if (old_part_list && (recover > 1)) {
+ if (old_part_list && ((recover > 1) || slurmctld_conf.reconf_keep_part_state)) {
info("restoring original partition state");
rc = _restore_part_state(old_part_list,
old_def_part_name);