I was able to reproduce the problem with both SLURM v2.1 and v2.2,
although the failure is highly configuration-dependent. The attached
patch should fix this problem and will be in SLURM v2.2.4, which we
hope to release next week.
________________________________________
From: [email protected] [[email protected]] On Behalf 
Of Dennis Leepow [[email protected]]
Sent: Tuesday, March 22, 2011 2:30 PM
To: slurm-dev
Subject: [slurm-dev] Problems with nodes selection

Hello, I have the following slurm nodes/partitions configuration:
SelectType=select/linear
NodeName=x[0-3] Feature=GPU Weight=2
NodeName=x[4-15] Weight=1
PartitionName=graph Nodes=x[0-15]
I want to start a job on 4 nodes, allocating at least one GPU node
(assume all nodes x[0-15] are idle), e.g.:
srun -p graph -N 4 -C "GPU*1" hostname -s
slurm-2.1.15 allocates nodes x[0,4-6]. That is expected (3 nodes with
the lower weight=1 and 1 node with the GPU feature).
We upgraded slurm to version 2.2.0 and discovered that it unexpectedly
allocates nodes x[0-3] for the same node configuration and the same job.
Is this the correct slurm behaviour, or is it a bug?
What part of the slurm code may be responsible for this behaviour?

--
Best regards, Dennis.

Index: src/slurmctld/node_scheduler.c
===================================================================
--- src/slurmctld/node_scheduler.c	(revision 22840)
+++ src/slurmctld/node_scheduler.c	(working copy)
@@ -731,6 +731,32 @@
 	/* Accumulate resources for this job based upon its required
 	 * features (possibly with node counts). */
 	for (j = min_feature; j <= max_feature; j++) {
+		if (job_ptr->details->req_node_bitmap) {
+			bool missing_required_nodes = false;
+			for (i = 0; i < node_set_size; i++) {
+				if (!bit_test(node_set_ptr[i].feature_bits, j))
+					continue;
+				if (avail_bitmap) {
+					bit_or(avail_bitmap,
+					       node_set_ptr[i].my_bitmap);
+				} else {
+					avail_bitmap = bit_copy(node_set_ptr[i].
+								my_bitmap);
+					if (avail_bitmap == NULL)
+						fatal("bit_copy malloc failure");
+				}
+			}
+			if (!bit_super_set(job_ptr->details->req_node_bitmap,
+					   avail_bitmap))
+				missing_required_nodes = true;
+			FREE_NULL_BITMAP(avail_bitmap);
+			if (missing_required_nodes)
+				continue;
+			avail_bitmap = bit_copy(job_ptr->details->
+						req_node_bitmap);
+			if (avail_bitmap == NULL)
+				fatal("bit_copy malloc failure");
+		}
 		for (i = 0; i < node_set_size; i++) {
 			if (!bit_test(node_set_ptr[i].feature_bits, j))
 				continue;
@@ -776,11 +802,10 @@
 			}
 			avail_nodes = bit_set_count(avail_bitmap);
 			tried_sched = false;	/* need to test these nodes */
-
-			if (((i+1) < node_set_size)	&&
-			    (shared || preempt_flag ||
-			     (node_set_ptr[i].weight ==
-			      node_set_ptr[i+1].weight))) {
+			if ((shared || preempt_flag)	&&
+			    ((i+1) < node_set_size)	&&
+			    (node_set_ptr[i].weight ==
+			     node_set_ptr[i+1].weight)) {
 				/* Keep accumulating so we can pick the
 				 * most lightly loaded nodes */
 				continue;
@@ -792,11 +817,6 @@
 			     ((i+1) < node_set_size)))
 				continue;	/* Keep accumulating nodes */
 
-			if ((job_ptr->details->req_node_bitmap) &&
-			    (!bit_super_set(job_ptr->details->req_node_bitmap,
-					    avail_bitmap)))
-				continue;
-
 			/* NOTE: select_g_job_test() is destructive of
 			 * avail_bitmap, so save a backup copy */
 			backup_bitmap = bit_copy(avail_bitmap);
