Hello, I think I found another bug in the code (I'm using 2.6.3, but I checked the 2.6.5 and 14.03 versions and it's the same there).
In file sched/backfill/backfill.c:
1)
_add_reservation function, starting at line 1172:
if (placed == true) {
    j = node_space[j].next;
    if (j && (end_reserve < node_space[j].end_time)) {
        /* insert end entry record */
        i = *node_space_recs;
        node_space[i].begin_time = end_reserve;
        node_space[i].end_time = node_space[j].end_time;
        node_space[j].end_time = end_reserve;
        node_space[i].avail_bitmap =
            bit_copy(node_space[j].avail_bitmap);
        node_space[i].next = node_space[j].next;
        node_space[j].next = i;
        (*node_space_recs)++;
    }
    break;
}
I drew a picture of the `node_space` state after 2 iterations (see attachment).
In the case where the new reservation is fully inside another reservation,
everything is OK.
But if the new reservation spans multiple existing reservations, then
the `end entry record` is not created.
This is because only the newly created `start entry record` is checked.
An easy fix would be to change the `if` into a loop, for example:
if (placed == true) {
    while ((j = node_space[j].next) > 0) {
        if (end_reserve < node_space[j].end_time) {
            /* same as above */
            break;
        }
    }
    break;
}
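To make the difference easier to see, here is a minimal standalone sketch, not the actual Slurm code. It models `node_space` as a plain array of time records without bitmaps. The names `rec_t`, `split_record`, `add_reservation`, `has_boundary` and `scenario_has_end_record` are hypothetical, and the start-entry handling is an assumption modeled on the end-entry code quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RECS 16

/* Hypothetical, simplified stand-in for node_space_map_t (no bitmaps). */
typedef struct {
    long begin_time;
    long end_time;
    int  next;          /* index of the next record, 0 ends the list */
} rec_t;

/* Split record j at time t; the new record covers [t, old end_time). */
static void split_record(rec_t *ns, int *recs, int j, long t)
{
    int i = *recs;
    assert(i < MAX_RECS);
    ns[i].begin_time = t;
    ns[i].end_time = ns[j].end_time;
    ns[j].end_time = t;
    ns[i].next = ns[j].next;
    ns[j].next = i;
    (*recs)++;
}

/* use_loop == false: single "if" as in the quoted code.
 * use_loop == true:  the loop variant proposed in this message. */
static void add_reservation(rec_t *ns, int *recs, long start,
                            long end_reserve, bool use_loop)
{
    bool placed = false;
    int j;

    for (j = 0; ; ) {
        if (ns[j].end_time > start) {
            split_record(ns, recs, j, start);   /* start entry (assumed) */
            placed = true;
        } else if (ns[j].end_time == start) {
            placed = true;
        }
        if (placed) {
            if (!use_loop) {
                j = ns[j].next;
                if (j && (end_reserve < ns[j].end_time))
                    split_record(ns, recs, j, end_reserve);
            } else {
                while ((j = ns[j].next) > 0) {
                    if (end_reserve < ns[j].end_time) {
                        split_record(ns, recs, j, end_reserve);
                        break;
                    }
                }
            }
            break;
        }
        if ((j = ns[j].next) == 0)
            break;
    }
}

/* True if some record in the list ends exactly at time t. */
static bool has_boundary(const rec_t *ns, long t)
{
    int j = 0;
    do {
        if (ns[j].end_time == t)
            return true;
        j = ns[j].next;
    } while (j != 0);
    return false;
}

/* Two reservations: [10,20) first, then [15,50), which spans the
 * records [10,20) and [20,100) left behind by the first one. */
bool scenario_has_end_record(bool use_loop)
{
    rec_t ns[MAX_RECS] = { { 0, 100, 0 } };
    int recs = 1;

    add_reservation(ns, &recs, 10, 20, use_loop);
    add_reservation(ns, &recs, 15, 50, use_loop);
    return has_boundary(ns, 50);
}
```

In this model, `scenario_has_end_record(false)` returns false: the second reservation leaves no record ending at 50, so its end is lost. With `scenario_has_end_record(true)` the loop walks on to the [20,100) record and splits it at 50.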
2)
You could also change line 612:
node_space = xmalloc(sizeof(node_space_map_t) *
(max_backfill_job_cnt + 3));
to `(max_backfill_job_cnt * 2 + 1)`, since each reservation can add
at most two entries (so the check at line 982 should never trigger). At the
moment, in the worst case, this only checks half of
max_backfill_job_cnt.
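The arithmetic behind point 2 can be sketched as follows. This assumes one initial record plus at most two new records per reservation, as described above. The function names are hypothetical and the value 50 is only an example, not a Slurm default.

```c
#include <assert.h>

/* Records needed if every reservation adds a start and an end entry,
 * on top of the single initial record (assumption from this message). */
int worst_case_recs(int reservations)
{
    assert(reservations >= 0);
    return 2 * reservations + 1;
}

/* Number of reservations guaranteed to fit in `capacity` records
 * under the same worst-case assumption. */
int guaranteed_reservations(int capacity)
{
    assert(capacity >= 1);
    return (capacity - 1) / 2;
}
```

For an example max_backfill_job_cnt of 50, the current allocation of 50 + 3 = 53 records is guaranteed to hold only `guaranteed_reservations(53)` = 26 worst-case reservations, while `worst_case_recs(50)` = 101 records would cover all 50.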
NOTE: However, this is all based on the assumption that this was not done on purpose to speed up the calculations at the cost of some accuracy (especially point 2).
Best regards, Filip Skalski
<<attachment: node_space.png>>
