Is there any way, after a job starts, to determine why the scheduler chose the series of nodes it did?
For some reason, on an empty cluster, when I spin up a large job it's staggering the allocation across a seemingly random set of nodes. We're using backfill/cons_res + gres, and all the nodes are identical. In the past it would select the next node past a down node and then allocate sequentially from there. I haven't made (or am not aware of) any changes to the system, but now it's skipping nodes that presumably should have been in the allocation.