Can I submit an RFE for the partition_prio preemption plugin?

Looking through the partition_prio plugin's source code, from what I can gather it is not topology aware.

At least not in the way that the consumable resources selection plugin is; that one has comment blocks referring to selection based upon topology and topology state information.

Thanks

Marcin Sliwowski | SysAdmin@RENCI | 919-445-0479

On 09/23/2014 03:03 PM, Marcin Sliwowski wrote:

I'm running version 2.6.9 and wondering if the preemption algorithm takes into account the topology, as defined in topology.conf, when it selects which jobs to preempt to make room for a new higher priority MPI job.

Based on what I have seen it appears that it doesn't.

The reason I ask is that we define our InfiniBand topology as 8 individual fabrics: we have 8 bladecenters, each with its own fabric, and they are not interconnected. One partition includes all 8 bladecenters, with 32 nodes per bladecenter.

Eventually enough jobs are preempted and the MPI job is scheduled into a bladecenter, but it comes at the cost of many jobs. The main problem is that it preempts jobs on bladecenters where the MPI job does not ultimately land.

If it took our defined topology into consideration and focused on preempting jobs that reside in a single bladecenter, it could make room for the MPI job while preempting far fewer jobs.
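To make the idea concrete, here is a rough Python sketch of the kind of selection I have in mind. It is not Slurm code and does not reflect how preempt/partition_prio or select/cons_res work internally; the function name, the data layout, and the greedy "fewest jobs per node first" heuristic are all assumptions made up for illustration. For each bladecenter it picks the cheapest nodes to free up, then chooses the bladecenter where the fewest distinct jobs would be preempted.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical layout: bladecenter name -> node name -> ids of preemptible jobs on it.
Fabrics = Dict[str, Dict[str, Set[int]]]


def pick_bladecenter(fabrics: Fabrics, nodes_needed: int) -> Optional[Tuple[str, Set[int]]]:
    """Return (bladecenter, jobs to preempt) for the cheapest single fabric.

    Greedy heuristic, not an optimal solution: within each bladecenter,
    take the nodes running the fewest jobs first (idle nodes cost nothing).
    """
    best: Optional[Tuple[str, Set[int]]] = None
    for name, nodes in fabrics.items():
        if len(nodes) < nodes_needed:
            continue  # this fabric can never hold the job
        cheapest = sorted(nodes.items(), key=lambda kv: (len(kv[1]), kv[0]))[:nodes_needed]
        victims: Set[int] = set().union(*(jobs for _, jobs in cheapest))
        if best is None or len(victims) < len(best[1]):
            best = (name, victims)
    return best


# Toy example with two small fabrics (node names and job ids are invented).
example: Fabrics = {
    "bc1": {"n1": {101}, "n2": {102}, "n3": set()},
    "bc2": {"n4": {201}, "n5": set(), "n6": set()},
}
two_nodes = pick_bladecenter(example, 2)
three_nodes = pick_bladecenter(example, 3)
```

In this toy case a 2-node job fits on the idle nodes of bc2 without preempting anything, and a 3-node job costs one preempted job on bc2 instead of two on bc1. Nothing on the bladecenter that is not chosen gets touched.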

We have been scratching our heads on this one for a while.

SelectType=select/cons_res
PreemptType=preempt/partition_prio
TopologyPlugin=topology/tree

Thanks
