Christopher Samuel <[email protected]> writes:

> On 05/02/14 09:27, Lyn Gerner wrote:
>
>> You might check out the Weight parameter in the Node section of the
>>  slurm.conf documentation.  I believe you could just give the fat nodes
>> a higher node weight than the thinner nodes, to achieve your goal.
>
> We use it to ensure that our Xeon Phi nodes are allocated after nodes
> that don't have them, and that our 512GB nodes are allocated after the
> 256GB nodes.   Of course the 1 node that has both Xeon Phi AND 512GB is
> very heavily weighted against. :-)
>
> Here's the snippet from our slurm.conf (you can see from the Gres and
> RealMemory directives which are which):
>
> NodeName=barcoo[001-058] NodeAddr=barcoo[001-058] RealMemory=250000 Weight=2
> NodeName=barcoo[059-060] NodeAddr=barcoo[059-060] RealMemory=500000 Weight=1000
> NodeName=barcoo061       NodeAddr=barcoo061       RealMemory=500000 Gres=mic:2 Weight=100000
> NodeName=barcoo[062-070] NodeAddr=barcoo[062-070] RealMemory=250000 Gres=mic:2 Weight=100
>
> cheers,
> Chris

We do already use weighting, but my understanding is that weights only
affect the order in which nodes are allocated; they shouldn't cause a job
to keep waiting while suitable resources are free.

I assume there is some valid reason for the job waiting, but it is not
apparent to me.  It would be helpful to see exactly which resources a job
is waiting for, but I haven't come across a way to do that.
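For what it's worth, Slurm does expose the scheduler's pending reason: the
Reason field (%r) in squeue's output format, and the Reason= line in
"scontrol show job", report why a job is still queued (e.g. Resources,
Priority, ReqNodeNotAvail).  A sketch, with 12345 standing in for a real
job ID:

```shell
# Reason column shows why the job is pending, e.g. Resources or Priority
squeue -j 12345 -o "%.10i %.9P %.8T %r"

# Fuller detail: Reason= plus the requested CPU/memory/node constraints
# (ReqNodeList, MinMemoryNode, NumCPUs, ...) that the job is waiting on
scontrol show job 12345
```

That doesn't show the exact resource-by-resource shortfall, but comparing
the job's Req* fields against "scontrol show node" output usually narrows
it down.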

Cheers,

Loris

-- 
This signature is currently under construction.
