Hi,

> On 02.02.2018 at 10:30, Ansgar Esztermann-Kirchner 
> <aesz...@mpibpc.mpg.de> wrote:
> 
> On Thu, Feb 01, 2018 at 05:00:32PM +0100, Reuti wrote:
>> 
>>> Now, I think I can improve upon this choice by creating separate
>>> queues for different machine "sizes", i.e. an 8-core queue, a
>>> 20-core queue and so on.
>> 
>> So your intention is to have a bunch of queues and users select a queue 
>> instead of a dedicated PE (which would in turn select a machine from a 
>> dedicated set due to unique PEs per type of machine)?
> 
> Dedicated PEs would be another possibility, queues were just the first
> thing that came to mind.
> With the current configuration, we only have one PE. It is set to
> $pe_slots.
> Users do not select a PE, but rather a slot range. The idea is that
> the scheduler selects an appropriate host.
> 
>> Somehow I don't get the advantage you want to achieve.
> 
> I want to prevent "small" jobs from running on large "nodes".

Aha, now I see the goal of it. We had a similar requirement regarding the 
amount of installed memory. Essentially my solution might be adapted to your 
case.

We have nodes with 64 GB of memory and some with 1 TB of it, all with 16 cores. 
Now the corner cases are:

- one large serial job is running on a 64 GB node and 15 cores are doomed to 
idle
- 16 small jobs with a 1 GB request of virtual_free are running on a 1 TB 
node and most of the memory is unused

My setup used the amount of requested virtual_free to attach a soft or hard 
request for a certain type of machine in a JSV:

# virtual_free <= 4 GB: -hard smallmem=true
# 4 GB < virtual_free <= 8 GB: -soft smallmem=true
# 8 GB < virtual_free < 16 GB: (no request added)
# 16 GB <= virtual_free < 32 GB: -soft bigmem=true
# 32 GB <= virtual_free: -hard bigmem=true

As one might guess, the 64 GB nodes got smallmem=true attached and the 1 TB 
nodes bigmem=true. Neither complex is forced, so jobs requesting one of them 
only softly, or none of them at all, can run on either type of machine.
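
To make the thresholds above concrete, here is a minimal sketch of the mapping 
only. It is written in Python for illustration and is not the Perl JSV itself; 
the function name is made up, and the JSV calls that actually read the job's 
virtual_free request and attach the result are left out on purpose.

```python
def memory_request(virtual_free_gb):
    """Map a virtual_free request (in GB) to the extra request to attach.

    Returns a (hardness, complex) pair, or None when nothing is added.
    Hypothetical helper that mirrors the threshold table only.
    """
    if virtual_free_gb <= 4:
        return ("-hard", "smallmem=true")
    if virtual_free_gb <= 8:
        return ("-soft", "smallmem=true")
    if virtual_free_gb < 16:
        # Middle band: the job may run on either type of node.
        return None
    if virtual_free_gb < 32:
        return ("-soft", "bigmem=true")
    return ("-hard", "bigmem=true")
```

With this, a 1 GB job is pinned to the 64 GB nodes, a 12 GB job is left alone, 
and a 100 GB job is pinned to the 1 TB nodes.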

===

You could reuse my script and select the type of machine depending on the 
number of requested cores, possibly turning an additional "midmem" complex 
into a "midcpu" one (or leaving the machines with a medium number of cores 
unspecified). I think attachments won't get through to the list, so let me 
know in case you would like to get the Perl script.
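
Adapted to cores, the selection could look roughly like the sketch below. 
Everything in it is an assumption for illustration: the complex names smallcpu 
and bigcpu, the slot thresholds, and the function name are invented here and 
would have to be matched to your actual machine sizes. It is Python rather than 
the Perl of the script, and how the slot count is obtained inside a JSV is not 
shown.

```python
# Hypothetical rules: (largest slot count covered, request to attach).
# None means that no request is added for that band.
CORE_RULES = [
    (4, ("-hard", "smallcpu=true")),
    (8, ("-soft", "smallcpu=true")),
    (15, None),
    (19, ("-soft", "bigcpu=true")),
]
# Anything above the last band is forced onto the large machines.
LARGEST_BAND = ("-hard", "bigcpu=true")


def core_request(slots):
    """Return the (hardness, complex) pair for a slot count, or None."""
    for upper_bound, request in CORE_RULES:
        if slots <= upper_bound:
            return request
    return LARGEST_BAND
```

Keeping the bands in a table makes it easy to add a middle complex later 
without touching the lookup itself.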

-- Reuti