Michel,
FYI, I just tested this on 2.4.0-pre3. ALLOCATE_FULL_SOCKET seems to be 
working on that version as documented. 

SelectType=select/cons_res
SelectTypeParameters=CR_Socket
TaskPlugin=task/affinity
TaskPluginParam=sched

WITH ALLOCATE_FULL_SOCKET = 0 (allocate only the requested number of 
cpus):

[sulu] (slurm) etc> srun -n1 -c1 --cpu_bind=verbose,sockets ...
[2] 2629
[sulu] (slurm) etc> cpu_bind=MASK - bones, task  0  0 [26337]: mask 0x1 set

WITH ALLOCATE_FULL_SOCKET = 1 (allocate all cpus on the consumed sockets):

[sulu] (slurm) etc> srun -n1 -c1 --cpu_bind=verbose,sockets ...
[2] 9074
[sulu] (slurm) etc> cpu_bind=MASK - bones, task  0  0 [26466]: mask 0x55 set
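
To compare the two masks above, it can help to list the CPU IDs each one
covers. A minimal sketch in C, assuming nothing beyond the mask values
themselves (the helper name mask_to_cpu_list is hypothetical and is not
part of Slurm):

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper, not part of Slurm: write the indices of the set
 * bits of mask into buf as a comma-separated list and return the number
 * of set bits. The text is truncated if buf is too small. */
static int mask_to_cpu_list(uint64_t mask, char *buf, size_t len)
{
        int count = 0;
        size_t used = 0;

        if (len > 0)
                buf[0] = '\0';
        for (int cpu = 0; cpu < 64; cpu++) {
                if (!(mask & ((uint64_t)1 << cpu)))
                        continue;
                count++;
                if (used < len) {
                        /* Prefix every entry but the first with a comma. */
                        int n = snprintf(buf + used, len - used, "%s%d",
                                         used ? "," : "", cpu);
                        if (n < 0)
                                break;
                        used += (size_t)n;
                }
        }
        return count;
}
```

With this helper, 0x1 decodes to CPU 0 and 0x55 to CPUs 0,2,4,6. Which
sockets and cores those IDs belong to depends on the node's CPU
numbering, which the output above does not show.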

Martin Perry
Bull Phoenix


[slurm-dev] Re: select/cons_res ALLOCATE_FULL_SOCKET

From: Michel Bourget <[email protected]>
To: "slurm-dev" <[email protected]>
Date: 02/03/2012 07:21 PM
Please respond to "slurm-dev" <[email protected]>

On 02/03/2012 08:15 PM, Moe Jette wrote:
> I do believe that it works as described in the comments, although it
> hasn't been tested in a while, so verify that it works as desired. It
> was added after the original code was developed, which is why both
> CR_SOCKET and CR_CORE are set (it accomplishes the desired
> functionality with minimal code changes).
>

Hi Moe,

thanks for your feedback.

Well, CR_CORE == CR_SOCKET --> alloc_cores=true by default, according to
the _cyclic_sync_core_bitmap/_block_sync_core_bitmap distribution code.
That is a little confusing, to be honest. Anyway, I tried
ALLOCATE_FULL_SOCKET=1 and I didn't notice any change when CR_Socket is
selected as SelectTypeParameters. In other words, the mask I still
obtain for

     Procs=32 Sockets=4 CoresPerSocket=4 ThreadsPerCore=2

using N=1 n=1 c=5 is 0x70007000, and I'd expect it to be 0xF000F000,
i.e. I'd expect the effect of ALLOCATE_FULL_SOCKET to kick in. Hence my
initial intuition that something is wrong or incomplete. Or I totally
missed it; that could also be true. (Btw, slurm 2.2.3.)

Since memory locality (ldoms) and socket memory zones are closely tied,
at least on SGI UV systems, using "--cpu_bind=ldoms" seems like a
workaround. With it I get the expected mask, 0xF000F000.

             if (cr_type & CR_CORE)
                     alloc_cores = true;
    #ifdef ALLOCATE_FULL_SOCKET
             if (cr_type & CR_SOCKET)
                     alloc_sockets = true;
    #else
             if (cr_type & CR_SOCKET)
                     alloc_cores = true;
    #endif
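
The result one would expect from the alloc_sockets branch above can be
sketched on a plain core mask. This is only an illustration of the
intended outcome, not the select/cons_res implementation; the function
name is hypothetical, and it assumes cores are numbered contiguously
within each socket:

```c
#include <stdint.h>

/* Illustration only, not Slurm code. Assumes socket s owns cores
 * s*cores_per_socket .. (s+1)*cores_per_socket - 1 and that
 * sockets * cores_per_socket <= 64. */
static uint64_t expand_to_full_sockets(uint64_t core_mask, int sockets,
                                       int cores_per_socket)
{
        uint64_t socket_bits, out = 0;

        if (cores_per_socket >= 64)
                socket_bits = ~(uint64_t)0;
        else
                socket_bits = ((uint64_t)1 << cores_per_socket) - 1;

        for (int s = 0; s < sockets; s++) {
                uint64_t socket_mask = socket_bits << (s * cores_per_socket);

                /* Any allocated core on a socket claims the whole socket. */
                if (core_mask & socket_mask)
                        out |= socket_mask;
        }
        return out;
}
```

For a node with 4 sockets of 4 cores each, a core mask of 0x7000 (three
cores on the last socket) expands to 0xF000 under this sketch.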


At any rate, task/affinity always gets the same input mask from the srun
credential:
slurmd: Cred: job_core_bitmap   12-14


    Unless I am really wrong, the job_core_bitmap above should be 12-15
    when ALLOCATE_FULL_SOCKET is defined as 1.

slurmd: Cred: step_core_bitmap  12-14
slurmd: Cred: job_nhosts        1
slurmd: debug3: task/affinity: slurmctld s 4 c 4; hw s 4 c 4 t 2
slurmd: debug3: task/affinity: job 816.0 CPU mask from slurmctld: 0x7000
slurmd: debug3: task/affinity: job 816.0 CPU final mask for local node: 0x3F000000
slurmd: debug:  task affinity : after lllp distribution cpu bind method is 'verbose,mask_cpu' (0x70007000)
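
The step from the core mask 0x7000 to the "final mask for local node"
0x3F000000 in the log above is consistent with each core bit expanding
into ThreadsPerCore adjacent CPU bits. A minimal sketch of that
arithmetic, assuming such a block layout (the function name is
hypothetical and this is not the task/affinity code):

```c
#include <stdint.h>

/* Illustration only, not Slurm code. Assumes an abstract block layout
 * in which core c owns CPU bits c*threads_per_core ..
 * (c+1)*threads_per_core - 1, and cores * threads_per_core <= 64. */
static uint64_t cores_to_abstract_cpus(uint64_t core_mask, int cores,
                                       int threads_per_core)
{
        uint64_t out = 0;

        for (int c = 0; c < cores; c++) {
                if (!(core_mask & ((uint64_t)1 << c)))
                        continue;
                /* Set one CPU bit per hardware thread of this core. */
                for (int t = 0; t < threads_per_core; t++)
                        out |= (uint64_t)1 << (c * threads_per_core + t);
        }
        return out;
}
```

With 16 cores and 2 threads per core, a core mask of 0x7000 gives
0x3F000000, matching the log, while a full-socket core mask of 0xF000
would give 0xFF000000. The later lllp distribution step appears to remap
these abstract IDs to the machine's own CPU numbering, which would
explain the 0x70007000 reported at the end.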

What am I missing ?

> #if(0)
> /* Using CR_SOCKET or CR_SOCKET_MEMORY will not allocate a socket to more
>  * than one job at a time, but it also will not grant a job access to more
>  * CPUs on the socket than requested. If ALLOCATE_FULL_SOCKET is defined,
>  * then a job will be given access to every cores on each allocated socket.
>  */
> #define ALLOCATE_FULL_SOCKET 1
> #endif
>
> Quoting Michel Bourget<[email protected]>:
>
>> Hi,
>>
>> given the description in the source, I am considering enabling this
>> feature. But it seems incomplete. Am I missing something? What
>> puzzles me is that CR_SOCKET and CR_CORE seem to be "equal" when
>> ALLOCATE_FULL_SOCKET is disabled. What is the rationale there? A todo?
>>
>> Tia
>>
>> -- 
>>
>> -----------------------------------------------------------
>>       Michel Bourget - SGI - Linux Software Engineering
>>      "Past BIOS POST, everything else is extra" (travis)
>> -----------------------------------------------------------
>>
>>


-- 

-----------------------------------------------------------
      Michel Bourget - SGI - Linux Software Engineering
     "Past BIOS POST, everything else is extra" (travis)
-----------------------------------------------------------

