On 02/09/2012 06:17 PM, [email protected] wrote:
>
> Michel,
> FYI, I just tested this on 2.4.0-pre3. ALLOCATE_FULL_SOCKET seems to 
> be working on that version as documented.
>
> SelectType=select/cons_res
> SelectTypeParameters=CR_Socket
> TaskPlugin=task/affinity
> TaskPluginParam=sched

But I use cpusets ...  Should not make a difference.

>
> WITH ALLOCATE_FULL_SOCKET = 0 (allocate only the requested number of cpus):
>
> [sulu] (slurm) etc> srun -n1 -c1 --cpu_bind=verbose,sockets ...
> [2] 2629
> [sulu] (slurm) etc> cpu_bind=MASK - bones, task  0  0 [26337]: mask 0x1 set
>
> WITH ALLOCATE_FULL_SOCKET = 1 (allocate all cpus on the consumed sockets):
>
> [sulu] (slurm) etc> srun -n1 -c1 --cpu_bind=verbose,sockets ...
> [2] 9074
> [sulu] (slurm) etc> cpu_bind=MASK - bones, task  0  0 [26466]: mask 0x55 set
>

Hi Martin,

thanks for the feedback.

I'd like to see the NUMA cpumap output before I trust that mask. Did you
also try -n1 -c2, for example? In my case, I get strange results. We have
4 quad-core sockets, and I obtain the following masks (always using -n1):

                                     11 1111 1111 2222 2222 2233
mask        c            0123 4567 8901 2345 6789 0123 4567 8901

0xf000f000  1            0000 0000 0000 1111 0000 0000 0000 1111
0xf000f000  2            0000 0000 0000 1111 0000 0000 0000 1111
0x70007000  3   uh ?     0000 0000 0000 1110 0000 0000 0000 1110
0xf000f000  4            0000 0000 0000 1111 0000 0000 0000 1111
0x70007000  5   uh ?     0000 0000 0000 1110 0000 0000 0000 1110
0xf000f000  7            0000 0000 0000 1111 0000 0000 0000 1111
0xf000f000  8            0000 0000 0000 1111 0000 0000 0000 1111
0x3f003f00  9   uh ?     0000 0000 1111 1100 0000 0000 1111 1100

Don't we agree this is wrong? Do you get the same odd behavior on
your end?
Btw, this is 2.3.3. Same for you?

A+

> Martin Perry
> Bull Phoenix
>
> *[slurm-dev] Re: select/cons_res ALLOCATE_FULL_SOCKET*
>
> *Michel Bourget* to: slurm-dev, 02/03/2012 07:21 PM
>
> From: Michel Bourget <[email protected]>
> To: "slurm-dev" <[email protected]>
> *Please respond to "slurm-dev" <[email protected]>*
>
> On 02/03/2012 08:15 PM, Moe Jette wrote:
> > I do believe that it works as described in the comments, although it
> > hasn't been tested in a while so verify that it works as desired. It
> > was added after the original code was developed, which is why both
> > CR_SOCKET and CR_CORE are set (it accomplishes the desired
> > functionality with minimal code changes).
> >
>
> Hi Moe,
>
> thanks for your feedback.
>
> Well, CR_CORE == CR_SOCKET --> alloc_cores=true, by default, according to
> the _cyclic_sync_core_bitmap/_block_sync_core_bitmap distribution. That
> is a little confusing, to be honest. Anyway, I tried
> ALLOCATE_FULL_SOCKET=1 and I didn't notice any change when CR_Socket is
> selected in SelectTypeParameters. In other words, the mask I still
> obtain for
>
>     Procs=32 Sockets=4 CoresPerSocket=4 ThreadsPerCore=2
>
> using N=1 n=1 c=5 is 0x70007000, and I'd expect it to be 0xF000F000,
> i.e. I'd expect ALLOCATE_FULL_SOCKET to kick in. Hence my initial
> intuition that something is wrong or incomplete. Or I totally missed
> something; that could also be true.
> (Btw, slurm 2.2.3.)
>
> Since memory locality domains (ldoms) and socket memory zones are
> closely tied, at least on SGI UV systems, using "--cpu_bind=ldoms"
> seems like a workaround. With it, I get the expected mask, 0xF000F000.
>
>     if (cr_type & CR_CORE)
>             alloc_cores = true;
> #ifdef ALLOCATE_FULL_SOCKET
>     if (cr_type & CR_SOCKET)
>             alloc_sockets = true;
> #else
>     if (cr_type & CR_SOCKET)
>             alloc_cores = true;
> #endif
>
>
> At any rate, task/affinity always gets the same input mask from the
> srun credentials:
> slurmd: Cred: job_core_bitmap   12-14
>
>
>    Unless I am really wrong, the line above should read 12-15 when
>    ALLOCATE_FULL_SOCKET is defined as 1.
>
> slurmd: Cred: step_core_bitmap  12-14
> slurmd: Cred: job_nhosts        1
> slurmd: debug3: task/affinity: slurmctld s 4 c 4; hw s 4 c 4 t 2
> slurmd: debug3: task/affinity: job 816.0 CPU mask from slurmctld: 0x7000
> slurmd: debug3: task/affinity: job 816.0 CPU final mask for local node: 0x3F000000
> slurmd: debug:  task affinity : after lllp distribution cpu bind method is 'verbose,mask_cpu' (0x70007000)
>
> What am I missing ?
>
> > #if(0)
> > /* Using CR_SOCKET or CR_SOCKET_MEMORY will not allocate a socket to more
> >  * than one job at a time, but it also will not grant a job access to more
> >  * CPUs on the socket than requested. If ALLOCATE_FULL_SOCKET is defined,
> >  * then a job will be given access to every cores on each allocated socket.
> >  */
> > #define ALLOCATE_FULL_SOCKET 1
> > #endif
> >
> > Quoting Michel Bourget<[email protected]>:
> >
> >> Hi,
> >>
> >> given the description in the source, I am considering enabling this
> >> feature. But it seems incomplete. Am I missing something? What
> >> puzzles me is that CR_SOCKET and CR_CORE seem to be "equal" when
> >> ALLOCATE_FULL_SOCKET is disabled. What is the rationale there? A todo?
> >>
> >> Tia
> >>
> >> --
> >>
> >> -----------------------------------------------------------
> >>       Michel Bourget - SGI - Linux Software Engineering
> >>      "Past BIOS POST, everything else is extra" (travis)
> >> -----------------------------------------------------------
> >>
> >>


-- 

-----------------------------------------------------------
      Michel Bourget - SGI - Linux Software Engineering
     "Past BIOS POST, everything else is extra" (travis)
-----------------------------------------------------------
