That looks fine. You might also find "--cpu_bind=verbose" helpful:

$ srun -N1 -n4 --cpu_bind=verbose true
cpu_bind=MASK - smd2, task  0  0 [3209]: mask 0x4 set
cpu_bind=MASK - smd2, task  2  2 [3211]: mask 0x8 set
cpu_bind=MASK - smd2, task  1  1 [3210]: mask 0x10 set
cpu_bind=MASK - smd2, task  3  3 [3212]: mask 0x20 set



Quoting Evren Yurtesen IB <[email protected]>:

> I used a small code to see exactly what is happening. (below)
> compiled it:
> mpicc -fopenmp cpuid.c -o cpuid
>
> Then ran it:
>   srun -c 2 -n 2 -N 1 -m block ./cpuid  |sort -n
>
> It prints out:
> Rank 0 CPU4 is set
> Rank 0 CPU5 is set
> Rank 1 CPU16 is set
> Rank 1 CPU17 is set
>
> The box has 24 logical processors, so processors 4 and 16 are on the  
> same physical core, and 5 and 17 are on the same physical core. So I  
> should have seen rank 0 with CPU4 & CPU16.
>
>
> I tested something else (thanks for the hint about -m :)
>
> -bash-4.1$ srun -c 2 -n 4 -N 1 -m plane=1  ./cpuid  |sort -n
> srun: job 116180 queued and waiting for resources
> srun: job 116180 has been allocated resources
> Rank 0 CPU17 is set
> Rank 0 CPU5 is set
> Rank 1 CPU15 is set
> Rank 1 CPU3 is set
> Rank 2 CPU11 is set
> Rank 2 CPU23 is set
> Rank 3 CPU19 is set
> Rank 3 CPU7 is set
> -bash-4.1$
>
> This seems to be working as I expect. Every rank gets 2 threads  
> from a single core. Does this look correct?
>
>
>
> #define _GNU_SOURCE
> #include <sched.h>
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>   int id;
>   int rank, size;
>
>   MPI_Init (&argc, &argv);      /* starts MPI */
>   MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
>   MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
>
>   cpu_set_t set;
>   CPU_ZERO(&set);
>   sched_getaffinity(0,sizeof(cpu_set_t),&set);
>
>   int i;
>   for (i=0; i<CPU_SETSIZE; i++) {
>     if( CPU_ISSET(i, &set) ) {
>       printf("Rank %d CPU%i is set\n",rank,i);
>     }
>   }
>
>   MPI_Finalize();       /* clean up MPI before exiting */
>   return 0;
> }
>
>
> On Thu, 7 Mar 2013, [email protected] wrote:
>
>> Assuming you're allocating two logical cpus per task (-c 2) and  
>> your hyperthreading uses two threads per core (ThreadsPerCore=2),  
>> you should be able to
>> force each task to bind to two threads on the same core using -m xxxx:block.
>> --threads-per-core=2 will constrain node selection to nodes with at  
>> least two threads per core.
>>
>> Regards,
>> Martin Perry
>>
>>
>>
>> From:        Aaron Knister <[email protected]>
>> To:        "slurm-dev" <[email protected]>,
>> Date:        03/07/2013 04:56 AM
>> Subject:        [slurm-dev] Re: hyperthreading scheduling question
>>
>>
>> As for a users mailing list, this is it, I believe.
>>
>> So, the desired result is 1 process with 2 hardware threads (logical  
>> cores)? You could try the -c 2 option, but there's no guarantee that  
>> both allocated
>> logical cores will map to two hardware threads on the same physical  
>> core. You could also try --threads-per-core=2 perhaps in  
>> conjunction with the -c option.
>>
>> Sent from my iPhone
>>
>> On Mar 7, 2013, at 4:31 AM, Evren Yurtesen IB <[email protected]> wrote:
>>
>>>
>>> Hello,
>>>
>>> First question, don't you have a users mailing list?
>>>
>>> I am on a system with 2 sockets, 6 cores per socket, and  
>>> hyper-threading. I have the following options set:
>>>
>>> TaskPlugin=task/affinity
>>> TaskPluginParam=threads
>>> SelectType=select/cons_res
>>> SelectTypeParameters=CR_Core_Memory
>>>
>>> I am trying to run a process and allocate 2 threads per core to it.  
>>> However, it looks like Slurm is allocating this strangely, and 2  
>>> processes end up on the threads of 1 core. I am not 100% certain if  
>>> this is the case, but I thought I would ask: what is the best way to  
>>> accomplish this?
>>>
>>> Thanks,
>>> Evren
>>
>>
>>
