OK, you can commit it. The whole problem comes from the word "procs": in the source code it is used to mean both "processes" and "cores".

Thank you for your help.
Damien

On 01/12/2010 10:47, Ralph Castain wrote:
I just checked and it appears bycore does correctly translate to byslot. So 
your patch does indeed appear to be correct. If you don't mind, I'm going to 
apply it for you as I'm working on a correction for how we handle oversubscribe 
flags, and I want to ensure the patch gets included so we compute oversubscribe 
correctly.

Thanks for catching this!

On Nov 30, 2010, at 10:33 PM, Ralph Castain wrote:

Afraid I don't speak much slurm any more (thank goodness!).

From your output, it looks like the system is mapping bynode instead of byslot. IIRC, isn't bycore 
just supposed to be a synonym for byslot? So perhaps the problem is that "bycore" 
causes us to set the "bynode" flag by mistake. Did you check that?
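
To make the byslot/bynode distinction concrete, here is a minimal sketch of the two policies. It is not Open MPI code: the function names and the 4-slots-per-node figure are assumptions chosen for illustration, and the bynode version ignores slot limits for simplicity.

```python
def map_byslot(num_ranks, nodes, slots_per_node):
    """Fill every slot on a node before moving on to the next node."""
    mapping = []
    for node in nodes:
        for _ in range(slots_per_node):
            if len(mapping) == num_ranks:
                return mapping
            mapping.append(node)
    return mapping


def map_bynode(num_ranks, nodes):
    """Deal ranks out round-robin, one rank per node on each pass."""
    return [nodes[rank % len(nodes)] for rank in range(num_ranks)]


if __name__ == "__main__":
    nodes = ["compute18", "compute19"]
    for rank, host in enumerate(map_byslot(8, nodes, slots_per_node=4)):
        print(f"byslot rank {rank}: {host}")
    for rank, host in enumerate(map_bynode(8, nodes)):
        print(f"bynode rank {rank}: {host}")
```

With two nodes and eight ranks, the byslot sketch places ranks 0-3 on the first node and 4-7 on the second, while the bynode sketch alternates between the two nodes.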

BTW: when running cpus-per-proc, a slot doesn't have X processes. I suspect 
this is just a language thing, but it will create confusion. A slot consists of 
X cpus - we still assign only one process to each slot.
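
The arithmetic behind that statement can be sketched as follows. This is only an illustration: the helper name is hypothetical, and the figures (32 cores per node, 8 cpus per process) are assumed from the example in the original report.

```python
def slots_on_node(cores_per_node, cpus_per_proc):
    """Number of slots on a node when each slot is made of cpus_per_proc cpus.

    Each slot still receives exactly one process, so this is also the
    number of processes the node can hold without oversubscribing.
    """
    if cpus_per_proc <= 0:
        raise ValueError("cpus_per_proc must be a positive integer")
    return cores_per_node // cpus_per_proc


if __name__ == "__main__":
    # Assumed figures: 32 cores per node, 8 cpus per process.
    print(slots_on_node(32, 8))
```

Under these assumptions each node offers 4 slots, so 8 processes fit on two nodes with one process per slot.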

On Nov 30, 2010, at 10:47 AM, Damien Guinier wrote:

Hi all,

Most of the time, there is no difference between "proc" and "slot". But when you use 
"mpirun -cpus-per-proc X", a slot has X procs.
In orte/mca/rmaps/base/rmaps_base_common_mappers.c, proc and slot are 
confused. This small error affects the mapping:

With the latest OMPI version, on compute nodes with 32 cores:
salloc -n 8 -c 8 mpirun -bind-to-core -bycore ./a.out
[rank:0]<stdout>: host:compute18
[rank:1]<stdout>: host:compute19
[rank:2]<stdout>: host:compute18
[rank:3]<stdout>: host:compute19
[rank:4]<stdout>: host:compute18
[rank:5]<stdout>: host:compute19
[rank:6]<stdout>: host:compute18
[rank:7]<stdout>: host:compute19

With the patch:
[rank:0]<stdout>: host:compute18
[rank:1]<stdout>: host:compute18
[rank:2]<stdout>: host:compute18
[rank:3]<stdout>: host:compute18
[rank:4]<stdout>: host:compute19
[rank:5]<stdout>: host:compute19
[rank:6]<stdout>: host:compute19
[rank:7]<stdout>: host:compute19

Can you tell me whether my patch is correct?

Thank you,

Damien

<patch_cpu_per_rank.txt>
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
