Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread tmishima
Hi Ralph, Thank you very much. I tried many things, such as: mpirun -np 2 -host node05 -report-bindings -mca rmaps lama -mca rmaps_lama_bind 1c myprog But every attempt failed. As far as I remember, at least they were accepted by openmpi-1.7.3. Anyway, please check it when you have time, because

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread Ralph Castain
I'll try to take a look at it - my expectation is that lama might get stuck because you didn't tell it a pattern to map, and I doubt that code path has seen much testing. On Dec 20, 2013, at 5:52 PM, tmish...@jcity.maeda.co.jp wrote: > Hi Ralph, I'm glad to hear that, thanks. > By

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread tmishima
Hi Ralph, I'm glad to hear that, thanks. By the way, yesterday I tried to check how lama in 1.7.4rc treats NUMA nodes. Even with this simple command line, it froze without any message: mpirun -np 2 -host node05 -mca rmaps lama myprog Could you check what happened? Is it better to

Re: [OMPI users] EXTERNAL: Re: What's the status of OpenMPI and thread safety?

2013-12-20 Thread Ralph Castain
Hi Ed, FWIW: Intel MPI has better thread support, though you'll lose some features. I don't know what NICs you have, but have you tried running with the MTLs instead of the openib BTL? Both the psm and mxm MTLs are supposed to be thread safe, and will outperform the openib BTL anyway. Not

Re: [OMPI users] MPI_Comm_spawn and exported variables

2013-12-20 Thread Ralph Castain
Funny, but I couldn't find the code path that supported that in the latest 1.6 series release (didn't check earlier ones) - but no matter, it seems logical enough. Fixed in the trunk and cmr'd to 1.7.4. Thanks! Ralph On Dec 19, 2013, at 8:08 PM, Tim Miller wrote: > Hi

Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-20 Thread Ralph Castain
Hooray! On Dec 19, 2013, at 10:14 PM, tmish...@jcity.maeda.co.jp wrote: > Hi Ralph, > Thank you for your fix. It works for me. > Tetsuya Mishima >> Actually, it looks like it would happen with hetero-nodes set - only required that at least two nodes have the same

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread Ralph Castain
I'll make it work so that NUMA can be either above or below socket. On Dec 20, 2013, at 2:57 AM, tmish...@jcity.maeda.co.jp wrote: > Hi Brice, > Thank you for your comment. I understand what you mean. > My opinion was made just considering an easy way to adjust the code for inversion

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread tmishima
Hi Brice, Thank you for your comment. I understand what you mean. My opinion was made just considering an easy way to adjust the code for inversion of the hierarchy in the object tree. Tetsuya Mishima > I don't think there's any such difference. > Also, all these NUMA architectures are reported the

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread Brice Goglin
I don't think there's any such difference. Also, all these NUMA architectures are reported the same by hwloc, and therefore used the same in Open MPI. And yes, L3 and NUMA are topologically identical on AMD Magny-Cours (and most recent AMD and Intel platforms). Brice On 20/12/2013 11:33,

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-20 Thread tmishima
Hi Ralph, The NUMA node in AMD Magny-Cours/Interlagos is so-called ccNUMA (cache-coherent NUMA), which seems to be a little bit different from the traditional NUMA defined in openmpi. I notice that the ccNUMA object is almost the same as the L3cache object. So "-bind-to l3cache" or "-map-by l3cache" is valid

Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-20 Thread tmishima
Hi Ralph, Thank you for your fix. It works for me. Tetsuya Mishima > Actually, it looks like it would happen with hetero-nodes set - only required that at least two nodes have the same architecture. So you might want to give the trunk a shot as it may well now be fixed. > On Dec 19,