On Jun 19, 2013, at 7:52 AM, Paul Kapinos <kapi...@rz.rwth-aachen.de> wrote:

> Hello All,
>
> I.
> Using the new Open MPI 1.7.1 we see some messages on the console:
>
> > example mpiext init
> > example mpiext fini
>
> ... on each call to MPI_INIT, MPI_FINALIZE at least in Fortran programs.
>
> Seems somebody forgot to disable some 'printf'-debug-output? =)

This is actually from the mpiext example plugin, not from the Fortran code in OMPI. It's example code, so it has printf's in it.

I'm a little surprised to see that output, though -- I wonder if it's somehow getting enabled when it shouldn't be...? How did you configure/compile Open MPI?

> II.
> In the 1.7.x series, the 'carto' framework has been deleted:
> http://www.open-mpi.org/community/lists/announce/2013/04/0053.php
> > - Removed maffinity, paffinity, and carto frameworks (and associated
> >   MCA params).
>
> Is there some replacement for this? Or, would Open MPI detect the NUMA
> structure of nodes automatically?

Yes. OMPI uses hwloc internally now to figure this stuff out.

> Background: Currently we're using the 'carto' framework on our kinda special
> 'Bull BCS' nodes. Each such node consist of 4 boards with own IB card but
> build a shared memory system. Clearly, communicating should go over the
> nearest IB interface - for this we use 'carto' now.

It should do this automatically in the 1.7 series.

Hmm; I see there isn't any verbose output about which devices it picks, though. :-(

Try this patch, and run with --mca btl_base_verbose 100 and see if you see appropriate devices being mapped to appropriate processes:

Index: mca/btl/openib/btl_openib_component.c
===================================================================
--- mca/btl/openib/btl_openib_component.c	(revision 28652)
+++ mca/btl/openib/btl_openib_component.c	(working copy)
@@ -2712,6 +2712,8 @@
              mca_btl_openib_component.ib_num_btls <
              mca_btl_openib_component.ib_max_btls); i++) {
         if (distance != dev_sorted[i].distance) {
+            BTL_VERBOSE(("openib: skipping device %s; it's too far away",
+                         ibv_get_device_name(dev_sorted[i].ib_dev)));
             break;
         }

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/