Re: [OMPI users] How can I tell (open-)mpi about the HW topology of my system?

2009-10-26 Thread Jeff Squyres
The short answer is that OMPI currently does not remap ranks during  
MPI_CART_CREATE, even if you pass reorder==1.  :-\


The reason is that we've had very few requests to do so.

However, we did have the good foresight (if I do say so myself ;-) ) to
make the MPI topology system a plugin in Open MPI.  The only plugin
for this system is currently the "do nothing" plugin, but it would
*not* be difficult to write one that actually does something meaningful
for your torus.


If you're interested, I'd be happy to explain how to do it (and we  
should probably move to the devel list).  OMPI doesn't require too  
much framework code; I would guess that the majority of the code would  
actually be implementing whatever algorithms you wanted for your  
torus.  Heck, you could even write a blind-and-dumb algorithm that  
just looks up tables in files based on hostnames in your torus.




On Oct 23, 2009, at 7:54 AM, Luigi Scorzato wrote:


Hi everybody,

The short question is: How can I tell (open-)mpi about the HW
topology of my system?

The longer form is the following: I have a cluster which is
physically connected in a 3D torus topology (say 5x3x2). The nodes
have names node_000, node_001, ... node_421. I can use a rankfile to
assign a fixed MPI rank to each node. E.g.:
rank 0 = node_000
rank 1 = node_001
rank 2 = node_010
...
However, in general, nothing forces e.g. MPI_Cart_create() to build
the 3D grid I want, i.e. coord[node_ijk] = (i,j,k), rather than, say,
coord[node_000] = (0,0,0), coord[node_001] = (1,0,0), coord[node_010] =
(2,0,0), ..., which would be wrongly mapped to the physical topology.

How can I bind at least MPI_Cart_create() to the topology I want? Of
course it would be nice to use an MPI-compliant procedure, if it
exists. If not, I am also happy with something that works at least
with some version of open-mpi.

Note: For some reason too long to explain, I cannot rely on a system
that tests the connections at start-up. But there is no reason to run
these tests, since I know my topology exactly.

Thanks in advance for any help!
Luigi
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Ashley Pittman
On Mon, 2009-10-26 at 16:21 -0400, Jeff Squyres wrote:

> there's a tiny/small amount of overhead inserted by OMPI telling
> Valgrind "this memory region is ok", but we live in an intensely
> competitive HPC environment.

I may be wrong but I seem to remember Julian saying the overhead is
twelve cycles for the valgrind calls.  Of course calculating what to
pass to valgrind may add to this.

> The option to enable this Valgrind Goodness in OMPI is --with-valgrind.
> I *think* the option may be the same for libibverbs, but I don't
> remember offhand.
> 
> That being said, I'm guessing that we still have bunches of other  
> valgrind warnings that may be legitimate.  We can always use some help  
> to stamp out these warnings...  :-)

I note there is a bug open for this; being "Valgrind clean" is a very
desirable feature for any software, and particularly for a library, IMHO.

https://svn.open-mpi.org/trac/ompi/ticket/1720

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk



Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Jed Brown
Jeff Squyres wrote:

> Verbs and Open MPI don't have these options on by default because a)
> you need to compile against Valgrind's header files to get them to
> work, and b) there's a tiny/small amount of overhead inserted by OMPI
> telling Valgrind "this memory region is ok", but we live in an
> intensely competitive HPC environment.

It's certainly competitive, but we spend most of our implementation time
getting things correct rather than tuning.  The huge speed benefits come
from algorithmic advances, and finding bugs quickly makes the
implementation of new algorithms easier.  I'm not arguing that it should
be on by default, but it's helpful to have an environment where the
lower-level libs are valgrind-clean.  These days, I usually revert to
MPICH when hunting something with valgrind, but use OMPI most other
times.

> The option to enable this Valgrind Goodness in OMPI is --with-valgrind. 
> I *think* the option may be the same for libibverbs, but I don't
> remember offhand.

I see plenty of warnings from the sm btl.  Several variations, including
the excessive

--enable-debug --enable-mem-debug --enable-mem-profile \
   --enable-memchecker --with-valgrind=/usr

were not sufficient.  (I think everything in this line except
--with-valgrind increases the number of warnings, but it's nontrivial
with plain --with-valgrind.)


Thanks,

Jed





Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Jeff Squyres
There's a whole class of valgrind warnings that are generated when you  
use OS-bypass networks like OpenFabrics.  The verbs library and Open  
MPI can be configured and compiled with additional instructions that  
tell Valgrind where the "problematic" spots are, and that the memory  
is actually ok (because it's memory that came from outside of  
Valgrind's scope of influence).  Verbs and Open MPI don't have these  
options on by default because a) you need to compile against  
Valgrind's header files to get them to work, and b) there's a tiny/ 
small amount of overhead inserted by OMPI telling Valgrind "this  
memory region is ok", but we live in an intensely competitive HPC  
environment.


The option to enable this Valgrind Goodness in OMPI is
--with-valgrind.  I *think* the option may be the same for libibverbs,
but I don't remember offhand.


That being said, I'm guessing that we still have bunches of other  
valgrind warnings that may be legitimate.  We can always use some help  
to stamp out these warnings...  :-)



On Oct 26, 2009, at 4:09 PM, Jed Brown wrote:


Samuel K. Gutierrez wrote:
> Hi Jed,
>
> I'm not sure if this will help, but it's worth a try.  Turn off OMPI's
> memory wrapper and see what happens.
>
> c-like shell
> setenv OMPI_MCA_memory_ptmalloc2_disable 1
>
> bash-like shell
> export OMPI_MCA_memory_ptmalloc2_disable=1
>
> Also add the following MCA parameter to your run command.
>
> --mca mpi_leave_pinned 0

Thanks for the tip, but these make very little difference.

Jed





--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Jed Brown
Samuel K. Gutierrez wrote:
> Hi Jed,
> 
> I'm not sure if this will help, but it's worth a try.  Turn off OMPI's
> memory wrapper and see what happens.
> 
> c-like shell
> setenv OMPI_MCA_memory_ptmalloc2_disable 1
> 
> bash-like shell
> export OMPI_MCA_memory_ptmalloc2_disable=1
> 
> Also add the following MCA parameter to your run command.
> 
> --mca mpi_leave_pinned 0

Thanks for the tip, but these make very little difference.

Jed





Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Samuel K. Gutierrez

Hi Jed,

I'm not sure if this will help, but it's worth a try.  Turn off OMPI's  
memory wrapper and see what happens.


c-like shell
setenv OMPI_MCA_memory_ptmalloc2_disable 1

bash-like shell
export OMPI_MCA_memory_ptmalloc2_disable=1

Also add the following MCA parameter to your run command.

--mca mpi_leave_pinned 0
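Combined into a single hypothetical invocation (./my_app is a placeholder for the actual MPI program, and the Valgrind wrapping is only needed when hunting warnings):

```shell
# Disable OMPI's ptmalloc2 memory wrapper for this run
export OMPI_MCA_memory_ptmalloc2_disable=1

# Run 2 ranks with leave_pinned off, each process under Valgrind
mpirun -np 2 --mca mpi_leave_pinned 0 \
       valgrind --error-exitcode=1 ./my_app
```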

--
Samuel K. Gutierrez
Los Alamos National Laboratory


On Oct 26, 2009, at 1:41 PM, Jed Brown wrote:


Jeff Squyres wrote:

Using --enable-debug adds in a whole pile of developer-level run-time
checking and whatnot.  You probably don't want that on production runs.


I have found that --enable-debug --enable-memchecker actually produces
more valgrind noise than leaving them off.  Are there options to make
Open MPI strict about initializing and freeing memory?  At one point I
tried to write policy files, but even with judicious globbing, I kept
getting different warnings when run on a different program.  (All these
codes were squeaky-clean under MPICH2.)

Jed





Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Jed Brown
Jeff Squyres wrote:
> Using --enable-debug adds in a whole pile of developer-level run-time
> checking and whatnot.  You probably don't want that on production runs.

I have found that --enable-debug --enable-memchecker actually produces
more valgrind noise than leaving them off.  Are there options to make
Open MPI strict about initializing and freeing memory?  At one point I
tried to write policy files, but even with judicious globbing, I kept
getting different warnings when run on a different program.  (All these
codes were squeaky-clean under MPICH2.)

Jed





Re: [OMPI users] memchecker overhead?

2009-10-26 Thread Brock Palen

On Oct 26, 2009, at 3:29 PM, Jeff Squyres wrote:


On Oct 26, 2009, at 3:23 PM, Brock Palen wrote:


Is there a large overhead for
--enable-debug --enable-memchecker?



--enable-debug, yes, there is a pretty large penalty.  --enable-debug
is really only intended for Open MPI developers.  If you just want an
OMPI that was compiled with debugging symbols, then just add -g to the
CFLAGS/CXXFLAGS in OMPI's configure, perhaps like this:


Interesting, we were just looking at the memchecker functionality and
don't want to double the number of MPI builds we offer.  In the
Debugging FAQ, section 10

http://www.open-mpi.org/faq/?category=debugging#memchecker_how

it says you need --enable-debug to use --enable-memchecker; is this
really the case?




 shell$ ./configure CFLAGS=-g CXXFLAGS=-g ...

Using --enable-debug adds in a whole pile of developer-level run-time
checking and whatnot.  You probably don't want that on production runs.


I'll let the HLRS guys comment on the cost of --enable-memchecker; I  
suspect the answer will be "it depends".


--
Jeff Squyres
jsquy...@cisco.com







Re: [OMPI users] compiling openmpi with mixed Cisco infiniband cards and Mellanox infiniband cards.

2009-10-26 Thread Jeff Squyres

On Oct 16, 2009, at 1:55 PM, nam kim wrote:

Our school has a cluster running over Cisco-based InfiniBand cards
and a switch.

Recently, we purchased more computing nodes with Mellanox cards, since
Cisco has stopped making IB cards.


Sorry for the delay in replying; my INBOX has grown totally out of  
hand recently.  :-(


FWIW, Cisco never made IB HCAs; we simply resold Mellanox HCAs.

Currently, I use openmpi 1.2.8 compiled with a Cisco IB card
(SFS-HCA-320-A1) with the topspin driver.  My questions are:

1. Is it possible to compile the 1.3 version with mixed Cisco IB and
Mellanox IB (MHRH19-XTC) cards using the OpenIB libraries?




Do you mean: is it possible to use Open MPI 1.3.x with a recent OFED
distribution across multiple nodes, some of which include
Cisco-branded HCAs and some of which include Mellanox HCAs?


The answer is: most likely, yes.  Open MPI doesn't fully support  
"heterogeneous" HCAs (e.g., HCAs that would require different MTUs).   
But I suspect that your HCAs are all "close enough" that it won't  
matter.  FWIW, on my 64-node MPI testing cluster at Cisco, I do  
similar things -- I have various Cisco and Mellanox HCAs of different  
generations and specific capabilities, and Open MPI runs fine.


2. Is it possible to compile 1.2.8 with mixed Cisco IB and Mellanox
IB, and if so, how?





If you can, I'd highly suggest upgrading to the Open MPI v1.3 series.

--
Jeff Squyres
jsquy...@cisco.com



[OMPI users] memchecker overhead?

2009-10-26 Thread Brock Palen

Is there a large overhead for
--enable-debug --enable-memchecker?

reading:
http://www.open-mpi.org/faq/?category=debugging

It sounds like there is and there isn't; what should I expect if we
build all of our MPI libraries with those options, when we run normally:


mpirun ./myexe

vs using a library that was not built with those options?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985





Re: [OMPI users] Openmpi not using IB and no warning message

2009-10-26 Thread Jeff Squyres

On Oct 15, 2009, at 2:14 AM, Sangamesh B wrote:


 I've run ibpingpong tests. They are working fine.


Sorry for the delay in replying.

Good.

 Are there any additional tests available which will make sure that  
"there is no problem with IB software and Open MPI. The problem is  
with Application or IB hardware"?


George mentioned the point that using "--mca btl openib,self" will  
only allow OMPI to use those two networks.  So you should be good  
there -- with those command line options, it'll either run on IB or it  
will fail to run if the IB is not working.


Unfortunately, OMPI currently only has a negative acknowledgement when  
you're *not* using high-performance networks -- it doesn't give you a  
positive acknowledgement when it *is* using a high-performance network  
(because this is the much more common case).



Because we've faced some critical problems:

http://www.open-mpi.org/community/lists/users/2009/10/10843.php


This one *appears* to be an application issue.  But there was no  
information provided beyond the initial posting, so it's impossible to  
say.



http://www.open-mpi.org/community/lists/users/2009/09/10700.php


Pasha had a good reply to this post:

http://www.open-mpi.org/community/lists/users/2009/09/10705.php

If he's right (and he usually is :-) ), then one of your IB ports went
from ACTIVE to DOWN during the run, potentially indicating bad
hardware (i.e., Open MPI simply reported the error -- it's
possible/likely that Open MPI didn't *cause* the error).  Pasha
suggested using ibdiagnet to verify your fabric.  Failing that, you
might want to contact your IB/cluster vendor for assistance with a
layer-0 diagnostic of your IB fabric.
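The checks referred to above come from the standard OFED/infiniband-diags tooling; run on a node with those tools installed (interpretation of the output is fabric-specific):

```shell
ibv_devinfo      # per-HCA info; each cabled port should show PORT_ACTIVE
ibdiagnet        # sweeps the whole fabric and reports link/port problems
ibcheckerrors    # summarizes per-port error counters across the fabric
```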


Hope that helps!

--
Jeff Squyres
jsquy...@cisco.com



[OMPI users] segmentation fault: Address not mapped

2009-10-26 Thread Iris Pernille Lohmann
Dear list members

I am using openmpi 1.3.3 with OFED on an HP cluster with Red Hat Linux.

Occasionally (not always) I get a crash with the following message:

[hydra11:09312] *** Process received signal ***
[hydra11:09312] Signal: Segmentation fault (11)
[hydra11:09312] Signal code: Address not mapped (1)
[hydra11:09312] Failing at address: 0xab5f30a8
[hydra11:09312] [ 0] /lib64/libpthread.so.0 [0x3c1400e4c0]
[hydra11:09312] [ 1] 
/home/ipl/openmpi-1.3.3/platforms/hp/lib/libmpi.so.0(MPI_Isend+0x93) 
[0x2af1be45a3e3]
[hydra11:09312] [ 2] ./flow(MP_SendReal+0x60) [0x6bc993]
[hydra11:09312] [ 3] ./flow(SendRealsAlongFaceWithOffset_3D+0x4ab) [0x68ba19]
[hydra11:09312] [ 4] ./flow(MP_SendVertexArrayBlock+0x23d) [0x6891e1]
[hydra11:09312] [ 5] ./flow(MB_CommAllVertex+0x65) [0x6848ba]
[hydra11:09312] [ 6] ./flow(MB_SetupVertexArray+0xd5) [0x68c837]
[hydra11:09312] [ 7] ./flow(MB_SetupGrid+0xa8) [0x68be51]
[hydra11:09312] [ 8] ./flow(SetGrid+0x58) [0x446224]
[hydra11:09312] [ 9] ./flow(main+0x148) [0x43b728]
[hydra11:09312] [10] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c1341d974]
[hydra11:09312] [11] ./flow(__gxx_personality_v0+0xd9) [0x429b19]
[hydra11:09312] *** End of error message ***
--
mpirun noticed that process rank 6 with PID 9312 on node hydra11 exited on 
signal 11 (Segmentation fault).
--

The crash does not appear always - sometimes the application runs fine. 
However, it seems that the crash especially occurs when I run on more than 1 
node.

I have consulted the archive of open-mpi and have found many error messages of 
the same kind, but none from the 1.3.3 version, and none of direct relevance.

I would really appreciate comments on this. Below is the information
required according to the open-mpi web site:

Config.log: attached (config.zip)
Open mpi was configured with prefix and with the path to openib, and with the 
following compiler flags
setenv CC gcc
setenv CFLAGS '-O'
setenv CXX g++
setenv CXXFLAGS '-O'
setenv F77 'gfortran'
setenv FFLAGS '-O'

ompi_info -all:
attached

The application (named flow) was launched on hydra11 by
nohup mpirun -H hydra11,hydra12 -np 8 ./flow caseC.in &

the PATH and LD_LIBRARY_PATH, hydra11 and hydra12:
PATH=/home/ipl/openmpi-1.3.3/platforms/hp/bin
LD_LIBRARY_PATH= /home/ipl/openmpi-1.3.3/platforms/hp/lib

OpenFabrics version: 1.4

Linux:
X86_64-redhat-linux/3.4.6

ibv_devinfo, hydra11: attached
ibv_devinfo, hydra12: attached

ifconfig, hydra11: attached
ifconfig, hydra12: attached

ulimit -l (hydra11): 600
ulimit -l (hydra12): unlimited

Furthermore, I can say that I have not specified any MCA parameters.

The application which I am running (named flow) is linked from fortran, c and
c++ libraries with the following:
/home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicc -DMP -DNS3_ARCH_LINUX
-DLAPACK -I/home/ipl/ns3/engine/include_forLinux
-I/home/ipl/openmpi-1.3.3/platforms/hp/include -c -o user_small_3D.o
user_small_3D.c
rm -f flow
/home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicxx   -o flow  user_small_3D.o  
-L/home/ipl/ns3/engine/lib_forLinux -lns3main -lns3pars -lns3util -lns3vofl 
-lns3turb -lns3solv -lns3mesh -lns3diff -lns3grid -lns3line -lns3data -lns3base 
-lfitpack -lillusolve -lfftpack_small -lfenton -lns3air -lns3dens -lns3poro 
-lns3sedi -llapack_small -lblas_small -lm -lgfortran  
/home/ipl/ns3/engine/lib_Tecplot_forLinux/tecio64.a

Please let me know if you need more info!

Thanks in advance,
Iris Lohmann





Iris Pernille Lohmann
MSc, PhD
Ports & Offshore Technology (POT)

DHI
Agern Allé 5
DK-2970 Hørsholm
Denmark

Tel: +45 4516 9200
Direct: 45169427
i...@dhigroup.com
www.dhigroup.com

WATER * ENVIRONMENT * HEALTH









Re: [OMPI users] MPI-3 Fortran feedback

2009-10-26 Thread Jeff Squyres

On Oct 25, 2009, at 11:38 PM, Steve Kargl wrote:


There is currently a semi-heated debate in comp.lang.fortran
concerning co-arrays and the upcoming Fortran 2008.  Don't
waste your time trying to decipher the thread; however, there
appear to be a few knowledgeable MPI Fortraners hanging out
there lately.  Would Craig mind if I relay the above to c.l.f.?
Of course, if Craig prefers not to veer into USENET, I can
understand his decision.




The more feedback that we get, the better -- I don't have the cycles  
to read usenet, unfortunately.  I don't know if Craig does (but I  
suspect that he does not).  If they can reply here, on the blog post,  
or directly on the MPI-3 Fortran working group mailing list (linked to  
on the blog), that would be awesome.


Thanks!

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] bug in MPI_Cart_create?

2009-10-26 Thread Jeff Squyres
I can confirm that this is fixed on the trunk and will be included
in the upcoming 1.3.4 release.  The code now reads:


re_order = (0 == reorder)? false : true;

Thanks for the heads-up!


On Oct 26, 2009, at 6:40 AM, Kiril Dichev wrote:


Hi David,

I believe this particular bug was fixed in the trunk some weeks ago
shortly before your post.

Regards,
Kiril

On Tue, 2009-10-13 at 17:54 +1100, David Singleton wrote:
> Looking back through the archives, a lot of people have hit error
> messages like
>
>  > [bl302:26556] *** An error occurred in MPI_Cart_create
>  > [bl302:26556] *** on communicator MPI_COMM_WORLD
>  > [bl302:26556] *** MPI_ERR_ARG: invalid argument of some other kind
>  > [bl302:26556] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

>
> One of the reasons people *may* be hitting this is what I believe to
> be an incorrect test in MPI_Cart_create():
>
>  if (0 > reorder || 1 < reorder) {
>      return OMPI_ERRHANDLER_INVOKE (old_comm, MPI_ERR_ARG,
>                                     FUNC_NAME);
>  }
>
> reorder is a "logical" argument and "2.5.2 C bindings" in the MPI 1.3
> standard says:
>
>  Logical flags are integers with value 0 meaning “false” and a
>  non-zero value meaning “true.”
>
> So I'm not sure there should be any argument test.
>
>
> We hit this because we (sorta erroneously) were trying to use a GNU build
> of Open MPI with Intel compilers.  gfortran has true=1 while ifort has
> true=-1.  It seems to all work (by luck, I know) except this test.  Are
> there any other tests like this in Open MPI?
>
> David
--
Dipl.-Inf. Kiril Dichev
Tel.: +49 711 685 60492
E-mail: dic...@hlrs.de
High Performance Computing Center Stuttgart (HLRS)
Universität Stuttgart
70550 Stuttgart
Germany






--
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] MPI-3 Fortran feedback

2009-10-26 Thread Steve Kargl
On Fri, Oct 23, 2009 at 08:53:01AM -0400, Jeff Squyres wrote:
> If you're a Fortran MPI developer, I have a question for you.
> 
> In the MPI-3 Forum, we're working on revamping the Fortran bindings to  
> be "better" (for a variety of definitions of "better").  There's at  
> least one question that we really need some feedback from the MPI  
> Fortran developer community before proceeding.  Craig Rasmussen from  
> Los Alamos National Laboratory, chair of the MPI-3 Fortran Working  
> Group, asked me to post a "request for information" to my blog and  
> pass on the URL to every Fortran MPI programmer that I know:
> 
>   
> http://blogs.cisco.com/ciscotalk/performance/comments/mpi-3_fortran_community_feedback_needed/
> 
> Please go read that entry and let us know what you think.
> 
> Many thanks!
> 

Jeff,

There is currently a semi-heated debate in comp.lang.fortran
concerning co-arrays and the upcoming Fortran 2008.  Don't
waste your time trying to decipher the thread; however, there
appear to be a few knowledgeable MPI Fortraners hanging out
there lately.  Would Craig mind if I relay the above to c.l.f.?
Of course, if Craig prefers not to veer into USENET, I can
understand his decision.

-- 
Steve