[OMPI users] BTL layer

2010-09-22 Thread Gabriele Fatigati
Dear all, I'm tuning the collectives of OpenMPI 1.4.2 with OTPO. I have a small question about the BTL. Is this layer involved only in point-to-point communication, or also in collective routines? I ask because I've noticed that changing some btl parameters like btl_sm_eager_limit and doing one collective

Re: [OMPI users] [openib] segfault when using openib btl

2010-09-22 Thread Eloi Gaudry
Hi Nysal, Thanks for your suggestions. I'm now able to get the checksum computed and redirected to stdout, thanks (I forgot the "-mca pml_base_verbose 5" option, you were right). I haven't been able to observe the segmentation fault (with hdr->tag=0) so far (when using pml csum) but I'll let

Re: [OMPI users] BTL layer

2010-09-22 Thread Jeff Squyres
On Sep 22, 2010, at 3:46 AM, Gabriele Fatigati wrote: > I'm tuning collectives of OpenMPI 1.4.2 with OTPO. I have a little question > about BTL. Is this layer involved just in point-to-point communication or > also in collective routines? > > Because I've noted that changing some btl

Re: [OMPI users] multipath support for infiniband

2010-09-22 Thread Jeff Squyres
Yes, check out the btl_openib_max_lmc MCA parameter: shell$ ompi_info --param btl openib --parsable | grep lmc On Sep 21, 2010, at 11:45 AM, Jens Domke wrote: > Hello, > > the InfiniBand architecture has an LMC feature to assign multiple virtual LIDs > to one port and so provides multiple
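
The `grep lmc` step in the command above can also be done programmatically. The following is a minimal sketch, assuming the parsable output uses colon-separated lines of the form `mca:<framework>:<component>:param:<name>:<field>:<data>`; that layout and the sample text are assumptions for illustration, not output captured from a real Open MPI 1.4.2 install. The helper names `run_ompi_info` and `lmc_params` are hypothetical.

```python
import subprocess


def run_ompi_info():
    """Run the command quoted in the message and return its stdout.

    Requires an Open MPI installation with ompi_info on PATH.
    """
    result = subprocess.run(
        ["ompi_info", "--param", "btl", "openib", "--parsable"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def lmc_params(parsable_text):
    """Return {param_name: value} for MCA parameters whose name contains 'lmc'."""
    params = {}
    for line in parsable_text.splitlines():
        fields = line.strip().split(":")
        # Assumed layout: mca:<framework>:<component>:param:<name>:<field>:<data>
        if len(fields) < 7 or fields[3] != "param":
            continue
        name, field = fields[4], fields[5]
        if "lmc" in name and field == "value":
            # Re-join in case the value itself contains a colon.
            params[name] = ":".join(fields[6:])
    return params


# Hypothetical sample text, not captured from a real run.
SAMPLE = """\
mca:btl:openib:param:btl_openib_max_lmc:value:0
mca:btl:openib:param:btl_openib_max_lmc:data_source:default value
mca:btl:openib:param:btl_openib_eager_limit:value:12288
"""

if __name__ == "__main__":
    print(lmc_params(SAMPLE))
```

On a machine with Open MPI installed, `lmc_params(run_ompi_info())` would apply the same filter to the live output, provided the assumed line layout holds for that version.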

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Jeff Squyres
This is a problem with the Pathscale compiler and old versions of GCC. See: http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c%2B%2B-api I note that you said you're already using GCC 4.x, but it's not clear from your text whether pathscale is using that compiler or

Re: [OMPI users] OpenMPI on the ARM processor architecture?

2010-09-22 Thread Jeff Squyres
On Sep 20, 2010, at 2:14 PM, Ken Mighell wrote: > Has there been any consideration of porting OpenMPI to the ARM processor? I don't believe that anyone is actively working on this, but I could be wrong. > Plans are afoot to launch 7 ARM processors on a "Stage Coach" card in a 3U > CubeSat.

Re: [OMPI users] multipath support for infiniband

2010-09-22 Thread Jens Domke
I already tried this parameter, but I don't see any improvements in the benchmarks. Additionally while doing further investigations into the opensm I didn't see the QP requests for other LIDs than the base LIDs. Regards Jens Jeff Squyres wrote: Yes, check out the btl_openib_max_lmc MCA

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
On Wed, 2010-09-22 at 07:42 -0400, Jeff Squyres wrote: > This is a problem with the Pathscale compiler and old versions of GCC. See: > > > http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c%2B%2B-api > > I note that you said you're already using GCC 4.x, but it's

Re: [OMPI users] multipath support for infiniband

2010-09-22 Thread Jeff Squyres
On Sep 22, 2010, at 8:04 AM, Jens Domke wrote: > I already tried this parameter, but I don't see any improvements in the > benchmarks. Additionally while doing further investigations into the opensm I > didn't see the QP requests for other LIDs than the base LIDs. I'm afraid that I'm not an

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
On Wed, 2010-09-22 at 14:16 +0200, Ake Sandgren wrote: > On Wed, 2010-09-22 at 07:42 -0400, Jeff Squyres wrote: > > This is a problem with the Pathscale compiler and old versions of GCC. See: > > > > > > http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c%2B%2B-api >

Re: [OMPI users] MPI_Reduce performance

2010-09-22 Thread Jeff Squyres
On Sep 9, 2010, at 4:31 PM, Ashley Pittman wrote: >> What is the exact semantics of an asynchronous barrier, > > I'm not sure of the exact semantics but once you've got your head around the > concept it's fairly simple to understand how to use it, you call > MPI_IBarrier() and it gives you a

Re: [OMPI users] Continued functionality across a SLES10 to SLES11 upgrade ...

2010-09-22 Thread Jeff Squyres
On Sep 20, 2010, at 1:20 PM, Richard Walsh wrote: > I was not expecting things to work, and find that codes compiled using > OpenMPI 1.4.1 commands under SLES 10.2 produce the following message > when run under SLES11: > > mca: base: component_find: unable to open >

Re: [OMPI users] Continued functionality across a SLES10 to SLES11 upgrade ...

2010-09-22 Thread Richard Walsh
Jeff Squyres wrote: > Probably your best bet would be: > > - investigate if there's a missing symbol or library in the current mca_btl_openib.so (e.g., run nm on mca_btl_openib.so and ensure that all those libraries are present in SLES 11) > - if it's a missing library, see if you can
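
One way to approach the "missing library" check is sketched below. Note that it swaps in `ldd` rather than the `nm` suggested in the quoted message: `ldd` lists the shared libraries a plugin needs and marks unresolved ones with `=> not found`. The helper name `missing_libraries` and the sample text are hypothetical; the sample is not output from a real SLES 11 system.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the vdso or dynamic loader lines
        name, _, target = line.partition("=>")
        if target.strip().startswith("not found"):
            missing.append(name.strip())
    return missing


# Hypothetical ldd output for illustration only.
SAMPLE_LDD = (
    "\tlibibverbs.so.1 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f3a1c000000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f3a1c400000)\n"
)

if __name__ == "__main__":
    for lib in missing_libraries(SAMPLE_LDD):
        print("missing:", lib)
```

To use it on a real system, feed it the text produced by running `ldd` on the mca_btl_openib.so file from the affected install.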

Re: [OMPI users] Continued functionality across a SLES10 to SLES11 upgrade ...

2010-09-22 Thread Jeff Squyres
On Sep 22, 2010, at 9:50 AM, Richard Walsh wrote: > The implication of your reply > is that if the symbols/libraries are all there then things should work. *IF* SLES provides binary compatibility guarantees between 10.2 and 11. If not, then it *may* work, and/or it may fail in mysterious

Re: [OMPI users] function fgets hangs a mpi program when it is used ompi-ps command

2010-09-22 Thread Jeff Squyres
Are you running on machines with OpenFabrics devices (that Open MPI is using)? Is ompi-ps printing 100 bytes or more? What does ps show when your program is hung? On Sep 17, 2010, at 3:13 PM, Matheus Bersot Siqueira Barros wrote: > Open MPI Version = 1.4.2 > OS = Ubuntu 10.04 LTS and CentOS

Re: [OMPI users] function fgets hangs a mpi program when it is used ompi-ps command

2010-09-22 Thread Ralph Castain
Printouts of less than 100 bytes would be unusual...but possible On Wed, Sep 22, 2010 at 8:15 AM, Jeff Squyres wrote: > Are you running on machines with OpenFabrics devices (that Open MPI is > using)? > > Is ompi-ps printing 100 bytes or more? > > What does ps show when your

Re: [OMPI users] OpenMPI on the ARM processor architecture?

2010-09-22 Thread Dave Love
Jeff Squyres writes: > I believe that the first step would be to get some assembly for the > ARM platform for some of OMPI's key routines (locks, atomics, etc.). > Beyond that, it *might* "just work"...? Is http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579505

Re: [OMPI users] OpenMPI on the ARM processor architecture?

2010-09-22 Thread Jeff Squyres
Yes, the built-in GCC atomics might work. I don't know if anyone has tried them; they would be most useful because they would allow us to use multiple different platforms. Patches would definitely be appreciated here. On Sep 22, 2010, at 12:25 PM, Dave Love wrote: > Jeff Squyres

Re: [OMPI users] BTL layer

2010-09-22 Thread Gabriele Fatigati
Thanks Jeff. And what about RDMA? Does it work only with point-to-point, or also with collectives? 2010/9/22 Jeff Squyres > On Sep 22, 2010, at 3:46 AM, Gabriele Fatigati wrote: > > > i'm tuning collectives of OpenMPI 1.4.2 with OTPO. I have a little > question about BTL. This

Re: [OMPI users] BTL layer

2010-09-22 Thread Jeff Squyres
On Sep 22, 2010, at 1:53 PM, Gabriele Fatigati wrote: > Thanks Jeff, > > and.. what about RDMA? It works only with point-to-point or also with > collectives? When collectives use the point-to-point sends (e.g., they effectively invoke MPI_SEND as part of MPI_BCAST), that will do
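
To illustrate how a collective can be built from point-to-point sends, here is a minimal sketch that computes the send schedule of a binomial-tree broadcast. This is a textbook scheme chosen for illustration; it is an assumption, not a statement of which algorithm Open MPI 1.4.2 actually selects for MPI_BCAST. The function name `binomial_bcast_schedule` is hypothetical, and no MPI calls are made: each `(src, dst)` pair stands for one point-to-point send that would travel through the same layer as an ordinary MPI_SEND.

```python
def binomial_bcast_schedule(nprocs, root=0):
    """Return a list of rounds; each round is a list of (src, dst) sends.

    In round k, every rank that already holds the data forwards it to the
    rank 2**k positions further on (relative to the root).
    """
    rounds = []
    mask = 1
    while mask < nprocs:
        sends = []
        for vrank in range(mask):  # virtual ranks that already hold the data
            peer = vrank + mask
            if peer < nprocs:
                # Map virtual ranks back to real ranks by rotating by root.
                sends.append(((vrank + root) % nprocs, (peer + root) % nprocs))
        rounds.append(sends)
        mask <<= 1
    return rounds


if __name__ == "__main__":
    for i, sends in enumerate(binomial_bcast_schedule(4)):
        print("round", i, sends)
```

The schedule always contains `nprocs - 1` sends in total, which is why tuning point-to-point parameters can change the measured cost of a collective that is implemented this way.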