Re: [OMPI users] Isend, Recv and Test

2016-05-05 Thread Zhen Wang
Jeff, Thanks. Best regards, Zhen On Thu, May 5, 2016 at 8:45 PM, Jeff Squyres (jsquyres) wrote: > It's taking so long because you are sleeping for .1 second between calls to > MPI_Test(). > > The TCP transport is only sending a few fragments of your message during > each

Re: [OMPI users] Isend, Recv and Test

2016-05-05 Thread Jeff Squyres (jsquyres)
It's taking so long because you are sleeping for .1 second between calls to MPI_Test(). The TCP transport is only sending a few fragments of your message during each iteration through MPI_Test (because, by definition, it has to return "immediately"). Other transports do better handing off
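
For context, the pattern under discussion looks roughly like the following minimal sketch (an illustration, not the original poster's code; the executable name in the run comment is arbitrary): rank 0 posts a nonblocking send and then polls it with MPI_Test, sleeping 0.1 s between polls, while rank 1 posts a matching blocking receive. Over the TCP BTL, each MPI_Test call only pushes a few more fragments, so most of the wall-clock time is spent in usleep().

    /* Illustrative sketch only; run with: mpirun -np 2 ./isend_test */
    #include <mpi.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int count = 4 * 1024 * 1024;      /* a "large" message: 4M ints */
        int *buf = calloc(count, sizeof(int));

        if (rank == 0) {
            MPI_Request req;
            int done = 0;
            MPI_Isend(buf, count, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            while (!done) {
                MPI_Test(&req, &done, MPI_STATUS_IGNORE); /* progress happens inside this call */
                if (!done)
                    usleep(100000);                       /* sleep 0.1 s between polls */
            }
        } else if (rank == 1) {
            MPI_Recv(buf, count, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }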

Re: [OMPI users] Problems using 1.10.2 with MOFED 3.1-1.1.0.1

2016-05-05 Thread Andy Riebs
Sorry, my output listing was incomplete -- the program did run after the "No OpenFabrics" message, but (I presume) ran over Ethernet rather than InfiniBand. So I can't really say what was causing it to fail. Andy On 05/05/2016 06:09 PM, Nathan Hjelm

Re: [OMPI users] Isend, Recv and Test

2016-05-05 Thread Zhen Wang
2016-05-05 9:27 GMT-05:00 Gilles Gouaillardet : > Out of curiosity, can you try > mpirun --mca btl self,sm ... > Same as before. Many MPI_Test calls. > and > mpirun --mca btl self,vader ... > A requested component was not found, or was unable to be opened. This

Re: [OMPI users] Problems using 1.10.2 with MOFED 3.1-1.1.0.1

2016-05-05 Thread Nathan Hjelm
It should work fine with ob1 (the default). Did you determine what was causing it to fail? -Nathan On Thu, May 05, 2016 at 06:04:55PM -0400, Andy Riebs wrote: >For anyone like me who happens to google this in the future, the solution >was to set OMPI_MCA_pml=yalla > >Many thanks

Re: [OMPI users] Problems using 1.10.2 with MOFED 3.1-1.1.0.1

2016-05-05 Thread Andy Riebs
For anyone like me who happens to google this in the future, the solution was to set OMPI_MCA_pml=yalla. Many thanks, Josh! On 05/05/2016 12:52 PM, Joshua Ladd wrote: We are working with Andy offline.

Re: [OMPI users] Problems using 1.10.2 with MOFED 3.1-1.1.0.1

2016-05-05 Thread Joshua Ladd
We are working with Andy offline. Josh On Thu, May 5, 2016 at 7:32 AM, Andy Riebs wrote: > I've built 1.10.2 with all my favorite configuration options, but I get > messages such as this (one for each rank with orte_base_help_aggregate=0) > when I try to run on a MOFED

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Jeff Squyres (jsquyres)
Giacomo -- Are you able to run anything that is compiled by that Intel compiler installation? > On May 5, 2016, at 12:02 PM, Gus Correa wrote: > > Hi Giacomo > > Some programs fail with segmentation fault > because the stack size is too small. > [But others because

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Gus Correa
Hi Giacomo, Some programs fail with a segmentation fault because the stack size is too small. [But others fail because of bugs in memory allocation/management, etc.] Have you tried ulimit -s unlimited before you run the program? Are you using a single machine or a cluster? If you're using InfiniBand
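
The stack limit in question can also be inspected, and raised up to the hard limit, from inside a C program with getrlimit/setrlimit. A minimal sketch follows; running ulimit -s unlimited in the shell before launching, as suggested above, remains the usual approach.

    /* Sketch: inspect and raise the per-process stack soft limit (Linux/Unix). */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_STACK, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("stack soft limit: %llu, hard limit: %llu (RLIM_INFINITY prints as a huge value)\n",
               (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);

        rl.rlim_cur = rl.rlim_max;   /* raise the soft limit as far as the hard limit allows */
        if (setrlimit(RLIMIT_STACK, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }
        return 0;
    }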

Re: [OMPI users] Isend, Recv and Test

2016-05-05 Thread Gilles Gouaillardet
Out of curiosity, can you try mpirun --mca btl self,sm ... and mpirun --mca btl self,vader ... and see if one performs better than the other? Cheers, Gilles On Thursday, May 5, 2016, Zhen Wang wrote: > Gilles, > > Thanks for your reply. > > Best regards, > Zhen > > On Wed,

Re: [OMPI users] [open-mpi/ompi] COMM_SPAWN broken on Solaris/v1.10 (#1569)

2016-05-05 Thread Siegmar Gross
Hi Gilles, is the following output helpful to find the error? I've put another output below the output from gdb, which shows that things are a little bit "random" if I use only 3+2 or 4+1 Sparc machines. tyr spawn 127 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec GNU gdb (GDB) 7.6.1 Copyright

Re: [OMPI users] Isend, Recv and Test

2016-05-05 Thread Zhen Wang
Gilles, Thanks for your reply. Best regards, Zhen On Wed, May 4, 2016 at 8:43 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: > Note there is no progress thread in openmpi 1.10. > From a pragmatic point of view, that means that for "large" messages, no > data is sent in
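
The quoted point, that Open MPI 1.10 has no asynchronous progress thread, means data for a large MPI_Isend only moves while the application is inside an MPI call. A minimal variant of the earlier sketch (illustration only, not code from this thread) avoids the 0.1 s stalls by blocking in MPI_Wait; calling MPI_Test frequently without sleeping would behave similarly.

    /* Illustration only; run with: mpirun -np 2 ./isend_wait */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int count = 4 * 1024 * 1024;
        int *buf = calloc(count, sizeof(int));

        if (rank == 0) {
            MPI_Request req;
            MPI_Isend(buf, count, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            /* With no progress thread, blocking here lets the library keep
             * sending fragments until the transfer completes. */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, count, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }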

[OMPI users] Problems using 1.10.2 with MOFED 3.1-1.1.0.1

2016-05-05 Thread Andy Riebs
I've built 1.10.2 with all my favorite configuration options, but I get messages such as this (one for each rank with orte_base_help_aggregate=0) when I try to run on a MOFED system: $ shmemrun -H hades02,hades03 $PWD/shmem.out

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Giacomo Rossi
gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 GNU gdb (GDB) 7.11 Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the

Re: [OMPI users] [open-mpi/ompi] COMM_SPAWN broken on Solaris/v1.10 (#1569)

2016-05-05 Thread Gilles Gouaillardet
Siegmar, is this Solaris 10 specific (e.g. does Solaris 11 work fine)? (I only have an x86_64 VM with Solaris 11 and the Sun compilers ...) Cheers, Gilles On Thursday, May 5, 2016, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote: > Hi Ralph and Gilles, > > On 04.05.2016 at 20:02, rhc54 wrote

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Gilles Gouaillardet
Giacomo, one option (if your shell is bash) is to run ulimit -c unlimited and then mpif90 -v; you should get a core file. Another option is to run gdb /.../mpif90, then r -v, then bt. Cheers, Gilles On Thursday, May 5, 2016, Giacomo Rossi wrote: > Here is the result of the ldd command: > ldd

Re: [OMPI users] [open-mpi/ompi] COMM_SPAWN broken on Solaris/v1.10 (#1569)

2016-05-05 Thread Siegmar Gross
Hi Ralph and Gilles, On 04.05.2016 at 20:02, rhc54 wrote: @ggouaillardet Where does this stand? -- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Giacomo Rossi
Here is the result of the ldd command: ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 linux-vdso.so.1 (0x7ffcacbbe000) libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x7fa9597a9000) libm.so.6 => /usr/lib/libm.so.6 (0x7fa9594a4000) libpciaccess.so.0 =>

[OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-05-05 Thread Gilles Gouaillardet
Giacomo, could you also open the core file with gdb and post the backtrace? Can you also ldd mpif90 and confirm that no Intel MPI library is used? By the way, the Open MPI Fortran wrapper is now mpifort. Cheers, Gilles On Thursday, May 5, 2016, Giacomo Rossi