Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Geoffrey Irving
No, I don't really have any idea what it's doing. You have to add manual sleeps and attach another instance of gdb if you want to step through the child, since I also can't get follow-fork-mode child to work. I only tried that once, and it segfaulted accessing the zero pointer destroying the
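
The sleep-and-attach approach described above can be sketched roughly as follows. This is an illustration only, not code from the thread: `wait_for_debugger` and `debugger_attached` are hypothetical names. The idea is that the child calls the helper right after fork(), prints its PID, and then sleeps until a second gdb instance attaches (gdb's `attach <pid>` command) and sets the flag, or until the timeout runs out.

```cpp
#include <cstdio>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical flag: set it from the attached debugger with
//   (gdb) set var debugger_attached = 1
volatile int debugger_attached = 0;

// Hypothetical helper: announce our PID, then sleep in one-second steps until
// the flag is set or max_seconds have elapsed. Returns the caller's PID.
pid_t wait_for_debugger(unsigned max_seconds) {
    pid_t pid = getpid();
    std::fprintf(stderr, "PID %ld waiting up to %u s for a debugger\n",
                 static_cast<long>(pid), max_seconds);
    for (unsigned i = 0; i < max_seconds && !debugger_attached; ++i) {
        sleep(1);
    }
    return pid;
}
```

Calling `wait_for_debugger(30)` as the first statement in the child gives roughly thirty seconds to attach; the timeout keeps the child from hanging forever if nobody does.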

Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Ralph Castain
Hmmm...how about that? Yeah, it doesn't work with the devel trunk either - I'd missed that point. No idea why, I'm afraid - never tried it before. Are you sure it "crashes"? I'm still getting a child status of "0", but no message output. My guess is that the I/O is being lost for some reason.

Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Geoffrey Irving
Actually, I don't see it printing "We're an MPI program!" under gdb, which means it isn't working. Geoffrey On Thu, Feb 21, 2013 at 4:07 PM, Ralph Castain wrote: > Hmmm...works with 1.6.4 for me on Mac 10.8.2: > > Ralphs-iMac:v1.6 rhc$ ./fork-bug > We're an MPI program! >

Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Ralph Castain
Hmmm...works with 1.6.4 for me on Mac 10.8.2:

Ralphs-iMac:v1.6 rhc$ ./fork-bug
We're an MPI program!
child status = 0
Ralphs-iMac:v1.6 rhc$ gdb ./fork-bug
GNU gdb 6.3.50-20050815 (Apple version gdb-1820) (Sat Jun 16 02:40:11 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free

Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Geoffrey Irving
The singleton fork/exec itself is fine, since normal MPI programs work under gdb (e.g., fork-bug.c without the fork). gdb has follow-fork-mode set to parent, so it's odd that gdb is looking at the child process's trickery at all. I've confirmed that it's still broken under 1.6.4,

Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Ralph Castain
Singletons fork/exec a daemon to support them - my guess is that gdb may not like it on your machine? FWIW - it runs fine for me using the developer's trunk. You might try with 1.6.4 in case it's a bug in 1.6.0 On Feb 21, 2013, at 3:18 PM, Geoffrey Irving wrote: > The

[OMPI users] openmpi MPI_Init doesn't work after fork under gdb

2013-02-21 Thread Geoffrey Irving
The attached program illustrates the problem. It forks, and the child calls MPI_Init. This works fine unless I'm inside gdb. Inside gdb, MPI_Init silently crashes. I'm using OpenMPI 1.6.0 on Mac 10.8.2. I'm running the program directly, not through mpirun. Any ideas what might be wrong?
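
The attachment is not part of this archive listing, so the following is only a guess at the shape of such a program, written in C++ rather than the original C. `run_in_child` and `child_work` are hypothetical names, and the MPI calls are left as comments so that the sketch compiles without an MPI installation. The printed strings mirror the output quoted elsewhere in this thread.

```cpp
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork, run `work` in the child, and return the child's
// exit code to the parent (-1 on fork/wait failure or abnormal termination).
int run_in_child(int (*work)()) {
    std::fflush(nullptr);              // don't duplicate buffered output in the child
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int rc = work();               // child: this is where MPI_Init would run
        std::fflush(nullptr);
        _exit(rc);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Stand-in for the child's work. With MPI available, uncomment the calls
// below and add #include <mpi.h>.
int child_work() {
    // MPI_Init(nullptr, nullptr);
    std::puts("We're an MPI program!");
    // MPI_Finalize();
    return 0;
}
```

A parent would then report the result with something like `std::printf("child status = %d\n", run_in_child(child_work));`. Returning -1 for an abnormal exit makes a crash in the child visible even when it prints nothing.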

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Jeff Squyres (jsquyres)
If someone wants to submit a patch and/or make 1.6.4 binaries, we could move forward with that. Please do so on the devel list, however -- not the users list (we've been a bit sloppy about separating users/devel recently; let's try to be better). On Feb 21, 2013, at 5:08 PM, Damien Hocking

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Damien Hocking
Found it. The MPI::Datatype class isn't exported in a Win dll (no dllexport wrappers on the class), so on a shared-libs build it's not in the library symbols for anything else to see. The Windows CMAKE "BUILD_SHARED_LIBS" option is therefore busted. On a static lib build everything's in
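
To illustrate what "dllexport wrappers on the class" means, here is a generic sketch of the usual pattern. It is not Open MPI's actual source: the `DEMO_API` macro, the `DEMO_BUILDING_DLL` and `DEMO_USING_DLL` defines, and the `demo::Datatype` class are all hypothetical. Without the export annotation, a class compiled into a Windows DLL is not visible to code that links against it, which would produce the kind of unresolved-symbol error reported in this thread.

```cpp
// Hypothetical export macro. DEMO_BUILDING_DLL / DEMO_USING_DLL are assumed
// build-system defines for this sketch, not names used by Open MPI.
#if defined(_WIN32) && defined(DEMO_BUILDING_DLL)
#  define DEMO_API __declspec(dllexport)   // compiling the DLL itself
#elif defined(_WIN32) && defined(DEMO_USING_DLL)
#  define DEMO_API __declspec(dllimport)   // compiling a client of the DLL
#else
#  define DEMO_API                         // static build or non-Windows
#endif

namespace demo {

// Annotating the class exports all of its member functions from the DLL.
class DEMO_API Datatype {
public:
    void Free();
    bool is_freed() const;
private:
    bool freed_ = false;
};

void Datatype::Free() { freed_ = true; }
bool Datatype::is_freed() const { return freed_; }

}  // namespace demo
```

On a static build the macro expands to nothing, which matches the observation that only the shared-library build is affected.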

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Jeff Squyres (jsquyres)
On Feb 21, 2013, at 10:59 AM, Damien Hocking wrote: > Well this is interesting. The linker can't find that because > MPI::Datatype::Free isn't implemented on the Windows build (in > datatype_inln.h). It's declared in datatype.h though. It's not there in the > Linux

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread damien
More or less. There's just not enough critical mass to keep it going. Damien Sent from my android device. -----Original Message----- From: "Hartman, Todd W." To: 'Open MPI Users' Sent: Thu, 21 Feb 2013 10:13 AM Subject: Re: [OMPI users] Windows C++

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Ralph Castain
On Feb 21, 2013, at 9:13 AM, "Hartman, Todd W." wrote: > Gee, that's too bad. I assumed that the 1.6.4 Windows build was delayed > because it was a lower priority. Do you suppose this position was taken > because there are no developers wishing to keep it alive? Afraid that

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Hartman, Todd W.
Gee, that's too bad. I assumed that the 1.6.4 Windows build was delayed because it was a lower priority. Do you suppose this position was taken because there are no developers wishing to keep it alive? -----Original Message----- From: users-boun...@open-mpi.org

Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Damien Hocking
Well this is interesting. The linker can't find that because MPI::Datatype::Free isn't implemented on the Windows build (in datatype_inln.h). It's declared in datatype.h though. It's not there in the Linux version either, so I don't know where the Linux build is getting that symbol from,

Re: [OMPI users] MPI send recv confusion

2013-02-21 Thread Pradeep Jha
2013/2/21 Gus Correa > two types are the same size, > but I wonder if somehow the two type names are interchangeable > in OpenMPI (I would guess they're not), > although declared > Hello, No, I didn't have to change that. They both work fine for me. Pradeep

Re: [OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel

2013-02-21 Thread Stefan Friedel
On Thu, Feb 21, 2013 at 12:23:14PM +0100, Paul Kapinos wrote: The MTT parameter mess is well known, and the good solution is to set the MTT parameter high. Otherwise you never know what you will get - your application may hang, block the IB interface, run a bit slower, run very slow...

Re: [OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel

2013-02-21 Thread Paul Kapinos
The MTT parameter mess is well known, and the good solution is to set the MTT parameter high. Otherwise you never know what you will get - your application may hang, block the IB interface, run a bit slower, run very slow... http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem
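
For context, the FAQ entry linked above relates the mlx4_core module parameters to the amount of registerable memory, as far as I recall, as max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE. The small sketch below just evaluates that expression; the function name is hypothetical, and the parameter values in the note after it are illustrative, not taken from this thread.

```cpp
#include <cstdint>

// Hypothetical helper: bytes of registerable memory implied by the two
// mlx4_core parameters and the system page size, per the formula above.
constexpr std::uint64_t max_registerable_bytes(unsigned log_num_mtt,
                                               unsigned log_mtts_per_seg,
                                               std::uint64_t page_size) {
    // page_size * 2^log_num_mtt * 2^log_mtts_per_seg, written as shifts.
    return (page_size << log_num_mtt) << log_mtts_per_seg;
}
```

With 4 KiB pages, log_num_mtt=20 and log_mtts_per_seg=3 would give 2^35 bytes (32 GiB), while log_num_mtt=24 would give 2^39 bytes (512 GiB). This is why "setting the MTT parameter high" matters on nodes with a lot of RAM.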

[OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel

2013-02-21 Thread Stefan Friedel
Good morning, I'm struggling with the setup of openmpi-1.6.3 on top of Debian wheezy/testing and mellanox/ofed/mlx4 memory pinning. The cluster is equipped with Mellanox HCAs MT26428, Debian 3.2.35-2 x86_64, 4x8core AMD Opteron 6212, 128G Memory. I'm aware of the FAQ entries about mlx4_core module

[OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

2013-02-21 Thread Hartman, Todd W.
I'm trying to build a simple Open MPI application for Windows. I've installed the binaries for OpenMPI-v1.6.2 (64-bit). I've also installed Visual Studio 2010. The machine(s) are Windows 7 x64. When I attempt to compile a simple program that uses MPI::Send(), I get a linker error saying that