Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
No, I don't really have any idea what it's doing. You have to add manual sleeps and attach another instance of gdb if you want to step through the child, since I also can't get follow-fork-mode child to work. I only tried that once, and it segfaulted on a null-pointer access, destroying the stack in the process.

I haven't done manual stepping or printf searches, so I don't know where in MPI_Init it breaks. I'm trying to install gdb 7.5.1 now (a nontrivial process on Mac, unfortunately) to see if that fixes it.

Geoffrey

On Thu, Feb 21, 2013 at 4:28 PM, Ralph Castain wrote:
> Hmmm...how about that? Yeah, it doesn't work with the devel trunk either -
> I'd missed that point.
>
> No idea why, I'm afraid - never tried it before. Are you sure it "crashes"?
> I'm still getting a child status of "0", but no message output. My guess is
> that the I/O is being lost for some reason.
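The "manual sleeps, then attach a second gdb" approach mentioned above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name wait_for_debugger, the flag debugger_attached, and the environment variable WAIT_FOR_DEBUGGER are all made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical flag: a debugger attached to this process flips it to 1
 * (e.g. with gdb's "set var debugger_attached = 1") to release the loop. */
static volatile int debugger_attached = 0;

/* Hypothetical helper. If the made-up environment variable
 * WAIT_FOR_DEBUGGER is set, print our pid and sleep in one-second steps
 * until the flag is set or max_seconds have passed.
 * Returns 1 if a debugger released us, 0 otherwise. */
int wait_for_debugger(int max_seconds)
{
    if (getenv("WAIT_FOR_DEBUGGER") == NULL)
        return 0;                       /* feature disabled: do not wait */

    fprintf(stderr, "pid %d waiting for debugger\n", (int)getpid());
    for (int i = 0; i < max_seconds && !debugger_attached; i++)
        sleep(1);
    return debugger_attached;
}
```

One way to use it would be to call wait_for_debugger(60) at the top of the child branch, before MPI_Init, then run "attach <pid>" in a second gdb with the printed pid, set the flag, and continue from there.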
Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
Hmmm...how about that? Yeah, it doesn't work with the devel trunk either - I'd missed that point.

No idea why, I'm afraid - never tried it before. Are you sure it "crashes"? I'm still getting a child status of "0", but no message output. My guess is that the I/O is being lost for some reason.

On Feb 21, 2013, at 4:09 PM, Geoffrey Irving wrote:
> Actually, I don't see it printing "We're an MPI program!" under gdb,
> which means it isn't working.
>
> Geoffrey
Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
Actually, I don't see it printing "We're an MPI program!" under gdb, which means it isn't working.

Geoffrey

On Thu, Feb 21, 2013 at 4:07 PM, Ralph Castain wrote:
> Hmmm...works with 1.6.4 for me on Mac 10.8.2:
>
> Ralphs-iMac:v1.6 rhc$ ./fork-bug
> We're an MPI program!
> child status = 0
> Ralphs-iMac:v1.6 rhc$ gdb ./fork-bug
> [...]
> (gdb) r
> Starting program: /Volumes/RHCHD/rhc/openmpi/v1.6/fork-bug
> Reading symbols for shared libraries ++. done
> child status = 0
>
> Program exited normally.
> (gdb)
>
> Afraid I have no idea why you might be hitting the problem, though...
Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
Hmmm...works with 1.6.4 for me on Mac 10.8.2:

Ralphs-iMac:v1.6 rhc$ ./fork-bug
We're an MPI program!
child status = 0
Ralphs-iMac:v1.6 rhc$ gdb ./fork-bug
GNU gdb 6.3.50-20050815 (Apple version gdb-1820) (Sat Jun 16 02:40:11 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) r
Starting program: /Volumes/RHCHD/rhc/openmpi/v1.6/fork-bug
Reading symbols for shared libraries ++. done
child status = 0

Program exited normally.
(gdb)

Afraid I have no idea why you might be hitting the problem, though...

On Feb 21, 2013, at 3:50 PM, Geoffrey Irving wrote:
> The singleton fork/exec itself is fine, since normal MPI programs work
> under gdb (e.g., fork-bug.c without the fork). gdb has
> follow-fork-mode set to parent, so it's odd that gdb is looking at the
> child process's trickery at all.
>
> I've confirmed that it's still broken under 1.6.4, unfortunately.
>
> Geoffrey
Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
The singleton fork/exec itself is fine, since normal MPI programs work under gdb (e.g., fork-bug.c without the fork). gdb has follow-fork-mode set to parent, so it's odd that gdb is looking at the child process's trickery at all.

I've confirmed that it's still broken under 1.6.4, unfortunately.

Geoffrey

On Thu, Feb 21, 2013 at 3:36 PM, Ralph Castain wrote:
> Singletons fork/exec a daemon to support them - my guess is that gdb may not
> like it on your machine?
>
> FWIW - it runs fine for me using the developer's trunk. You might try with
> 1.6.4 in case it's a bug in 1.6.0
Re: [OMPI users] openmpi MPI_Init doesn't work after fork under gdb
Singletons fork/exec a daemon to support them - my guess is that gdb may not like it on your machine?

FWIW - it runs fine for me using the developer's trunk. You might try with 1.6.4 in case it's a bug in 1.6.0.

On Feb 21, 2013, at 3:18 PM, Geoffrey Irving wrote:
> The attached program illustrates the problem. It forks, and the child
> calls MPI_Init. This works fine unless I'm inside gdb. Inside gdb,
> MPI_Init silently crashes.
>
> I'm using OpenMPI 1.6.0 on Mac 10.8.2. I'm running the program
> directly, not through mpirun.
>
> Any ideas what might be wrong?
>
> Thanks,
> Geoffrey
> [...]
[OMPI users] openmpi MPI_Init doesn't work after fork under gdb
The attached program illustrates the problem. It forks, and the child calls MPI_Init. This works fine unless I'm inside gdb. Inside gdb, MPI_Init silently crashes.

I'm using OpenMPI 1.6.0 on Mac 10.8.2. I'm running the program directly, not through mpirun.

Any ideas what might be wrong?

Thanks,
Geoffrey

cone:scratch% /usr/local/bin/mpicc -o fork-bug fork-bug.c
cone:scratch% ./fork-bug
We're an MPI program!
child status = 0
cone:scratch% gdb ./fork-bug
gdb ./fork-bug
GNU gdb 6.3.50-20050815 (Apple version gdb-1824) (Thu Nov 15 10:42:43 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) run
Reading symbols for shared libraries ++. done
child status = 5

Program exited normally.
(gdb) cone:scratch%

// Headers required by the calls below
#include <errno.h>
#include <mpi.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define CAREFUL 0

#if CAREFUL

// Variant that checks every return value
int main(int argc, char** argv) {
  int pid = fork();
  if (pid < 0) {
    printf("fork failed: %s\n",strerror(errno));
    return 1;
  }
  if (pid) {
    // Parent: wait for the child and print its raw wait status
    int status;
    pid = wait(&status);
    if (pid < 0) {
      printf("wait failed: %s\n",strerror(errno));
      return 1;
    }
    printf("child status = %d\n",status);
  } else {
    // Child: start and stop MPI as a singleton
    int r = MPI_Init(&argc,&argv);
    if (r != MPI_SUCCESS) {
      printf("MPI_Init failed: r = %d\n",r);
      return 1;
    }
    printf("We're an MPI program!\n");
    r = MPI_Finalize();
    if (r != MPI_SUCCESS) {
      printf("MPI_Finalize failed: r = %d\n",r);
      return 1;
    }
  }
  // Everything worked
  return 0;
}

#else

// Minimal variant without error checking
int main(int argc, char** argv) {
  int pid = fork();
  if (pid) {
    // Parent
    int status;
    wait(&status);
    printf("child status = %d\n",status);
  } else {
    // Child
    MPI_Init(&argc,&argv);
    printf("We're an MPI program!\n");
    MPI_Finalize();
  }
  return 0;
}

#endif
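A note on reading "child status = 5": the program prints the raw value filled in by wait(), which is not the child's exit code. POSIX only defines that value through the WIFEXITED / WIFSIGNALED family of macros. On the usual Linux and macOS encodings a raw 5 would decode as "terminated by signal 5", which is SIGTRAP on those platforms, but that reading is an inference and was not confirmed in the thread. A minimal sketch of decoding the status, with made-up helper names (describe_status, run_child):

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn a raw wait() status into readable text
 * using the POSIX decoding macros. */
void describe_status(int status, char *buf, size_t n)
{
    if (WIFEXITED(status))
        snprintf(buf, n, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, n, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, n, "other status 0x%x", (unsigned)status);
}

/* Hypothetical helper: fork a child that raises `sig` if it is nonzero
 * and otherwise exits with `code`. Returns the raw wait status, or -1
 * if fork or waitpid failed. */
int run_child(int code, int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig)
            raise(sig);   /* normally does not return for fatal signals */
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

In fork-bug.c the parent could pass its status to describe_status instead of printing the integer directly, which would distinguish a nonzero exit code from death by signal.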
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
If someone wants to submit a patch and/or make 1.6.4 binaries, we could move forward with that. Please do so on the devel list, however -- not the users list (we've been a bit sloppy about separating users/devel recently; let's try to be better).

On Feb 21, 2013, at 5:08 PM, Damien Hocking wrote:
> Found it. The MPI::Datatype class isn't exported in a Win dll (no dllexport
> wrappers on the class), so on a shared-libs build it's not in the library
> symbols for anything else to see. The Windows CMAKE "BUILD_SHARED_LIBS"
> option is therefore busted. On a static lib build everything's in there, a
> dumpbin shows all the MPI::Datatype symbols. Those symbols are missing all
> the way back into 1.5 shared-lib builds as well.
>
> Damien

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
Found it. The MPI::Datatype class isn't exported in the Windows DLL (there are no dllexport wrappers on the class), so in a shared-libs build it's not in the library symbols for anything else to see. The Windows CMake "BUILD_SHARED_LIBS" option is therefore busted. In a static-lib build everything's in there; dumpbin shows all the MPI::Datatype symbols. Those symbols are missing all the way back into 1.5 shared-lib builds as well.

Damien

On 21/02/2013 12:19 PM, Jeff Squyres (jsquyres) wrote:
> It's implemented in Datatype.cc. I don't remember offhand why we didn't
> put it in the inline versions. But it's definitely in the generated
> libmpi_cxx.so:
>
> % nm -C libmpi_cxx.so | grep MPI::Datatype::Free
> 00016ed8 T MPI::Datatype::Free()
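For readers unfamiliar with the "dllexport wrappers" mentioned above, the usual pattern is a macro that expands to __declspec(dllexport) while the library is being built and __declspec(dllimport) when it is consumed. The sketch below shows the idea for a plain C function; for a C++ class the same macro is conventionally written between the class keyword and the class name. All names here (SKETCH_API, SKETCH_BUILDING_DLL, sketch_datatype_free) are invented for illustration and are not Open MPI's actual macros or functions.

```c
#include <stddef.h>

/* Normally defined by the build system only while compiling the library
 * itself; defined here so this single file builds on its own. */
#define SKETCH_BUILDING_DLL 1

/* Hypothetical export macro. */
#if defined(_WIN32)
#  if defined(SKETCH_BUILDING_DLL)
#    define SKETCH_API __declspec(dllexport)   /* put symbol in the DLL's export table */
#  else
#    define SKETCH_API __declspec(dllimport)   /* consumer side: import from the DLL */
#  endif
#elif defined(__GNUC__)
#  define SKETCH_API __attribute__((visibility("default")))
#else
#  define SKETCH_API
#endif

/* Declaration as it would appear in the public header. Without SKETCH_API,
 * a Windows shared-library build would not export this symbol, and code
 * linking against the import library would see an unresolved external. */
SKETCH_API int sketch_datatype_free(int *handle);

/* Toy definition: invalidate the handle. Returns 0 on success, -1 on NULL. */
int sketch_datatype_free(int *handle)
{
    if (handle == NULL)
        return -1;
    *handle = -1;
    return 0;
}
```

This also suggests why a static-library build is unaffected: with no DLL boundary there is no export table, so every symbol in the archive is visible to the linker.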
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
On Feb 21, 2013, at 10:59 AM, Damien Hocking wrote:

> Well this is interesting. The linker can't find that because
> MPI::Datatype::Free isn't implemented on the Windows build (in
> datatype_inln.h). It's declared in datatype.h though. It's not there in the
> Linux version either, so I don't know where the Linux build is getting that
> symbol from, that link should fail too. Is the C++ version of OpenMPI
> actually broken overall?

It's implemented in Datatype.cc. I don't remember offhand why we didn't put it in the inline versions. But it's definitely in the generated libmpi_cxx.so:

-----
% nm -C libmpi_cxx.so | grep MPI::Datatype::Free
00016ed8 T MPI::Datatype::Free()
%
-----

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
More or less. There's just not enough critical mass to keep it going.

Damien

-----Original Message-----
From: "Hartman, Todd W."
To: 'Open MPI Users'
Sent: Thu, 21 Feb 2013 10:13 AM
Subject: Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
On Feb 21, 2013, at 9:13 AM, "Hartman, Todd W." wrote:

> Gee, that's too bad. I assumed that the 1.6.4 Windows build was delayed
> because it was a lower priority. Do you suppose this position was taken
> because there are no developers wishing to keep it alive?

Afraid that is true. However, the cygwin folks have provided a cygwin package, so there is at least an option that works on Windows.
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
Gee, that's too bad. I assumed that the 1.6.4 Windows build was delayed because it was a lower priority. Do you suppose this position was taken because there are no developers wishing to keep it alive?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Damien Hocking
Sent: Thursday, February 21, 2013 10:00 AM
To: Open MPI Users
Subject: Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free

Well this is interesting. The linker can't find that because MPI::Datatype::Free isn't implemented on the Windows build (in datatype_inln.h). It's declared in datatype.h though. It's not there in the Linux version either, so I don't know where the Linux build is getting that symbol from; that link should fail too. Is the C++ version of OpenMPI actually broken overall?

The Windows support is another issue. I think it's semi-officially deprecated.

Damien

On 20/02/2013 11:20 PM, Hartman, Todd W. wrote:
> I'm trying to build a simple Open MPI application for Windows. I've
> installed the binaries for OpenMPI-v1.6.2 (64-bit). I've also installed
> Visual Studio 2010. The machine(s) are Windows 7 x64.
>
> When I attempt to compile a simple program that uses MPI::Send(), I get a
> linker error saying that it cannot resolve MPI::Datatype::Free().
>
> Here's a minimal example:
>
> ---
> #include <mpi.h>
> #include <iostream>
> int main( int argc, char** argv ) {
>     MPI::Init(argc,argv);
>
>     // Meant to run with 2 processes.
>     if (MPI::COMM_WORLD.Get_rank() == 0) {
>         int data;
>         MPI::COMM_WORLD.Recv(&data,1,MPI_INT,1,0);
>         std::cout << "received " << data << std::endl;
>     } else {
>         int data = 0xdead;
>         std::cout << "sending " << data << std::endl;
>         MPI::COMM_WORLD.Send(&data,1,MPI_INT,0,0);
>     }
>
>     MPI::Finalize();
> }
> ---
>
> When I compile it:
>
> mpic++ send_compile.cpp -o send_compile.exe -DOMPI_IMPORTS -DOPAL_IMPORTS -DORTE_IMPORTS
>
> ---
> Microsoft (R) C/C++ Optimizing Compiler Version 16.00.40219.01 for x64
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
> send_compile.cpp
> Microsoft (R) Incremental Linker Version 10.00.40219.01
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> /out:send_compile.exe
> /out:send_compile.exe
> "/LIBPATH:C:\Program Files (x86)\OpenMPI_v1.6.2-x64/lib"
> libmpi_cxx.lib
> libmpi.lib
> libopen-pal.lib
> libopen-rte.lib
> advapi32.lib
> Ws2_32.lib
> shlwapi.lib
> send_compile.obj
> send_compile.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl MPI::Datatype::Free(void)" (?Free@Datatype@MPI@@UEAAXXZ)
> send_compile.exe : fatal error LNK1120: 1 unresolved externals
> ---
>
> This program compiles and runs without complaint on an Ubuntu machine
> around here. I don't know what the problem is. Open MPI's documentation
> didn't say anything about adding the CPP defines (OMPI_IMPORTS,
> OPAL_IMPORTS, ORTE_IMPORTS) whose absence was causing other linker errors
> similar to this. Google found some items in the mailing list archive. I
> cannot find any information about this particular problem, though.
>
> I tried using dumpbin to get symbols that were in the .lib files installed
> by MPI, but didn't find any reference to that function name. I didn't find
> any answers looking in the MPI headers, either.
>
> I have a similar program in C that compiles and runs fine on this Windows
> machine. I don't know what I'm doing wrong with C++. Can someone point me
> in the right direction? Is there some documentation regarding getting
> things to work on Windows? The release notes don't address this problem,
> and I can't find any other documentation related to what might be
> different from *nix to Windows (WRT Open MPI).
>
> Thanks.
>
> todd.
>
> P.S. This is copied from a StackOverflow question I posted
> (http://stackoverflow.com/questions/14988099/open-mpi-c-link-error-mpidatatypefree-on-windows).
> Forgive the cross-posting.
Re: [OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
Well this is interesting. The linker can't find that because MPI::Datatype::Free isn't implemented on the Windows build (in datatype_inln.h). It's declared in datatype.h though. It's not there in the Linux version either, so I don't know where the Linux build is getting that symbol from, that link should fail too. Is the C++ version of OpenMPI actually broken overall?

The Windows support is another issue. I think it's semi-officially deprecated.

Damien

On 20/02/2013 11:20 PM, Hartman, Todd W. wrote:
> I'm trying to build a simple Open MPI application for Windows. I've installed the binaries for OpenMPI-v1.6.2 (64-bit). I've also installed Visual Studio 2010. The machine(s) are Windows 7 x64.
>
> When I attempt to compile a simple program that uses MPI::Send(), I get a linker error saying that it cannot resolve MPI::Datatype::Free().
>
> Here's a minimal example:
>
> ---
> #include <mpi.h>
> #include <iostream>
>
> int main( int argc, char** argv ) {
>     MPI::Init(argc,argv);
>
>     // Meant to run with 2 processes.
>     if (MPI::COMM_WORLD.Get_rank() == 0) {
>         int data;
>         MPI::COMM_WORLD.Recv(&data,1,MPI_INT,1,0);
>         std::cout << "received " << data << std::endl;
>     } else {
>         int data = 0xdead;
>         std::cout << "sending " << data << std::endl;
>         MPI::COMM_WORLD.Send(&data,1,MPI_INT,0,0);
>     }
>
>     MPI::Finalize();
> }
> ---
>
> When I compile it:
>
> mpic++ send_compile.cpp -o send_compile.exe -DOMPI_IMPORTS -DOPAL_IMPORTS -DORTE_IMPORTS
>
> ---
> Microsoft (R) C/C++ Optimizing Compiler Version 16.00.40219.01 for x64
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
> send_compile.cpp
> Microsoft (R) Incremental Linker Version 10.00.40219.01
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> /out:send_compile.exe
> /out:send_compile.exe
> "/LIBPATH:C:\Program Files (x86)\OpenMPI_v1.6.2-x64/lib"
> libmpi_cxx.lib
> libmpi.lib
> libopen-pal.lib
> libopen-rte.lib
> advapi32.lib
> Ws2_32.lib
> shlwapi.lib
> send_compile.obj
> send_compile.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl MPI::Datatype::Free(void)" (?Free@Datatype@MPI@@UEAAXXZ)
> send_compile.exe : fatal error LNK1120: 1 unresolved externals
> ---
>
> This program compiles and runs without complaint on an Ubuntu machine around here. I don't know what the problem is. Open MPI's documentation didn't say anything about adding the CPP defines (OMPI_IMPORTS, OPAL_IMPORTS, ORTE_IMPORTS) whose absence was causing other linker errors similar to this. Google found some items in the mailing list archive. I cannot find any information about this particular problem, though.
>
> I tried using dumpbin to get symbols that were in the .lib files installed by MPI, but didn't find any reference to that function name. I didn't find any answers looking in the MPI headers, either.
>
> I have a similar program in C that compiles and runs fine on this Windows machine. I don't know what I'm doing wrong with C++. Can someone point me in the right direction? Is there some documentation regarding getting things to work on Windows? The release notes don't address this problem, and I can't find any other documentation related to what might be different from *nix to Windows (WRT Open MPI).
>
> Thanks.
>
> todd.
>
> P.S. This is copied from a StackOverflow question I posted (http://stackoverflow.com/questions/14988099/open-mpi-c-link-error-mpidatatypefree-on-windows). Forgive the cross-posting.
Re: [OMPI users] MPI send recv confusion
2013/2/21 Gus Correa
> two types are the same size,
> but I wonder if somehow the two type names are interchangeable
> in OpenMPI (I would guess they're not),
> although declared

Hello,

No, I didn't have to change that. They both work fine for me.

Pradeep
Re: [OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel
On Thu, Feb 21, 2013 at 12:23:14PM +0100, Paul Kapinos wrote:
> The MTT-Parameter mess is well-known and the best solution is to set the MTT
> parameter high. Otherwise you never know what you will get - your
> application may hang, block the IB interface, run a bit slower, run very slow...
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem

As I wrote: I'm aware of these FAQ entries, but you can't set the log_num_mtt parameter if you're using a Debian/vanilla kernel: the mlx4_core module does not offer this parameter.

MfG/Sincerely,
Stefan Friedel
--
IWR * 523 * INF 368 * 69120 Heidelberg
T +49 6221 548240 * F +49 6221 545224
stefan.frie...@iwr.uni-heidelberg.de
Re: [OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel
The MTT-Parameter mess is well-known and the best solution is to set the MTT parameter high. Otherwise you never know what you will get - your application may hang, block the IB interface, run a bit slower, run very slow...

http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem
http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
http://montecarlo.vtt.fi/mtg/2012_Madrid/Hans_Hammer2.pdf

On 02/21/13 11:53, Stefan Friedel wrote:
> Is there a way to tell openmpi-1.6.3 to use the ofed-module from vanilla kernel
> and not to rely on log_num_mtt for "do-we-have-enough-registered-mem"
> computation for Mellanox HCAs?
>
> Any other idea/hint?

--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
[OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel
Good morning,

I'm struggling with the setup of openmpi-1.6.3 on top of Debian wheezy/testing and mellanox/ofed/mlx4 memory pinning: a cluster equipped with Mellanox HCAs MT26428, Debian 3.2.35-2 x86_64, 4x8core AMD Opteron 6212, 128G Memory.

I'm aware of the FAQ entries about mlx4_core module parameters (log_num_mtt etc.), but the module in Debian kernels (and likewise in kernels from kernel.org up to the recent 3.8) does not know anything about log_num_mtt. This parameter is only available in the OFED rpms for SLES/RHEL/OEL.

Jobs started with the default environment do fail. log_mtts_per_seg is a valid parameter in mlx4_core/Debian kernel and is set to 3; log_num_mtt is not a valid parameter of mlx4_core and is set to 20 in btl_openib.c. The warning "...Your MPI job will continue, but may be behave poorly and/or hang..." is printed, and a simple benchmark runs for hours instead of returning a result after a few minutes. On the same hardware with Debian Squeeze and openmpi-1.4.5, this job runs flawlessly.

Is there a way to tell openmpi-1.6.3 to use the ofed-module from vanilla kernel and not to rely on log_num_mtt for "do-we-have-enough-registered-mem" computation for Mellanox HCAs?

Any other idea/hint?

MfG/Sincerely,
Stefan Friedel
--
IWR * 523 * INF 368 * 69120 Heidelberg
T +49 6221 548240 * F +49 6221 545224
stefan.frie...@iwr.uni-heidelberg.de
[OMPI users] Windows C++ Linker Error "unresolved symbol" for MPI::Datatype::Free
I'm trying to build a simple Open MPI application for Windows. I've installed the binaries for OpenMPI-v1.6.2 (64-bit). I've also installed Visual Studio 2010. The machine(s) are Windows 7 x64.

When I attempt to compile a simple program that uses MPI::Send(), I get a linker error saying that it cannot resolve MPI::Datatype::Free().

Here's a minimal example:

---
#include <mpi.h>
#include <iostream>

int main( int argc, char** argv ) {
    MPI::Init(argc,argv);

    // Meant to run with 2 processes.
    if (MPI::COMM_WORLD.Get_rank() == 0) {
        int data;
        MPI::COMM_WORLD.Recv(&data,1,MPI_INT,1,0);
        std::cout << "received " << data << std::endl;
    } else {
        int data = 0xdead;
        std::cout << "sending " << data << std::endl;
        MPI::COMM_WORLD.Send(&data,1,MPI_INT,0,0);
    }

    MPI::Finalize();
}
---

When I compile it:

mpic++ send_compile.cpp -o send_compile.exe -DOMPI_IMPORTS -DOPAL_IMPORTS -DORTE_IMPORTS

---
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.40219.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
send_compile.cpp
Microsoft (R) Incremental Linker Version 10.00.40219.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:send_compile.exe
/out:send_compile.exe
"/LIBPATH:C:\Program Files (x86)\OpenMPI_v1.6.2-x64/lib"
libmpi_cxx.lib
libmpi.lib
libopen-pal.lib
libopen-rte.lib
advapi32.lib
Ws2_32.lib
shlwapi.lib
send_compile.obj
send_compile.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl MPI::Datatype::Free(void)" (?Free@Datatype@MPI@@UEAAXXZ)
send_compile.exe : fatal error LNK1120: 1 unresolved externals
---

This program compiles and runs without complaint on an Ubuntu machine around here. I don't know what the problem is. Open MPI's documentation didn't say anything about adding the CPP defines (OMPI_IMPORTS, OPAL_IMPORTS, ORTE_IMPORTS) whose absence was causing other linker errors similar to this. Google found some items in the mailing list archive. I cannot find any information about this particular problem, though.

I tried using dumpbin to get symbols that were in the .lib files installed by MPI, but didn't find any reference to that function name. I didn't find any answers looking in the MPI headers, either.

I have a similar program in C that compiles and runs fine on this Windows machine. I don't know what I'm doing wrong with C++. Can someone point me in the right direction? Is there some documentation regarding getting things to work on Windows? The release notes don't address this problem, and I can't find any other documentation related to what might be different from *nix to Windows (WRT Open MPI).

Thanks.

todd.

P.S. This is copied from a StackOverflow question I posted (http://stackoverflow.com/questions/14988099/open-mpi-c-link-error-mpidatatypefree-on-windows). Forgive the cross-posting.