Re: [OMPI users] Open MPI 1.2.4 verbosity w.r.t. osc pt2pt
On 12/12/07, Jeff Squyres wrote:
> On Dec 12, 2007, at 6:32 PM, Lisandro Dalcin wrote:
> > Do I have the libtool API calls available when linking against
> > libmpi.so ?
>
> You should, yes.

OK, but now I realize that I cannot simply call libtool dlopen() unconditionally, as libmpi.so may not exist in a static lib build.

> Also, see my later post: doesn't perl/python have some kind of
> portable dlopen anyway? They're opening your module...?

AFAIK, Python does not. It uses specific, private code for this, handling the loading of extension modules according to each OS and its idiosyncrasies. However, Python enables users to change the flags used for dlopen'ing extension modules; the tricky part is getting the correct value of RTLD_GLOBAL in a portable way.

> > Is there any other way of setting an MCA parameter?
>
> See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.

OK, it seems there isn't a programmatic way. Anyway, putenv() should not be a source of portability problems. Jeff, once you have this parameter enabled, please write to me so that I can do some testing. Many thanks for your clarifications and help.

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
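The flag-changing trick Lisandro mentions can be sketched as follows on a modern (3.3+) Unix Python; at the time of this thread the RTLD_* constants lived in the dl/DLFCN modules rather than in os, which was exactly the portability headache being discussed. The import named in the comment is a placeholder:

```python
import sys
import os

# Save the interpreter's current dlopen() flags, then add RTLD_GLOBAL so
# that symbols from an extension module (and the shared libraries it pulls
# in, e.g. libmpi.so) become visible to libraries dlopen()'ed later, such
# as Open MPI's MCA components.
saved = sys.getdlopenflags()
try:
    sys.setdlopenflags(saved | os.RTLD_GLOBAL)
    # Import the MPI extension module here, e.g.:
    # import mpi4py.MPI
finally:
    # Restore the original flags so unrelated imports are not affected.
    sys.setdlopenflags(saved)
```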
Re: [OMPI users] parpack with openmpi
Yes, the software came with its own. And I removed it; mpif77 takes care of not having mpif.h in the directory, just as it should. I should mention (sorry) that the single, complex and double complex examples work. Only the double (real) examples fail. Brock Palen Center for Advanced Computing bro...@umich.edu (734)936-1985

On Dec 12, 2007, at 6:51 PM, Jeff Squyres wrote: This *usually* happens when you include the mpif.h from a different MPI implementation. Can you check that?

On Dec 12, 2007, at 5:15 PM, Brock Palen wrote: Has anyone ever built parpack (http://www.caam.rice.edu/software/ARPACK/) with openmpi? It compiles but some of the examples give: [nyx-login1.engin.umich.edu:12173] *** on communicator MPI_COMM_WORLD [nyx-login1.engin.umich.edu:12173] *** MPI_ERR_TYPE: invalid datatype [nyx-login1.engin.umich.edu:12173] *** MPI_ERRORS_ARE_FATAL (goodbye) [nyx-login1.engin.umich.edu:12174] *** An error occurred in MPI_Recv [nyx-login1.engin.umich.edu:12174] *** on communicator MPI_COMM_WORLD I checked all the data types are: MPI_DOUBLE_PRECISION. I'm not sure where to look next. Brock Palen Center for Advanced Computing bro...@umich.edu (734)936-1985 -- Jeff Squyres Cisco Systems
Re: [OMPI users] parpack with openmpi
This *usually* happens when you include the mpif.h from a different MPI implementation. Can you check that?

On Dec 12, 2007, at 5:15 PM, Brock Palen wrote: Has anyone ever built parpack (http://www.caam.rice.edu/software/ARPACK/) with openmpi? It compiles but some of the examples give: [nyx-login1.engin.umich.edu:12173] *** on communicator MPI_COMM_WORLD [nyx-login1.engin.umich.edu:12173] *** MPI_ERR_TYPE: invalid datatype [nyx-login1.engin.umich.edu:12173] *** MPI_ERRORS_ARE_FATAL (goodbye) [nyx-login1.engin.umich.edu:12174] *** An error occurred in MPI_Recv [nyx-login1.engin.umich.edu:12174] *** on communicator MPI_COMM_WORLD I checked all the data types are: MPI_DOUBLE_PRECISION. I'm not sure where to look next. Brock Palen Center for Advanced Computing bro...@umich.edu (734)936-1985 -- Jeff Squyres Cisco Systems
Re: [OMPI users] Open MPI 1.2.4 verbosity w.r.t. osc pt2pt
On Dec 12, 2007, at 6:32 PM, Lisandro Dalcin wrote: Yes, this is problematic; dlopen is fun on all the various OS's... FWIW: we use the Libtool DL library for this kind of portability; OMPI itself doesn't have all the logic for the different OS loaders. Do I have the libtool API calls available when linking against libmpi.so ? You should, yes. Also, see my later post: doesn't perl/python have some kind of portable dlopen anyway? They're opening your module...? This should hypothetically allow you to do a simple putenv() before calling MPI_INIT and then the Right magic should occur. Is there any other way of setting an MCA parameter? Or is playing with the environment the only available way? See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params. -- Jeff Squyres Cisco Systems
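The putenv()-before-MPI_INIT route works because Open MPI reads any environment variable named OMPI_MCA_&lt;param&gt; as an MCA parameter (this is the mechanism the FAQ entry linked above describes). A minimal sketch; note that the parameter name here mirrors the hypothetical "do_dlopen_hackery" parameter floated in this thread and is not a real Open MPI parameter:

```python
import os

# Open MPI picks up MCA parameters from environment variables named
# OMPI_MCA_<param>.  Setting one before MPI_Init() runs is equivalent to
# passing it on the command line with "mpirun --mca <param> <value>".
# NOTE: "do_dlopen_hackery" is the *hypothetical* parameter discussed in
# this thread; it is not a real Open MPI parameter.
os.environ["OMPI_MCA_do_dlopen_hackery"] = "1"

# ... then initialize MPI, e.g. by importing the MPI extension module,
# which calls MPI_Init() and reads the variable set above.
```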
Re: [OMPI users] Open MPI 1.2.4 verbosity w.r.t. osc pt2pt
On 12/12/07, Jeff Squyres wrote: > Yes, this is problematic; dlopen is fun on all the various OS's... > > FWIW: we use the Libtool DL library for this kind of portability; OMPI > itself doesn't have all the logic for the different OS loaders. Do I have the libtool API calls available when linking against libmpi.so ? > > Anyway, perhaps OpenMPI could provide an extension: a function call, > (after much thinking...) Perhaps a better solution would be an MCA > parameter: if the logical "mca_do_dlopen_hackery" Fine with this. > This should hypothetically allow you to do a simple putenv() before > calling MPI_INIT and then the Right magic should occur. Is there any other way of setting an MCA parameter? Or is playing with the environment the only available way? -- Lisandro Dalcín --- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594
[OMPI users] parpack with openmpi
Has anyone ever built parpack (http://www.caam.rice.edu/software/ARPACK/) with openmpi? It compiles but some of the examples give: [nyx-login1.engin.umich.edu:12173] *** on communicator MPI_COMM_WORLD [nyx-login1.engin.umich.edu:12173] *** MPI_ERR_TYPE: invalid datatype [nyx-login1.engin.umich.edu:12173] *** MPI_ERRORS_ARE_FATAL (goodbye) [nyx-login1.engin.umich.edu:12174] *** An error occurred in MPI_Recv [nyx-login1.engin.umich.edu:12174] *** on communicator MPI_COMM_WORLD I checked all the data types are: MPI_DOUBLE_PRECISION. I'm not sure where to look next. Brock Palen Center for Advanced Computing bro...@umich.edu (734)936-1985
Re: [OMPI users] Dual ethernet & OpenMPI
You can specify which network interfaces should be used by Open MPI via the btl_tcp_if_include MCA parameter (using the interface names, e.g. eth0,eth1). You can even specify how the messages will be distributed between the networks (please read the FAQ for more info about this). To test that you doubled your bandwidth, use any point-to-point benchmark such as NetPIPE. Thanks, george.

On Dec 12, 2007, at 1:42 PM, Michael wrote: In the past I configured a Linux cluster by bonding two ethernet ports together on each node (with the master having a third port for outside communication); however, recent discussions seem to say that if I have two ethernet cards OpenMPI can handle all the setup itself. My question is what address ranges should I use, that is, should both ports be on the same network range, e.g. 10.0.0.x/255.255.255.0, or should they be on separate network ranges, e.g. 10.0.0.x/255.255.255.0 and 10.0.1.x/255.255.255.0? Would I need a third ethernet card for outside communication, or could one port on the master node handle both internal and external communications? Would there be any special flags to set this up, or would OpenMPI detect the two paths -- obviously each port would have a different IP address if I'm not using bonding, so do you just double the host list? How would I test if I have doubled my bandwidth? Michael
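George's suggestion can be expressed either on the mpirun command line or, equivalently, through Open MPI's OMPI_MCA_ environment variables set before MPI initialization. A minimal sketch; the interface names are examples and should be replaced with your own:

```python
import os

# Restrict Open MPI's TCP BTL to the two cluster-internal interfaces.
# "eth0,eth1" are example names; substitute the interfaces on your nodes.
os.environ["OMPI_MCA_btl_tcp_if_include"] = "eth0,eth1"

# The equivalent mpirun invocation would be:
#   mpirun --mca btl_tcp_if_include eth0,eth1 -np 4 ./a.out
```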
Re: [OMPI users] Problems with GATHERV on one process
Thanks Tim. I've since noticed similar problems with MPI_Allgatherv and MPI_Scatterv. I'm guessing they are all related. Do you happen to know if those are being fixed as well? -Ken

> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Tim Mattox
> Sent: Tuesday, December 11, 2007 3:34 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Problems with GATHERV on one process
>
> Hello Ken,
> This is a known bug, which is fixed in the upcoming 1.2.5 release. We expect 1.2.5 to come out very soon. We should have a new release candidate for 1.2.5 posted by tomorrow.
>
> See these tickets about the bug if you care to look:
> https://svn.open-mpi.org/trac/ompi/ticket/1166
> https://svn.open-mpi.org/trac/ompi/ticket/1157
>
> On Dec 11, 2007 2:48 PM, Moreland, Kenneth wrote:
> > I recently ran into a problem with GATHERV while running some randomized tests on my MPI code. The problem seems to occur when running MPI_Gatherv with a displacement on a communicator with a single process. The code listed below exercises this errant behavior. I have tried it on OpenMPI 1.1.2 and 1.2.4.
> >
> > Granted, this is not a situation that one would normally run into in a real application, but I just wanted to check to make sure I was not doing anything wrong.
> >
> > -Ken
> >
> > #include <mpi.h>
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > int main(int argc, char **argv)
> > {
> >   int rank;
> >   MPI_Comm smallComm;
> >   int senddata[4], recvdata[4], length, offset;
> >
> >   MPI_Init(&argc, &argv);
> >
> >   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >
> >   // Split up into communicators of size 1.
> >   MPI_Comm_split(MPI_COMM_WORLD, rank, 0, &smallComm);
> >
> >   // Now try to do a gatherv.
> >   senddata[0] = 5; senddata[1] = 6; senddata[2] = 7; senddata[3] = 8;
> >   recvdata[0] = 0; recvdata[1] = 0; recvdata[2] = 0; recvdata[3] = 0;
> >   length = 3;
> >   offset = 1;
> >   MPI_Gatherv(senddata, length, MPI_INT,
> >               recvdata, &length, &offset, MPI_INT, 0, smallComm);
> >   if (senddata[0] != recvdata[offset])
> >     {
> >     printf("%d: %d != %d?\n", rank, senddata[0], recvdata[offset]);
> >     }
> >   else
> >     {
> >     printf("%d: Everything OK.\n", rank);
> >     }
> >
> >   return 0;
> > }
> >
> > Kenneth Moreland
> > *** Sandia National Laboratories
> > ***
> > *** *** *** email: kmo...@sandia.gov
> > ** *** ** phone: (505) 844-8919
> > *** fax: (505) 845-0833
>
> --
> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
> tmat...@gmail.com || timat...@open-mpi.org
> I'm a bright... http://www.the-brights.net/
[OMPI users] undefined reference to `pthread_atfork' (Lahey Fujitsu compiler AMD64)
Hi, I'm on an AMD64 box (Linux quartic.txcorp.com 2.6.19-1.2288.fc5 #1 SMP Sat Feb 10 14:59:35 EST 2007 x86_64 x86_64 x86_64 GNU/Linux) and compiled openmpi-1.2.4 using the Lahey-Fujitsu compiler (lfc). The compilation of openmpi went fine.

$ ../configure --enable-mpi-f90 --enable-mpi-f77 --enable-mpi-cxx --prefix=/home/research/pletzer/local/x86_64/openmpi-1.2.4/ FC=lfc F77=lfc FCFLAGS=-O2 FFLAGS=-O2 --disable-shared --enable-static

However, when compiling a test code with mpif90, I get the following error:

[pletzer@quartic test]$ cat t.f90
program test
  implicit none
  include 'mpif.h'
  integer :: rank, size, ier
  call mpi_init(ier)
  call mpi_comm_rank(MPI_COMM_WORLD, rank, ier)
  call mpi_comm_size(MPI_COMM_WORLD, size, ier)
  print *,'rank ', rank, ' size ', size
  call mpi_finalize(ier)
end program test

[pletzer@quartic test]$ mpif90 t.f90
Encountered 0 errors, 0 warnings in file t.f90.
/home/research/pletzer/local/x86_64/openmpi-1.2.4//lib/libopen-pal.a(lt1-malloc.o): In function `ptmalloc_init':
malloc.c:(.text+0x4b71): undefined reference to `pthread_atfork'
[pletzer@quartic test]$

I know this symbol is defined in:

[pletzer@quartic test]$ nm /usr/lib64/libpthread.a | grep pthread_atfork
.. T pthread_atfork

However, linking with this library does not resolve the problem:

[pletzer@quartic test]$ mpif90 t.f90 /usr/lib64/libpthread.a
Encountered 0 errors, 0 warnings in file t.f90.
/home/research/pletzer/local/x86_64/openmpi-1.2.4//lib/libopen-pal.a(lt1-malloc.o): In function `ptmalloc_init':
malloc.c:(.text+0x4b71): undefined reference to `pthread_atfork'

Thanks for your help.

--Alex

--
Alexander Pletzer
Tech-X
(p) 303 - 996 2031 (c) 609 235 6022 (f) 303 448 7756
[OMPI users] Dual ethernet & OpenMPI
In the past I configured a Linux cluster by bonding two ethernet ports together on each node (with the master having a third port for outside communication); however, recent discussions seem to say that if I have two ethernet cards OpenMPI can handle all the setup itself. My question is what address ranges should I use, that is, should both ports be on the same network range, e.g. 10.0.0.x/255.255.255.0, or should they be on separate network ranges, e.g. 10.0.0.x/255.255.255.0 and 10.0.1.x/255.255.255.0? Would I need a third ethernet card for outside communication, or could one port on the master node handle both internal and external communications? Would there be any special flags to set this up, or would OpenMPI detect the two paths -- obviously each port would have a different IP address if I'm not using bonding, so do you just double the host list? How would I test if I have doubled my bandwidth? Michael
Re: [OMPI users] Compiling 1.2.4 using Intel Compiler 10.1.007 on Leopard
Hi Jeff, It seems that the problems are partially the compilers' fault; maybe the updated compilers didn't catch all the problems filed against the last release? Why else would I need to add the "-no-multibyte-chars" flag for pretty much everything that I build with ICC? Also, it's odd that I have to use /lib/cpp when using Intel ICC/ICPC, whereas with GCC things just find their way correctly. Again, IFORT and GCC together seem fine. Lastly... not that I use these... but MPICH-2.1 and MPICH-1.2.7 for Myrinet built just fine. Here are the output files (attachments): config.log.tgz, configure.output.tgz, error.log.tgz

Warner Yuen Scientific Computing Consultant Apple Computer email: wy...@apple.com Tel: 408.718.2859 Fax: 408.715.0133

On Dec 12, 2007, at 9:00 AM, users-requ...@open-mpi.org wrote: -- Message: 1 Date: Wed, 12 Dec 2007 06:50:03 -0500 From: Jeff Squyres Subject: Re: [OMPI users] Problems compiling 1.2.4 using Intel Compiler 10.1.006 on Leopard To: Open MPI Users Message-ID: <43bb0bce-e328-4d3e-ae61-84991b27f...@cisco.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes My primary work platform is a MacBook Pro, but I don't specifically develop for OS X, so I don't have any special compilers. Sorry to ask this because I think the information was sent before, but could you send all the compile/failure information? http://www.open-mpi.org/community/help/
Re: [OMPI users] Open MPI 1.2.4 verbosity w.r.t. osc pt2pt
On Dec 11, 2007, at 9:08 AM, Lisandro Dalcin wrote: (for a nicely-formatted refresher of the issues, check out https://svn.open-mpi.org/trac/ompi/wiki/Linkers) Sorry for the late response... I've finally 'solved' this issue by using RTLD_GLOBAL for loading the Python extension module that actually calls MPI_Init(). However, I'm not completely sure if my hackery is completely portable. Looking briefly at the end of the wiki page linked above, you say that if the explicit linking of components against libmpi is removed, then dlopen() has to be called explicitly. Correct. Well, this would be a major headache for me, because of portability issues. Please note that I've developed mpi4py on a rather old 32-bit linux box, but it works on many different platforms and OS's. I really do not have the time to test and figure out how to appropriately call dlopen() on platforms/OS's that I do not even have access to!! Yes, this is problematic; dlopen is fun on all the various OS's... FWIW: we use the Libtool DL library for this kind of portability; OMPI itself doesn't have all the logic for the different OS loaders. Anyway, perhaps OpenMPI could provide an extension: a function call, say 'ompi_load_dso()' or something like that, that can be called before MPI_Init() for setting up the monster. What do you think about this? Would it be hard for you? (after much thinking...) Perhaps a better solution would be an MCA parameter: if the logical "mca_do_dlopen_hackery" (or whatever) MCA parameter is found to be true during the very beginning of MPI_INIT (down in the depths of opal_init(), actually), then we will lt_dlopen[_advise]("/libmpi"). For completeness, we'll do the corresponding dlclose in opal_finalize(). I need to think about this a bit more and run it by Brian Barrett... he's quite good at finding holes in these kinds of complex scenarios. :-) This should hypothetically allow you to do a simple putenv() before calling MPI_INIT and then the Right magic should occur.
-- Jeff Squyres Cisco Systems
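A present-day pure-Python sketch of the workaround discussed above: force-load the MPI library with global symbol visibility via ctypes before the extension module is imported, so that later-dlopen()'ed MCA components can resolve its symbols. The library name is an assumption; it varies by platform and build, and (as noted in the thread) doesn't exist at all in a static build:

```python
import ctypes
import ctypes.util

def preload_libmpi(fallback="libmpi.so"):
    """Load libmpi with RTLD_GLOBAL so its symbols land in the global
    namespace before the MPI extension module is imported.

    Returns the loaded library handle, or None when the library cannot
    be found or loaded (e.g. a static-only Open MPI build).  The library
    name is a platform-dependent assumption, not a fixed Open MPI fact.
    """
    path = ctypes.util.find_library("mpi") or fallback
    try:
        return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None
```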
[OMPI users] MPI::Intracomm::Spawn and cluster configuration
Hello, I'm working on an MPI application where I'm using OpenMPI instead of MPICH. In my "master" program I call the function MPI::Intracomm::Spawn, which spawns "slave" processes. It is not clear to me how to spawn the "slave" processes over the network. Currently "master" creates "slaves" on the same host. If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned over the network as expected. But now I need to spawn processes over the network from my own executable using MPI::Intracomm::Spawn; how can I achieve it? Thanks in advance for any help. Elena
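One common approach, sketched here with mpi4py for brevity, is to attach placement hints to the spawn call through an MPI Info object; the same Info argument exists on the C++ binding MPI::Intracomm::Spawn. The "host" key is reserved by the MPI standard, but whether comma-separated host lists or hostfile-style keys are honored is implementation-specific, so check your Open MPI version's documentation. The executable name and host names below are placeholders:

```python
def spawn_slaves(nprocs=4):
    """Sketch: spawn worker processes with placement hints.

    Requires running under an MPI runtime (e.g. launched with mpirun);
    "./slave" and the host names are placeholders, not real values.
    """
    from mpi4py import MPI  # imported lazily; needs an MPI installation

    info = MPI.Info.Create()
    # "host" is a reserved MPI info key; list syntax is implementation-
    # specific, so consult the MPI_Comm_spawn man page for your Open MPI.
    info.Set("host", "node1,node2")

    # Spawn nprocs copies of ./slave; returns an intercommunicator
    # connecting the master to the spawned slaves.
    return MPI.COMM_SELF.Spawn("./slave", args=[], maxprocs=nprocs,
                               info=info, root=0)
```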
Re: [OMPI users] Problems compiling 1.2.4 using Intel Compiler 10.1.006 on Leopard
My primary work platform is a MacBook Pro, but I don't specifically develop for OS X, so I don't have any special compilers. Sorry to ask this because I think the information was sent before, but could you send all the compile/failure information? http://www.open-mpi.org/community/help/

On Dec 11, 2007, at 9:32 PM, Warner Yuen wrote: Has anyone gotten Open MPI 1.2.4 to compile with the latest Intel compilers 10.1.007 and Leopard? I can get Open MPI-1.2.4 to build with GCC + Fortran IFORT 10.1.007, but I can't get any configuration to work with Intel's 10.1.007 compilers. The configuration completes, but the compilation fails fairly early. My compiler settings are as follows:

For GCC + IFORT (this one works):
export CC=/usr/bin/cc
export CXX=/usr/bin/c++
export FC=/usr/bin/ifort
export F90=/usr/bin/ifort
export F77=/usr/bin/ifort

For using all Intel compilers (the configure works but the compilation fails):
export CC=/usr/bin/icc
export CXX=/usr/bin/icpc
export FC=/usr/bin/ifort
export F90=/usr/bin/ifort
export F77=/usr/bin/ifort
export FFLAGS=-no-multibyte-chars
export CFLAGS=-no-multibyte-chars
export CXXFLAGS=-no-multibyte-chars
export CCASFLAGS=-no-multibyte-chars

_defined,suppress -o libasm.la asm.lo atomic-asm.lo -lutil
libtool: link: ar cru .libs/libasm.a .libs/asm.o .libs/atomic-asm.o
ar: .libs/atomic-asm.o: No such file or directory
make[2]: *** [libasm.la] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Warner Yuen Scientific Computing Consultant Apple Computer email: wy...@apple.com Tel: 408.718.2859 Fax: 408.715.0133

On Nov 22, 2007, at 2:26 AM, users-requ...@open-mpi.org wrote: -- Message: 2 Date: Wed, 21 Nov 2007 15:15:00 -0500 From: Mark Dobossy Subject: Re: [OMPI users] Problems compiling 1.2.4 using Intel Compiler 10.1.006 on Leopard To: Open MPI Users Message-ID: <99ca0551-9bf4-47c0-85c2-6b2126a83...@princeton.edu> Content-Type: text/plain; charset=US-ASCII; format=flowed Thanks for the suggestion Jeff.
Unfortunately, that didn't fix the issue. -Mark On Nov 21, 2007, at 7:55 AM, Jeff Squyres wrote: Can you try also adding CCASFLAGS=-no-multibyte-chars? -- Jeff Squyres Cisco Systems
Re: [OMPI users] error with Vprotocol pessimist
I could reproduce and fix the bug. It will be corrected in trunk as soon as the svn is online again. Thanks for reporting the problem. Aurelien

On Dec 11, 2007, at 3:02 PM, Aurelien Bouteiller wrote: I cannot reproduce the error. Please make sure you have the lib/openmpi/mca_pml_v.so file in your build. If you don't, maybe you forgot to run autogen.sh at the root of the trunk when you removed .ompi_ignore. If this does not fix the problem, please let me know your command line options to mpirun. Aurelien

On Dec 11, 2007, at 2:36 PM, Aurelien Bouteiller wrote: Mmm, I'll investigate this today. Aurelien

On Dec 11, 2007, at 8:46 AM, Thomas Ropars wrote: Hi, I've tried to test the message logging component vprotocol pessimist (svn checkout revision 16926). When I run an mpi application, I get the following error: mca: base: component_find: unable to open vprotocol pessimist: /local/openmpi/lib/openmpi/mca_vprotocol_pessimist.so: undefined symbol: pml_v_output (ignored) Regards Thomas

--
Dr. Aurelien Bouteiller, Sr. Research Associate
Innovative Computing Laboratory - MPI group
+1 865 974 6321
1122 Volunteer Boulevard
Claxton Education Building Suite 350
Knoxville, TN 37996