Can your nodes see the Open MPI libraries and executables? I have /usr/local and /opt from the master node mounted on the compute nodes, in addition to having LD_LIBRARY_PATH defined correctly. In your case the nodes must be able to see /home/rchaud/openmpi-1.2.6 in order to get at the libraries and executables, so that directory must be mounted on the nodes. You don't want to copy all of this onto the nodes in a bproc environment, since it would eat away at your RAM.
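Roughly, the pieces that have to be in place look like the sketch below. The paths are just the ones taken from your messages, and the exact /etc/beowulf/config syntax can differ between Scyld releases, so take this as an outline to check against rather than something to paste in verbatim:

  # /etc/beowulf/config on the master: one "libraries" line per directory
  # of shared libraries the nodes will need (this is what Sean suggested)
  libraries /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
  libraries /home/rchaud/Amber10_openmpi/amber10/lib

  # The filesystem that actually holds Open MPI and Amber (everything under
  # /home/rchaud in your case) must also be mounted on the nodes, the same
  # way /usr/local and /opt are mounted from the master here.

  # Then check from the master that a node (node 2 here) can see the files
  # themselves, not just the environment variable:
  bpsh 2 ls /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
  bpsh 2 ls /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/bin/orted

Two things to keep in mind: "bpsh 2 echo $LD_LIBRARY_PATH" gets expanded by your shell on the master before bpsh ever runs, so it only shows the master's value and says nothing about what node 2 can actually open; and changes to /etc/beowulf/config usually don't take effect until the nodes pick them up (on our setup that means restarting the beowulf service or rebooting the nodes). The daemon that died on signal 13 is Open MPI's orted, and it needs both its own binary and the Open MPI libraries to be reachable from the node.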
Daniel

On Wed, Nov 05, 2008 at 12:44:03PM -0600, Rima Chaudhuri wrote:
> Thanks for all your help Ralph and Sean!!
> I changed the machinefile to just containing the node numbers. I added
> the env variable NODES in my .bash_profile and .bashrc.
> As per Sean's suggestion I added the $LD_LIBRARY_PATH (shared lib path
> which the openmpi lib directory path) and the $AMBERHOME/lib as 2 of
> the libraries' path in the config file of beowulf. I also checked by
> bpsh from one of the compute nodes whether it can see the executables
> which is in $AMBERHOME/exe and the mpirun(OMPI):
> I get the following error message:
>
> [rchaud@helios amber10]$ ./step1
> --------------------------------------------------------------------------
> A daemon (pid 25319) launched by the bproc PLS component on node 2 died
> unexpectedly on signal 13 so we are aborting.
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 717
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 1164
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> rmgr_urm.c at line 462
> [helios.structure.uic.edu:25317] mpirun: spawn failed with errno=-1
>
>
> I tested to see if the compute nodes could see the master by the
> following commands:
>
> [rchaud@helios amber10]$ bpsh 2 echo $LD_LIBRARY_PATH
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> [rchaud@helios amber10]$ bpsh 2 echo $AMBERHOME
> /home/rchaud/Amber10_openmpi/amber10
> [rchaud@helios amber10]$ bpsh 2 ls -al
> total 11064
> drwxr-xr-x 11 rchaud 0 4096 Nov 5 11:33 .
> drwxr-xr-x 3 rchaud 100 4096 Oct 20 17:21 ..
> -rw-r--r-- 1 128 53 1201 Jul 10 17:08 Changelog_at
> -rw-rw-r-- 1 128 53 25975 Feb 28 2008
> GNU_Lesser_Public_License
> -rw-rw---- 1 128 53 3232 Mar 30 2008 INSTALL
> -rw-rw-r-- 1 128 53 20072 Feb 11 2008 LICENSE_at
> -rw-r--r-- 1 0 0 1814241 Oct 31 13:32 PLP_617_xtal_nolig.crd
> -rw-r--r-- 1 0 0 8722770 Oct 31 13:31 PLP_617_xtal_nolig.top
> -rw-rw-r-- 1 128 53 1104 Mar 18 2008 README
> -rw-r--r-- 1 128 53 1783 Jun 23 19:43 README_at
> drwxrwxr-x 10 128 53 4096 Oct 20 17:23 benchmarks
> drwxr-xr-x 2 0 0 4096 Oct 20 18:21 bin
> -rw-r--r-- 1 0 0 642491 Oct 20 17:51 bugfix.all
> drwxr-xr-x 13 0 0 4096 Oct 20 17:37 dat
> drwxr-xr-x 3 0 0 4096 Oct 20 17:23 doc
> drwxrwxr-x 9 128 53 4096 Oct 20 17:23 examples
> lrwxrwxrwx 1 0 0 3 Oct 20 17:34 exe -> bin
> drwxr-xr-x 2 0 0 4096 Oct 20 17:35 include
> drwxr-xr-x 2 0 0 4096 Oct 20 17:36 lib
> -rw-r--r-- 1 rchaud 100 30 Nov 5 11:33 machinefile
> -rw-r--r-- 1 rchaud 100 161 Nov 5 12:11 min
> drwxrwxr-x 40 128 53 4096 Oct 20 17:50 src
> -rwxr-xr-x 1 rchaud 100 376 Nov 3 16:41 step1
> drwxrwxr-x 114 128 53 4096 Oct 20 17:23 test
>
> [rchaud@helios amber10]$ bpsh 2 which mpirun
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/bin/mpirun
>
> The $LD_LIBRARY_PATH seems to be defined correctly, but then why is it
> not being read?
> > thanks > > On Wed, Nov 5, 2008 at 11:08 AM, <users-requ...@open-mpi.org> wrote: > > Send users mailing list submissions to > > us...@open-mpi.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > or, via email, send a message with subject or body 'help' to > > users-requ...@open-mpi.org > > > > You can reach the person managing the list at > > users-ow...@open-mpi.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of users digest..." > > > > > > Today's Topics: > > > > 1. Re: Beowulf cluster and openmpi (Kelley, Sean) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Wed, 5 Nov 2008 12:08:13 -0500 > > From: "Kelley, Sean" <sean.kel...@solers.com> > > Subject: Re: [OMPI users] Beowulf cluster and openmpi > > To: "Open MPI Users" <us...@open-mpi.org> > > Message-ID: > > <A804E989DCC5234FBA6C4E62B939978F2EB3D5@ava-es5.solers.local> > > Content-Type: text/plain; charset="us-ascii" > > > > I would suggest making sure that the /etc/beowulf/config file has a > > "libraries" line for every directory where required shared libraries > > (application and mpi) are located. > > > > Also, make sure that the filesystems containing the executables and > > shared libraries are accessible from the compute nodes. > > > > Sean > > > > -----Original Message----- > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > > Behalf Of Rima Chaudhuri > > Sent: Monday, November 03, 2008 5:50 PM > > To: us...@open-mpi.org > > Subject: Re: [OMPI users] Beowulf cluster and openmpi > > > > I added the option for -hostfile machinefile where the machinefile is a > > file with the IP of the nodes: > > #host names > > 192.168.0.100 slots=2 > > 192.168.0.101 slots=2 > > 192.168.0.102 slots=2 > > 192.168.0.103 slots=2 > > 192.168.0.104 slots=2 > > 192.168.0.105 slots=2 > > 192.168.0.106 slots=2 > > 192.168.0.107 slots=2 > > 192.168.0.108 slots=2 > > 192.168.0.109 slots=2 > > > > > > [rchaud@helios amber10]$ ./step1 > > ------------------------------------------------------------------------ > > -- > > A daemon (pid 29837) launched by the bproc PLS component on node 192 > > died unexpectedly so we are aborting. > > > > This may be because the daemon was unable to find all the needed shared > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have > > the location of the shared libraries on the remote nodes and this will > > automatically be forwarded to the remote nodes. > > ------------------------------------------------------------------------ > > -- > > [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file > > pls_bproc.c at line 717 [helios.structure.uic.edu:29836] [0,0,0] > > ORTE_ERROR_LOG: Error in file pls_bproc.c at line 1164 > > [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file > > rmgr_urm.c at line 462 [helios.structure.uic.edu:29836] mpirun: spawn > > failed with errno=-1 > > > > I used bpsh to see if the master and one of the nodes n8 could see the > > $LD_LIBRARY_PATH, and it does.. > > > > [rchaud@helios amber10]$ echo $LD_LIBRARY_PATH > > /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib > > > > [rchaud@helios amber10]$ bpsh n8 echo $LD_LIBRARY_PATH > > /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib > > > > thanks! 
> > > > > > On Mon, Nov 3, 2008 at 3:14 PM, <users-requ...@open-mpi.org> wrote: > >> Send users mailing list submissions to > >> us...@open-mpi.org > >> > >> To subscribe or unsubscribe via the World Wide Web, visit > >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> or, via email, send a message with subject or body 'help' to > >> users-requ...@open-mpi.org > >> > >> You can reach the person managing the list at > >> users-ow...@open-mpi.org > >> > >> When replying, please edit your Subject line so it is more specific > >> than "Re: Contents of users digest..." > >> > >> > >> Today's Topics: > >> > >> 1. Re: Problems installing in Cygwin - Problem with GCC 3.4.4 > >> (Jeff Squyres) > >> 2. switch from mpich2 to openMPI <newbie question> (PattiMichelle) > >> 3. Re: users Digest, Vol 1055, Issue 2 (Ralph Castain) > >> > >> > >> ---------------------------------------------------------------------- > >> > >> Message: 1 > >> Date: Mon, 3 Nov 2008 15:52:22 -0500 > >> From: Jeff Squyres <jsquy...@cisco.com> > >> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem with > >> GCC 3.4.4 > >> To: "Gustavo Seabra" <gustavo.sea...@gmail.com> > >> Cc: Open MPI Users <us...@open-mpi.org> > >> Message-ID: <a016b8c4-510b-4fd2-ad3b-a1b644050...@cisco.com> > >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > >> > >> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote: > >> > >>>> For your fortran issue, the Fortran 90 interface needs the Fortran > >>>> 77 interface. So you need to supply an F77 as well (the output from > > > >>>> configure should indicate that the F90 interface was disabled > >>>> because the F77 interface was disabled). > >>> > >>> Is that what you mean (see below)? > >> > >> Ah yes -- that's another reason the f90 interface could be disabled: > >> if configure detects that the f77 and f90 compilers are not link- > >> compatible. > >> > >>> I thought the g95 compiler could > >>> deal with F77 as well as F95... If so, could I just pass F77='g95'? > >> > >> That would probably work (F77=g95). I don't know the g95 compiler at > >> all, so I don't know if it also accepts Fortran-77-style codes. But > >> if it does, then you're set. Otherwise, specify a different F77 > >> compiler that is link compatible with g95 and you should be good. > >>>>> I looked in some places in the OpenMPI code, but I couldn't find > >>>>> "max" being redefined anywhere, but I may be looking in the wrong > >>>>> places. Anyways, the only way of found of compiling OpenMPI was a > >>>>> very ugly hack: I have to go into those files and remove the > >>>>> "std::" > >>>>> before > >>>>> the "max". With that, it all compiled cleanly. > >>>> > >>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use > >>>> std::max. > >>>> What areas did you find that you needed to change? > >>> > >>> These files are part of the standard C++ headers. In my case, they > >>> sit in: > >>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits > >> > >> Ah, I see. > >> > >>> In principle, the problems that comes from those files would mean > >>> that the OpenMPI source has some macro redefining max, but that's > >>> what I could not find :-( > >> > >> Gotcha. I don't think we are defining a "max" macro anywhere in the > >> ompi_info source or related header files. :-( > >> > >>>> No. We don't really maintain the "make check" stuff too well. > >>> > >>> Oh well... What do you use for testing the implementation? > >> > >> > >> We have a whole pile of MPI tests in a private SVN repository. 
The > >> repository is only private because it contains a lot of other people's > > > >> [public] MPI test suites and benchmarks, and we never looked into > >> redistribution rights for their software. There's nothing really > >> secret about it -- we just haven't bothered to look into the IP > >> issues. :-) > >> > >> We use the MPI Testing Tool (MTT) for nightly regression across the > >> community: > >> > >> http://www.open-mpi.org/mtt/ > >> > >> We have weekday and weekend testing schedules. M-Th we do nightly > >> tests; F-Mon morning, we do a long weekend schedule. This weekend, > >> for example, we ran about 675k regression tests: > >> > >> http://www.open-mpi.org/mtt/index.php?do_redir=875 > >> > >> -- > >> Jeff Squyres > >> Cisco Systems > >> > >> > >> > >> ------------------------------ > >> > >> Message: 2 > >> Date: Mon, 03 Nov 2008 12:59:59 -0800 > >> From: PattiMichelle <mic...@earthlink.net> > >> Subject: [OMPI users] switch from mpich2 to openMPI <newbie question> > >> To: us...@open-mpi.org, patti.sheaf...@aero.org > >> Message-ID: <490f664f.4000...@earthlink.net> > >> Content-Type: text/plain; charset="iso-8859-1" > >> > >> I just found out I need to switch from mpich2 to openMPI for some code > > > >> I'm running. I noticed that it's available in an openSuSE repo (I'm > >> using openSuSE 11.0 x86_64 on a TYAN 32-processor Opteron 8000 > >> system), but when I was using mpich2 I seemed to have better luck > >> compiling it from code. This is the line I used: > >> > >> # $ F77=/path/to/g95 F90=/path/to/g95 ./configure > >> --prefix=/some/place/mpich2-install > >> > >> But usually I left the "--prefix=" off and just let it install to it's > > > >> default... which is /usr/local/bin and that's nice because it's > >> already in the PATH and very usable. I guess my question is whether > >> or not the defaults and configuration syntax have stayed the same in > >> openMPI. I also could use a "quickstart" guide for a non-programming > >> user (e.g., I think I have to start a daemon before running > > parallelized programs). > >> > >> THANKS!!! > >> PattiM. > >> -------------- next part -------------- HTML attachment scrubbed and > >> removed > >> > >> ------------------------------ > >> > >> Message: 3 > >> Date: Mon, 3 Nov 2008 14:14:36 -0700 > >> From: Ralph Castain <r...@lanl.gov> > >> Subject: Re: [OMPI users] users Digest, Vol 1055, Issue 2 > >> To: Open MPI Users <us...@open-mpi.org> > >> Message-ID: <2fbdf4dc-b2df-4486-a644-0f18c96e8...@lanl.gov> > >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > >> > >> The problem is that you didn't specify or allocate any nodes for the > >> job. At the least, you need to tell us what nodes to use via a > > hostfile. > >> > >> Alternatively, are you using a resource manager to assign the nodes? > >> OMPI didn't see anything from one, but it could be that we just didn't > > > >> see the right envar. > >> > >> Ralph > >> > >> On Nov 3, 2008, at 1:39 PM, Rima Chaudhuri wrote: > >> > >>> Thanks a lot Ralph! > >>> I corrected the no_local to nolocal and now when I try to execute the > > > >>> script step1 (pls find it attached) [rchaud@helios amber10]$ ./step1 > >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Not > >>> available in file ras_bjs.c at line 247 > >>> --------------------------------------------------------------------- > >>> ----- There are no available nodes allocated to this job. This could > >>> be because no nodes were found or all the available nodes were > >>> already used. 
> >>> > >>> Note that since the -nolocal option was given no processes can be > >>> launched on the local node. > >>> --------------------------------------------------------------------- > >>> ----- [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: > >>> Temporarily out of resource in file base/rmaps_base_support_fns.c at > >>> line 168 [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: > >>> Temporarily out of resource in file rmaps_rr.c at line 402 > >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily > >>> out of resource in file base/rmaps_base_map_job.c at line 210 > >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily > >>> out of resource in file rmgr_urm.c at line 372 > >>> [helios.structure.uic.edu:16335] mpirun: spawn failed with errno=-3 > >>> > >>> > >>> > >>> If I use the script without the --nolocal option, I get the following > > > >>> error: > >>> [helios.structure.uic.edu:20708] [0,0,0] ORTE_ERROR_LOG: Not > >>> available in file ras_bjs.c at line 247 > >>> > >>> > >>> thanks, > >>> > >>> > >>> On Mon, Nov 3, 2008 at 2:04 PM, <users-requ...@open-mpi.org> wrote: > >>>> Send users mailing list submissions to > >>>> us...@open-mpi.org > >>>> > >>>> To subscribe or unsubscribe via the World Wide Web, visit > >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>> or, via email, send a message with subject or body 'help' to > >>>> users-requ...@open-mpi.org > >>>> > >>>> You can reach the person managing the list at > >>>> users-ow...@open-mpi.org > >>>> > >>>> When replying, please edit your Subject line so it is more specific > >>>> than "Re: Contents of users digest..." > >>>> > >>>> > >>>> Today's Topics: > >>>> > >>>> 1. Scyld Beowulf and openmpi (Rima Chaudhuri) 2. Re: Scyld Beowulf > > > >>>> and openmpi (Ralph Castain) 3. Problems installing in Cygwin - > >>>> Problem with GCC 3.4.4 > >>>> (Gustavo Seabra) > >>>> 4. Re: MPI + Mixed language coding(Fortran90 + C++) (Jeff Squyres) > >>>> 5. Re: Problems installing in Cygwin - Problem with GCC 3.4.4 > >>>> (Jeff Squyres) > >>>> > >>>> > >>>> -------------------------------------------------------------------- > >>>> -- > >>>> > >>>> Message: 1 > >>>> Date: Mon, 3 Nov 2008 11:30:01 -0600 > >>>> From: "Rima Chaudhuri" <rima.chaudh...@gmail.com> > >>>> Subject: [OMPI users] Scyld Beowulf and openmpi > >>>> To: us...@open-mpi.org > >>>> Message-ID: > >>>> <7503b17d0811030930i13acb974kc627983a1d481...@mail.gmail.com> > >>>> Content-Type: text/plain; charset=ISO-8859-1 > >>>> > >>>> Hello! > >>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our > >>>> x86_64 linux scyld beowulf cluster inorder to make it run with > >>>> amber10 MD simulation package. > >>>> > >>>> The nodes can see the home directory i.e. a bpsh to the nodes works > >>>> fine and lists all the files in the home directory where I have both > > > >>>> openmpi and amber10 installed. > >>>> However if I try to run: > >>>> > >>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/ sander.MPI > >>>> ........ > >>>> > >>>> I get the following error: > >>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247 > >>>> -------------------------------------------------------------------- > >>>> ------ Failed to find the following executable: > >>>> > >>>> Host: helios.structure.uic.edu > >>>> Executable: -o > >>>> > >>>> Cannot continue. 
> >>>> -------------------------------------------------------------------- > >>>> ------ [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not > >>>> found in file rmgr_urm.c at line 462 > >>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with errno=-13 > >>>> > >>>> any cues? > >>>> > >>>> > >>>> -- > >>>> -Rima > >>>> > >>>> > >>>> ------------------------------ > >>>> > >>>> Message: 2 > >>>> Date: Mon, 3 Nov 2008 12:08:36 -0700 > >>>> From: Ralph Castain <r...@lanl.gov> > >>>> Subject: Re: [OMPI users] Scyld Beowulf and openmpi > >>>> To: Open MPI Users <us...@open-mpi.org> > >>>> Message-ID: <91044a7e-ada5-4b94-aa11-b3c1d9843...@lanl.gov> > >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > >>>> > >>>> For starters, there is no "-no_local" option to mpirun. You might > >>>> want to look at mpirun --help, or man mpirun. > >>>> > >>>> I suspect the option you wanted was --nolocal. Note that --nolocal > >>>> does not take an argument. > >>>> > >>>> Mpirun is confused by the incorrect option and looking for an > >>>> incorrectly named executable. > >>>> Ralph > >>>> > >>>> > >>>> On Nov 3, 2008, at 10:30 AM, Rima Chaudhuri wrote: > >>>> > >>>>> Hello! > >>>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our > >>>>> x86_64 linux scyld beowulf cluster inorder to make it run with > >>>>> amber10 MD simulation package. > >>>>> > >>>>> The nodes can see the home directory i.e. a bpsh to the nodes works > > > >>>>> fine and lists all the files in the home directory where I have > >>>>> both openmpi and amber10 installed. > >>>>> However if I try to run: > >>>>> > >>>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/ sander.MPI > >>>>> ........ > >>>>> > >>>>> I get the following error: > >>>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247 > >>>>> ------------------------------------------------------------------- > >>>>> ------- Failed to find the following executable: > >>>>> > >>>>> Host: helios.structure.uic.edu > >>>>> Executable: -o > >>>>> > >>>>> Cannot continue. > >>>>> ------------------------------------------------------------------- > >>>>> ------- [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: > >>>>> Not found in file rmgr_urm.c at line 462 > >>>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with > >>>>> errno=-13 > >>>>> > >>>>> any cues? > >>>>> > >>>>> > >>>>> -- > >>>>> -Rima > >>>>> _______________________________________________ > >>>>> users mailing list > >>>>> us...@open-mpi.org > >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>> > >>>> > >>>> > >>>> ------------------------------ > >>>> > >>>> Message: 3 > >>>> Date: Mon, 3 Nov 2008 14:53:55 -0500 > >>>> From: "Gustavo Seabra" <gustavo.sea...@gmail.com> > >>>> Subject: [OMPI users] Problems installing in Cygwin - Problem with > >>>> GCC > >>>> 3.4.4 > >>>> To: "Open MPI Users" <us...@open-mpi.org> > >>>> Message-ID: > >>>> <f79359b60811031153l5591e0f8j49a7e4d9fb02e...@mail.gmail.com> > >>>> Content-Type: text/plain; charset=ISO-8859-1 > >>>> > >>>> Hi everyone, > >>>> > >>>> Here's a "progress report"... 
more questions in the end :-) > >>>> > >>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the > >>>> following configure command: > >>>> > >>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \ > >>>> --with-mpi-param_check=always --with-threads=posix \ > >>>> --enable-mpi-threads --disable-io-romio \ > >>>> --enable-mca-no- > >>>> build=memory_mallopt,maffinity,paffinity \ > >>>> --enable-contrib-no-build=vt \ > >>>> FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++ > >>>> > >>>> I then had a very weird error during compilation of > >>>> ompi/tools/ompi_info/params.cc. (See below). > >>>> > >>>> The lines causing the compilation errors are: > >>>> > >>>> vector.tcc:307: const size_type __len = __old_size + > >>>> std::max(__old_size, __n); > >>>> vector.tcc:384: const size_type __len = __old_size + > >>>> std::max(__old_size, __n); > >>>> stl_bvector.h:522: const size_type __len = size() + > >>>> std::max(size(), __n); > >>>> stl_bvector.h:823: const size_type __len = size() + > >>>> std::max(size(), __n); > >>>> > >>>> (Notice that those are from the standard gcc libraries.) > >>>> > >>>> After googling it for a while, I could find that this error is > >>>> caused because, at come point, the source code being compiled > >>>> redefined the "max" function with a macro, g++ cannot recognize the > >>>> "std::max" that happens in those lines and only "sees" a (...), thus > > > >>>> printing that cryptic complaint. > >>>> > >>>> I looked in some places in the OpenMPI code, but I couldn't find > >>>> "max" being redefined anywhere, but I may be looking in the wrong > >>>> places. Anyways, the only way of found of compiling OpenMPI was a > >>>> very ugly hack: I have to go into those files and remove the "std::" > >>>> before > >>>> the "max". With that, it all compiled cleanly. > >>>> > >>>> I did try running the tests in the 'tests' directory (with 'make > >>>> check'), and I didn't get any alarming message, except that in some > >>>> cases (class, threads, peruse) it printed "All 0 tests passed". I > >>>> got and "All (n) tests passed" (n>0) for asm and datatype. > >>>> > >>>> Can anybody comment on the meaning of those test results? Should I > >>>> be alarmed with the "All 0 tests passed" messages? > >>>> > >>>> Finally, in the absence of big red flags (that I noticed), I went > >>>> ahead and tried to compile my program. However, as soon as > >>>> compilation starts, I get the following: > >>>> > >>>> /local/openmpi/openmpi-1.3b1/bin/mpif90 -c -O3 -fno-second- > >>>> underscore -ffree-form -o constants.o _constants.f > >>>> -------------------------------------------------------------------- > >>>> ------ Unfortunately, this installation of Open MPI was not compiled > > > >>>> with Fortran 90 support. As such, the mpif90 compiler is > >>>> non-functional. > >>>> -------------------------------------------------------------------- > >>>> ------ > >>>> make[1]: *** [constants.o] Error 1 > >>>> make[1]: Leaving directory `/home/seabra/local/amber11/src/sander' > >>>> make: *** [parallel] Error 2 > >>>> > >>>> Notice that I compiled OpenMPI with g95, so there *should* be > >>>> Fortran95 support... Any ideas on what could be going wrong? > >>>> > >>>> Thank you very much, > >>>> Gustavo. > >>>> > >>>> ====================================== > >>>> Error in the compilation of params.cc > >>>> ====================================== > >>>> $ g++ -DHAVE_CONFIG_H -I. 
-I../../../opal/include > >>>> -I../../../orte/include -I../../../ompi/include > >>>> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa > >>>> -DOMPI_CONFIGURE_USER="\"seabra\"" -DOMPI_CONFIGURE_HOST="\"ACS02\"" > >>>> -DOMPI_CONFIGURE_DATE="\"Sat Nov 1 20:44:32 EDT 2008\"" > >>>> -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\"" > >>>> -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O3 -DNDEBUG > >>>> -finline-functions -fno-strict-aliasing \"" > >>>> -DOMPI_BUILD_CPPFLAGS="\"-I../../.. -D_REENTRANT\"" > >>>> -DOMPI_BUILD_CXXFLAGS="\"-O3 -DNDEBUG -finline-functions \"" > >>>> -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../.. -D_REENTRANT\"" > >>>> -DOMPI_BUILD_FFLAGS="\"-O0 -fno-second-underscore\"" > >>>> -DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\"-export-dynamic > >>>> \"" -DOMPI_BUILD_LIBS="\"-lutil \"" > >>>> -DOMPI_CC_ABSOLUTE="\"/usr/bin/gcc\"" > >>>> -DOMPI_CXX_ABSOLUTE="\"/usr/bin/g++\"" > >>>> -DOMPI_F77_ABSOLUTE="\"/usr/bin/g77\"" > >>>> -DOMPI_F90_ABSOLUTE="\"/usr/local/bin/g95\"" > >>>> -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../.. -D_REENTRANT -O3 > >>>> -DNDEBUG -finline-functions -MT param.o -MD -MP -MF $depbase.Tpo -c > > > >>>> -o param.o param.cc In file included from > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/ > >>>> vector:72, > >>>> from ../../../ompi/tools/ompi_info/ompi_info.h:24, > >>>> from param.cc:43: > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In > > > >>>> member function `void std::vector<bool, > >>>> _Alloc>::_M_insert_range(std::_Bit_iterator, _ForwardIterator, > >>>> _ForwardIterator, std::forward_iterator_tag)': > >>>> > > /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:522: > >>>> error: expected unqualified-id before '(' token > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In > > > >>>> member function `void std::vector<bool, > >>>> _Alloc>::_M_fill_insert(std::_Bit_iterator, size_t, bool)': > >>>> > > /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:823: > >>>> error: expected unqualified-id before '(' token In file included > >>>> from /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/ > >>>> vector:75, > >>>> from ../../../ompi/tools/ompi_info/ompi_info.h:24, > >>>> from param.cc:43: > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In > >>>> member function `void std::vector<_Tp, > >>>> _Alloc>::_M_fill_insert(__gnu_cxx::__normal_iterator<typename > >>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, size_t, const _Tp&)': > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:307: > >>>> error: expected unqualified-id before '(' token > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In > >>>> member function `void std::vector<_Tp, > >>>> _Alloc>::_M_range_insert(__gnu_cxx::__normal_iterator<typename > >>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, _ForwardIterator, > >>>> _ForwardIterator, std::forward_iterator_tag)': > >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:384: > >>>> error: expected unqualified-id before '(' token > >>>> > >>>> > >>>> -- > >>>> Gustavo Seabra > >>>> Postdoctoral Associate > >>>> Quantum Theory Project - University of Florida Gainesville - Florida > > > >>>> - USA > >>>> > >>>> > >>>> ------------------------------ > >>>> > >>>> Message: 4 > >>>> Date: Mon, 3 Nov 2008 14:54:25 -0500 > >>>> From: Jeff Squyres <jsquy...@cisco.com> > >>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C+ > >>>> +) > >>>> To: Open MPI 
Users <us...@open-mpi.org> > >>>> Message-ID: <45698801-0857-466f-a19d-c529f72d4...@cisco.com> > >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > >>>> > >>>> Can you replicate the scenario in smaller / different cases? > >>>> > >>>> - write a sample plugin in C instead of C++ > >>>> - write a non-MPI Fortran application that loads your C++ > >>>> application > >>>> - ...? > >>>> > >>>> In short, *MPI* shouldn't be interfering with Fortran/C++ common > >>>> blocks. Try taking MPI out of the picture and see if that makes the > > > >>>> problem go away. > >>>> > >>>> Those are pretty much shots in the dark, but I don't know where to > >>>> go, either -- try random things until you find what you want. > >>>> > >>>> > >>>> On Nov 3, 2008, at 3:51 AM, Rajesh Ramaya wrote: > >>>> > >>>>> Helllo Jeff, Gustavo, Mi > >>>>> Thank for the advice. I am familiar with the difference in the > >>>>> compiler code generation for C, C++ & FORTRAN. I even tried to look > > > >>>>> at some of the common block symbols. The name of the symbol remains > > > >>>>> the same. The only difference that I observe is in FORTRAN compiled > > > >>>>> *.o 0000000000515bc0 B aux7loc_ and the C++ compiled code U > >>>>> aux7loc_ the memory is not allocated as it has been declared as > >>>>> extern in C++. When the executable loads the shared library it > >>>>> finds all the undefined symbols. Atleast if it did not manage to > >>>>> find a single symbol it prints undefined symbol error. > >>>>> I am completely stuck up and do not know how to continue further. > >>>>> > >>>>> Thanks, > >>>>> Rajesh > >>>>> > >>>>> From: users-boun...@open-mpi.org > >>>>> [mailto:users-boun...@open-mpi.org] > >>>>> On Behalf Of Mi Yan > >>>>> Sent: samedi 1 novembre 2008 23:26 > >>>>> To: Open MPI Users > >>>>> Cc: 'Open MPI Users'; users-boun...@open-mpi.org > >>>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C > >>>>> ++) > >>>>> > >>>>> So your tests show: > >>>>> 1. "Shared library in FORTRAN + MPI executable in FORTRAN" works. > >>>>> 2. "Shared library in C++ + MPI executable in FORTRAN " does not > >>>>> work. > >>>>> > >>>>> It seems to me that the symbols in C library are not really > >>>>> recognized by FORTRAN executable as you thought. What compilers did > > > >>>>> yo use to built OpenMPI? > >>>>> > >>>>> Different compiler has different convention to handle symbols. E.g. > >>>>> if there is a variable "var_foo" in your FORTRAN code, some FORTRN > >>>>> compiler will save "var_foo_" in the object file by default; if you > > > >>>>> want to access "var_foo" in C code, you actually need to refer > >>>>> "var_foo_" in C code. If you define "var_foo" in a module in the > >>>>> FORTAN compiler, some FORTRAN compiler may append the module name > >>>>> to "var_foo". > >>>>> So I suggest to check the symbols in the object files generated by > >>>>> your FORTAN and C compiler to see the difference. 
> >>>>> > >>>>> Mi > >>>>> <image001.gif>"Rajesh Ramaya" <rajesh.ram...@e-xstream.com> > >>>>> > >>>>> > >>>>> "Rajesh Ramaya" <rajesh.ram...@e-xstream.com> Sent by: > >>>>> users-boun...@open-mpi.org > >>>>> 10/31/2008 03:07 PM > >>>>> > >>>>> Please respond to > >>>>> Open MPI Users <us...@open-mpi.org> <image002.gif> To > >>>>> <image003.gif> "'Open MPI Users'" <us...@open-mpi.org>, "'Jeff > >>>>> Squyres'" <jsquy...@cisco.com > >>>>>> > >>>>> <image002.gif> > >>>>> cc > >>>>> <image003.gif> > >>>>> <image002.gif> > >>>>> Subject > >>>>> <image003.gif> > >>>>> Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++) > >>>>> > >>>>> <image003.gif> > >>>>> <image003.gif> > >>>>> > >>>>> Hello Jeff Squyres, > >>>>> Thank you very much for the immediate reply. I am able to > >>>>> successfully access the data from the common block but the values > >>>>> are zero. In my algorithm I even update a common block but the > >>>>> update made by the shared library is not taken in to account by the > > > >>>>> executable. Can you please be very specific how to make the > >>>>> parallel algorithm aware of the data? > >>>>> Actually I am > >>>>> not writing any MPI code inside? It's the executable (third party > >>>>> software) > >>>>> who does that part. All that I am doing is to compile my code with > >>>>> MPI c compiler and add it in the LD_LIBIRARY_PATH. > >>>>> In fact I did a simple test by creating a shared library using a > >>>>> FORTRAN code and the update made to the common block is taken in to > > > >>>>> account by the executable. Is there any flag or pragma that need to > > > >>>>> be activated for mixed language MPI? > >>>>> Thank you once again for the reply. > >>>>> > >>>>> Rajesh > >>>>> > >>>>> -----Original Message----- > >>>>> From: users-boun...@open-mpi.org > >>>>> [mailto:users-boun...@open-mpi.org] > >>>>> On > >>>>> Behalf Of Jeff Squyres > >>>>> Sent: vendredi 31 octobre 2008 18:53 > >>>>> To: Open MPI Users > >>>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C > >>>>> ++) > >>>>> > >>>>> On Oct 31, 2008, at 11:57 AM, Rajesh Ramaya wrote: > >>>>> > >>>>>> I am completely new to MPI. I have a basic question concerning > >>>>>> MPI and mixed language coding. I hope any of you could help me > > out. > >>>>>> Is it possible to access FORTRAN common blocks in C++ in a MPI > >>>>>> compiled code. It works without MPI but as soon I switch to MPI > >>>>>> the access of common block does not work anymore. > >>>>>> I have a Linux MPI executable which loads a shared library at > >>>>>> runtime and resolves all undefined symbols etc The shared library > > > >>>>>> is written in C++ and the MPI executable in written in FORTRAN. > >>>>>> Some > >>>>>> of the input that the shared library looking for are in the > >>>>>> Fortran common blocks. As I access those common blocks during > >>>>>> runtime the values are not initialized. I would like to know if > >>>>>> what I am doing is possible ?I hope that my problem is clear...... > >>>>> > >>>>> > >>>>> Generally, MPI should not get in the way of sharing common blocks > >>>>> between Fortran and C/C++. Indeed, in Open MPI itself, we share a > >>>>> few common blocks between Fortran and the main C Open MPI > >>>>> implementation. > >>>>> > >>>>> What is the exact symptom that you are seeing? Is the application > >>>>> failing to resolve symbols at run-time, possibly indicating that > >>>>> something hasn't instantiated a common block? 
Or are you able to > >>>>> successfully access the data from the common block, but it doesn't > >>>>> have the values you expect (e.g., perhaps you're seeing all zeros)? > >>>>> > >>>>> If the former, you might want to check your build procedure. You > >>>>> *should* be able to simply replace your C++ / F90 compilers with > >>>>> mpicxx and mpif90, respectively, and be able to build an MPI > >>>>> version of your app. If the latter, you might need to make your > >>>>> parallel algorithm aware of what data is available in which MPI > >>>>> process -- perhaps not all the data is filled in on each MPI > > process...? > >>>>> > >>>>> -- > >>>>> Jeff Squyres > >>>>> Cisco Systems > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> users mailing list > >>>>> us...@open-mpi.org > >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>>> > >>>>> _______________________________________________ > >>>>> users mailing list > >>>>> us...@open-mpi.org > >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>>> > >>>>> _______________________________________________ > >>>>> users mailing list > >>>>> us...@open-mpi.org > >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>> > >>>> > >>>> -- > >>>> Jeff Squyres > >>>> Cisco Systems > >>>> > >>>> > >>>> > >>>> ------------------------------ > >>>> > >>>> Message: 5 > >>>> Date: Mon, 3 Nov 2008 15:04:47 -0500 > >>>> From: Jeff Squyres <jsquy...@cisco.com> > >>>> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem > >>>> with > >>>> GCC 3.4.4 > >>>> To: Open MPI Users <us...@open-mpi.org> > >>>> Message-ID: <8e364b51-6726-4533-ade2-aea266380...@cisco.com> > >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > >>>> > >>>> On Nov 3, 2008, at 2:53 PM, Gustavo Seabra wrote: > >>>> > >>>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the > > > >>>>> following configure command: > >>>>> > >>>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \ > >>>>> --with-mpi-param_check=always --with-threads=posix \ > >>>>> --enable-mpi-threads --disable-io-romio \ > >>>>> --enable-mca-no- > >>>>> build=memory_mallopt,maffinity,paffinity \ > >>>>> --enable-contrib-no-build=vt \ > >>>>> FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++ > >>>> > >>>> For your fortran issue, the Fortran 90 interface needs the Fortran > >>>> 77 interface. So you need to supply an F77 as well (the output from > > > >>>> configure should indicate that the F90 interface was disabled > >>>> because the F77 interface was disabled). > >>>> > >>>>> I then had a very weird error during compilation of > >>>>> ompi/tools/ompi_info/params.cc. (See below). > >>>>> > >>>>> The lines causing the compilation errors are: > >>>>> > >>>>> vector.tcc:307: const size_type __len = __old_size + > >>>>> std::max(__old_size, __n); > >>>>> vector.tcc:384: const size_type __len = __old_size + > >>>>> std::max(__old_size, __n); > >>>>> stl_bvector.h:522: const size_type __len = size() + > >>>>> std::max(size(), __n); > >>>>> stl_bvector.h:823: const size_type __len = size() + > >>>>> std::max(size(), __n); > >>>>> > >>>>> (Notice that those are from the standard gcc libraries.) 
> >>>>> > >>>>> After googling it for a while, I could find that this error is > >>>>> caused because, at come point, the source code being compiled > >>>>> redefined the "max" function with a macro, g++ cannot recognize the > > > >>>>> "std::max" > >>>>> that > >>>>> happens in those lines and only "sees" a (...), thus printing that > >>>>> cryptic complaint. > >>>>> > >>>>> I looked in some places in the OpenMPI code, but I couldn't find > >>>>> "max" being redefined anywhere, but I may be looking in the wrong > >>>>> places. Anyways, the only way of found of compiling OpenMPI was a > >>>>> very ugly hack: I have to go into those files and remove the > >>>>> "std::" > >>>>> before > >>>>> the "max". With that, it all compiled cleanly. > >>>> > >>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use > >>>> std::max. What areas did you find that you needed to change? > >>>> > >>>>> I did try running the tests in the 'tests' directory (with 'make > >>>>> check'), and I didn't get any alarming message, except that in some > > > >>>>> cases (class, threads, peruse) it printed "All 0 tests passed". I > >>>>> got and "All (n) tests passed" (n>0) for asm and datatype. > >>>>> > >>>>> Can anybody comment on the meaning of those test results? Should I > >>>>> be alarmed with the "All 0 tests passed" messages? > >>>> > >>>> No. We don't really maintain the "make check" stuff too well. > >>>> > >>>> -- > >>>> Jeff Squyres > >>>> Cisco Systems > >>>> > >>>> > >>>> > >>>> ------------------------------ > >>>> > >>>> _______________________________________________ > >>>> users mailing list > >>>> us...@open-mpi.org > >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >>>> > >>>> End of users Digest, Vol 1055, Issue 2 > >>>> ************************************** > >>>> > >>> > >>> > >>> > >>> -- > >>> -Rima > >>> <step1>_______________________________________________ > >>> users mailing list > >>> us...@open-mpi.org > >>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> > >> > >> > >> ------------------------------ > >> > >> _______________________________________________ > >> users mailing list > >> us...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> > >> End of users Digest, Vol 1055, Issue 4 > >> ************************************** > >> > > > > > > > > -- > > -Rima > > _______________________________________________ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > > > > ------------------------------ > > > > _______________________________________________ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > End of users Digest, Vol 1057, Issue 3 > > ************************************** > > > > > > -- > -Rima > _______________________________________________ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dr. Daniel Gruner dgru...@chem.utoronto.ca Dept. of Chemistry daniel.gru...@utoronto.ca University of Toronto phone: (416)-978-8689 80 St. George Street fax: (416)-978-5325 Toronto, ON M5S 3H6, Canada finger for PGP public key