Can your nodes see the Open MPI libraries and executables?  I have
/usr/local and /opt from the master node mounted on the compute nodes,
in addition to having LD_LIBRARY_PATH defined correctly.  In your case
the nodes must be able to see /home/rchaud/openmpi-1.2.6 in order to
get at the libraries and executables, so this directory must be mounted
on the nodes.  You don't want to copy all this stuff to the nodes in a
bproc environment, since it would eat away at your RAM.
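
If the tree isn't already shared, export it from the master and then
double-check from a node that the files are really reachable.  A rough
sketch only -- the exports entry below is hypothetical, so adapt it to
however your cluster shares filesystems (the paths and node number are
the ones from your earlier mails):

  # on the master, e.g. in /etc/exports:
  /home/rchaud    192.168.0.0/255.255.255.0(rw,sync)

  # from the head node, check that node 2 can actually reach the files:
  bpsh 2 ls /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
  bpsh 2 ls /home/rchaud/Amber10_openmpi/amber10/exe/sander.MPI

Also note that "bpsh 2 echo $LD_LIBRARY_PATH" expands the variable in
your shell on the master before bpsh runs, so it reports the master's
setting rather than what the daemon will actually see on the node.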

Daniel

On Wed, Nov 05, 2008 at 12:44:03PM -0600, Rima Chaudhuri wrote:
> Thanks for all your help Ralph and Sean!!
> I changed the machinefile to contain just the node numbers. I added
> the env variable NODES in my .bash_profile and .bashrc.
> As per Sean's suggestion I added $LD_LIBRARY_PATH (the shared library
> path, i.e. the openmpi lib directory) and $AMBERHOME/lib as two of
> the 'libraries' paths in the beowulf config file. I also checked via
> bpsh from one of the compute nodes whether it can see the executables,
> which are in $AMBERHOME/exe, and the mpirun (OMPI).
> I get the following error message:
> 
> [rchaud@helios amber10]$ ./step1
> --------------------------------------------------------------------------
> A daemon (pid 25319) launched by the bproc PLS component on node 2 died
> unexpectedly on signal 13 so we are aborting.
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 717
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 1164
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> rmgr_urm.c at line 462
> [helios.structure.uic.edu:25317] mpirun: spawn failed with errno=-1
> 
> 
> I tested to see if the compute nodes could see the master by the
> following commands:
> 
> [rchaud@helios amber10]$ bpsh 2 echo $LD_LIBRARY_PATH
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> [rchaud@helios amber10]$ bpsh 2 echo $AMBERHOME
> /home/rchaud/Amber10_openmpi/amber10
> [rchaud@helios amber10]$ bpsh 2 ls -al
> total 11064
> drwxr-xr-x   11 rchaud   0            4096 Nov  5 11:33 .
> drwxr-xr-x    3 rchaud   100          4096 Oct 20 17:21 ..
> -rw-r--r--    1 128      53           1201 Jul 10 17:08 Changelog_at
> -rw-rw-r--    1 128      53          25975 Feb 28  2008
> GNU_Lesser_Public_License
> -rw-rw----    1 128      53           3232 Mar 30  2008 INSTALL
> -rw-rw-r--    1 128      53          20072 Feb 11  2008 LICENSE_at
> -rw-r--r--    1 0        0         1814241 Oct 31 13:32 PLP_617_xtal_nolig.crd
> -rw-r--r--    1 0        0         8722770 Oct 31 13:31 PLP_617_xtal_nolig.top
> -rw-rw-r--    1 128      53           1104 Mar 18  2008 README
> -rw-r--r--    1 128      53           1783 Jun 23 19:43 README_at
> drwxrwxr-x   10 128      53           4096 Oct 20 17:23 benchmarks
> drwxr-xr-x    2 0        0            4096 Oct 20 18:21 bin
> -rw-r--r--    1 0        0          642491 Oct 20 17:51 bugfix.all
> drwxr-xr-x   13 0        0            4096 Oct 20 17:37 dat
> drwxr-xr-x    3 0        0            4096 Oct 20 17:23 doc
> drwxrwxr-x    9 128      53           4096 Oct 20 17:23 examples
> lrwxrwxrwx    1 0        0               3 Oct 20 17:34 exe -> bin
> drwxr-xr-x    2 0        0            4096 Oct 20 17:35 include
> drwxr-xr-x    2 0        0            4096 Oct 20 17:36 lib
> -rw-r--r--    1 rchaud   100            30 Nov  5 11:33 machinefile
> -rw-r--r--    1 rchaud   100           161 Nov  5 12:11 min
> drwxrwxr-x   40 128      53           4096 Oct 20 17:50 src
> -rwxr-xr-x    1 rchaud   100           376 Nov  3 16:41 step1
> drwxrwxr-x  114 128      53           4096 Oct 20 17:23 test
> 
> [rchaud@helios amber10]$ bpsh 2 which mpirun
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/bin/mpirun
> 
> The $LD_LIBRARY_PATH seems to be defined correctly, but then why is it
> not being read?
> 
> thanks
> 
> On Wed, Nov 5, 2008 at 11:08 AM,  <users-requ...@open-mpi.org> wrote:
> >
> > Today's Topics:
> >
> >   1. Re: Beowulf cluster and openmpi (Kelley, Sean)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Wed, 5 Nov 2008 12:08:13 -0500
> > From: "Kelley, Sean" <sean.kel...@solers.com>
> > Subject: Re: [OMPI users] Beowulf cluster and openmpi
> > To: "Open MPI Users" <us...@open-mpi.org>
> > Message-ID:
> >        <A804E989DCC5234FBA6C4E62B939978F2EB3D5@ava-es5.solers.local>
> > Content-Type: text/plain;       charset="us-ascii"
> >
> > I would suggest making sure that the /etc/beowulf/config file has a
> > "libraries" line for every directory where required shared libraries
> > (application and mpi) are located.
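> >
> > For illustration only (a sketch -- check the Scyld documentation for the
> > exact syntax of these entries), the lines would look roughly like:
> >
> >     libraries /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> >     libraries /home/rchaud/Amber10_openmpi/amber10/lib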
> >
> > Also, make sure that the filesystems containing the executables and
> > shared libraries are accessible from the compute nodes.
> >
> > Sean
> >
> > -----Original Message-----
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> > Behalf Of Rima Chaudhuri
> > Sent: Monday, November 03, 2008 5:50 PM
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Beowulf cluster and openmpi
> >
> > I added the option for -hostfile machinefile where the machinefile is a
> > file with the IP of the nodes:
> > #host names
> > 192.168.0.100 slots=2
> > 192.168.0.101 slots=2
> > 192.168.0.102 slots=2
> > 192.168.0.103 slots=2
> > 192.168.0.104 slots=2
> > 192.168.0.105 slots=2
> > 192.168.0.106 slots=2
> > 192.168.0.107 slots=2
> > 192.168.0.108 slots=2
> > 192.168.0.109 slots=2
> >
> >
> > [rchaud@helios amber10]$ ./step1
> > --------------------------------------------------------------------------
> > A daemon (pid 29837) launched by the bproc PLS component on node 192
> > died unexpectedly so we are aborting.
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> > the location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > --------------------------------------------------------------------------
> > [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> > pls_bproc.c at line 717
> > [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> > pls_bproc.c at line 1164
> > [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> > rmgr_urm.c at line 462
> > [helios.structure.uic.edu:29836] mpirun: spawn failed with errno=-1
> >
> > I used bpsh to check whether the master and one of the nodes (n8) could
> > see $LD_LIBRARY_PATH, and they do:
> >
> > [rchaud@helios amber10]$ echo $LD_LIBRARY_PATH
> > /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> >
> > [rchaud@helios amber10]$ bpsh n8 echo $LD_LIBRARY_PATH
> > /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> >
> > thanks!
> >
> >
> > On Mon, Nov 3, 2008 at 3:14 PM,  <users-requ...@open-mpi.org> wrote:
> >>
> >> Today's Topics:
> >>
> >>   1. Re: Problems installing in Cygwin - Problem with GCC      3.4.4
> >>      (Jeff Squyres)
> >>   2. switch from mpich2 to openMPI <newbie question> (PattiMichelle)
> >>   3. Re: users Digest, Vol 1055, Issue 2 (Ralph Castain)
> >>
> >>
> >> ----------------------------------------------------------------------
> >>
> >> Message: 1
> >> Date: Mon, 3 Nov 2008 15:52:22 -0500
> >> From: Jeff Squyres <jsquy...@cisco.com>
> >> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem with
> >>        GCC     3.4.4
> >> To: "Gustavo Seabra" <gustavo.sea...@gmail.com>
> >> Cc: Open MPI Users <us...@open-mpi.org>
> >> Message-ID: <a016b8c4-510b-4fd2-ad3b-a1b644050...@cisco.com>
> >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>
> >> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:
> >>
> >>>> For your fortran issue, the Fortran 90 interface needs the Fortran
> >>>> 77 interface.  So you need to supply an F77 as well (the output from
> >
> >>>> configure should indicate that the F90 interface was disabled
> >>>> because the F77 interface was disabled).
> >>>
> >>> Is that what you mean (see below)?
> >>
> >> Ah yes -- that's another reason the f90 interface could be disabled:
> >> if configure detects that the f77 and f90 compilers are not link-
> >> compatible.
> >>
> >>> I thought the g95 compiler could
> >>> deal with F77 as well as F95... If so, could I just pass F77='g95'?
> >>
> >> That would probably work (F77=g95).  I don't know the g95 compiler at
> >> all, so I don't know if it also accepts Fortran-77-style codes.  But
> >> if it does, then you're set.  Otherwise, specify a different F77
> >> compiler that is link compatible with g95 and you should be good.
> >>>>> I looked in some places in the OpenMPI code, but I couldn't find
> >>>>> "max" being redefined anywhere, but I may be looking in the wrong
> >>>>> places. Anyways, the only way of found of compiling OpenMPI was a
> >>>>> very ugly hack: I have to go into those files and remove the
> >>>>> "std::"
> >>>>> before
> >>>>> the "max". With that, it all compiled cleanly.
> >>>>
> >>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use
> >>>> std::max.
> >>>> What areas did you find that you needed to change?
> >>>
> >>> These files are part of the standard C++ headers. In my case, they
> >>> sit in:
> >>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits
> >>
> >> Ah, I see.
> >>
> >>> In principle, the problems that comes from those files would mean
> >>> that the OpenMPI source has some macro redefining max, but that's
> >>> what I could not find :-(
> >>
> >> Gotcha.  I don't think we are defining a "max" macro anywhere in the
> >> ompi_info source or related header files.  :-(
> >>
> >>>> No.  We don't really maintain the "make check" stuff too well.
> >>>
> >>> Oh well... What do you use for testing the implementation?
> >>
> >>
> >> We have a whole pile of MPI tests in a private SVN repository.  The
> >> repository is only private because it contains a lot of other people's
> >
> >> [public] MPI test suites and benchmarks, and we never looked into
> >> redistribution rights for their software.  There's nothing really
> >> secret about it -- we just haven't bothered to look into the IP
> >> issues.  :-)
> >>
> >> We use the MPI Testing Tool (MTT) for nightly regression across the
> >> community:
> >>
> >>     http://www.open-mpi.org/mtt/
> >>
> >> We have weekday and weekend testing schedules.  M-Th we do nightly
> >> tests; F-Mon morning, we do a long weekend schedule.  This weekend,
> >> for example, we ran about 675k regression tests:
> >>
> >>     http://www.open-mpi.org/mtt/index.php?do_redir=875
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems
> >>
> >>
> >>
> >> ------------------------------
> >>
> >> Message: 2
> >> Date: Mon, 03 Nov 2008 12:59:59 -0800
> >> From: PattiMichelle <mic...@earthlink.net>
> >> Subject: [OMPI users] switch from mpich2 to openMPI <newbie question>
> >> To: us...@open-mpi.org, patti.sheaf...@aero.org
> >> Message-ID: <490f664f.4000...@earthlink.net>
> >> Content-Type: text/plain; charset="iso-8859-1"
> >>
> >> I just found out I need to switch from mpich2 to openMPI for some code
> >
> >> I'm running.  I noticed that it's available in an openSuSE repo (I'm
> >> using openSuSE 11.0 x86_64 on a TYAN 32-processor Opteron 8000
> >> system), but when I was using mpich2 I seemed to have better luck
> >> compiling it from source.  This is the line I used:
> >>
> >> # $ F77=/path/to/g95 F90=/path/to/g95 ./configure
> >> --prefix=/some/place/mpich2-install
> >>
> >> But usually I left the "--prefix=" off and just let it install to its
> >> default... which is /usr/local/bin and that's nice because it's
> >> already in the PATH and very usable.  I guess my question is whether
> >> or not the defaults and configuration syntax have stayed the same in
> >> openMPI.  I also could use a "quickstart" guide for a non-programming
> >> user (e.g., I think I have to start a daemon before running
> > parallelized programs).
> >>
> >> THANKS!!!
> >> PattiM.
> >> -------------- next part -------------- HTML attachment scrubbed and
> >> removed
> >>
> >> ------------------------------
> >>
> >> Message: 3
> >> Date: Mon, 3 Nov 2008 14:14:36 -0700
> >> From: Ralph Castain <r...@lanl.gov>
> >> Subject: Re: [OMPI users] users Digest, Vol 1055, Issue 2
> >> To: Open MPI Users <us...@open-mpi.org>
> >> Message-ID: <2fbdf4dc-b2df-4486-a644-0f18c96e8...@lanl.gov>
> >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>
> >> The problem is that you didn't specify or allocate any nodes for the
> >> job. At the least, you need to tell us what nodes to use via a
> > hostfile.
> >>
> >> Alternatively, are you using a resource manager to assign the nodes?
> >> OMPI didn't see anything from one, but it could be that we just didn't
> >
> >> see the right envar.
> >>
> >> Ralph
> >>
> >> On Nov 3, 2008, at 1:39 PM, Rima Chaudhuri wrote:
> >>
> >>> Thanks a lot Ralph!
> >>> I corrected the no_local to nolocal and now when I try to execute the
> >
> >>> script step1 (pls find it attached) [rchaud@helios amber10]$ ./step1
> >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Not
> >>> available in file ras_bjs.c at line 247
> >>> ---------------------------------------------------------------------
> >>> ----- There are no available nodes allocated to this job. This could
> >>> be because no nodes were found or all the available nodes were
> >>> already used.
> >>>
> >>> Note that since the -nolocal option was given no processes can be
> >>> launched on the local node.
> >>> ---------------------------------------------------------------------
> >>> ----- [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG:
> >>> Temporarily out of resource in file base/rmaps_base_support_fns.c at
> >>> line 168 [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG:
> >>> Temporarily out of resource in file rmaps_rr.c at line 402
> >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
> >>> out of resource in file base/rmaps_base_map_job.c at line 210
> >>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
> >>> out of resource in file rmgr_urm.c at line 372
> >>> [helios.structure.uic.edu:16335] mpirun: spawn failed with errno=-3
> >>>
> >>>
> >>>
> >>> If I use the script without the --nolocal option, I get the following
> >
> >>> error:
> >>> [helios.structure.uic.edu:20708] [0,0,0] ORTE_ERROR_LOG: Not
> >>> available in file ras_bjs.c at line 247
> >>>
> >>>
> >>> thanks,
> >>>
> >>>
> >>> On Mon, Nov 3, 2008 at 2:04 PM,  <users-requ...@open-mpi.org> wrote:
> >>>>
> >>>> Today's Topics:
> >>>>
> >>>>  1. Scyld Beowulf and openmpi (Rima Chaudhuri)  2. Re: Scyld Beowulf
> >
> >>>> and openmpi (Ralph Castain)  3. Problems installing in Cygwin -
> >>>> Problem with GCC 3.4.4
> >>>>     (Gustavo Seabra)
> >>>>  4. Re: MPI + Mixed language coding(Fortran90 + C++) (Jeff Squyres)
> >>>>  5. Re: Problems installing in Cygwin - Problem with GCC      3.4.4
> >>>>     (Jeff Squyres)
> >>>>
> >>>>
> >>>> --------------------------------------------------------------------
> >>>> --
> >>>>
> >>>> Message: 1
> >>>> Date: Mon, 3 Nov 2008 11:30:01 -0600
> >>>> From: "Rima Chaudhuri" <rima.chaudh...@gmail.com>
> >>>> Subject: [OMPI users] Scyld Beowulf and openmpi
> >>>> To: us...@open-mpi.org
> >>>> Message-ID:
> >>>>       <7503b17d0811030930i13acb974kc627983a1d481...@mail.gmail.com>
> >>>> Content-Type: text/plain; charset=ISO-8859-1
> >>>>
> >>>> Hello!
> >>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
> >>>> x86_64 Linux Scyld Beowulf cluster in order to make it run with the
> >>>> amber10 MD simulation package.
> >>>>
> >>>> The nodes can see the home directory i.e. a bpsh to the nodes works
> >>>> fine and lists all the files in the home directory where I have both
> >
> >>>> openmpi and amber10 installed.
> >>>> However if I try to run:
> >>>>
> >>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/ sander.MPI
> >>>> ........
> >>>>
> >>>> I get the following error:
> >>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
> >>>> --------------------------------------------------------------------
> >>>> ------ Failed to find the following executable:
> >>>>
> >>>> Host:       helios.structure.uic.edu
> >>>> Executable: -o
> >>>>
> >>>> Cannot continue.
> >>>> --------------------------------------------------------------------
> >>>> ------ [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not
> >>>> found in file rmgr_urm.c at line 462
> >>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with errno=-13
> >>>>
> >>>> any cues?
> >>>>
> >>>>
> >>>> --
> >>>> -Rima
> >>>>
> >>>>
> >>>> ------------------------------
> >>>>
> >>>> Message: 2
> >>>> Date: Mon, 3 Nov 2008 12:08:36 -0700
> >>>> From: Ralph Castain <r...@lanl.gov>
> >>>> Subject: Re: [OMPI users] Scyld Beowulf and openmpi
> >>>> To: Open MPI Users <us...@open-mpi.org>
> >>>> Message-ID: <91044a7e-ada5-4b94-aa11-b3c1d9843...@lanl.gov>
> >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>>>
> >>>> For starters, there is no "-no_local" option to mpirun. You might
> >>>> want to look at mpirun --help, or man mpirun.
> >>>>
> >>>> I suspect the option you wanted was --nolocal. Note that --nolocal
> >>>> does not take an argument.
> >>>>
> >>>> Mpirun is confused by the incorrect option and looking for an
> >>>> incorrectly named executable.
> >>>> Ralph
> >>>>
> >>>>
> >>>> On Nov 3, 2008, at 10:30 AM, Rima Chaudhuri wrote:
> >>>>
> >>>>> Hello!
> >>>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
> >>>>> x86_64 linux scyld beowulf cluster inorder to make it run with
> >>>>> amber10 MD simulation package.
> >>>>>
> >>>>> The nodes can see the home directory i.e. a bpsh to the nodes works
> >
> >>>>> fine and lists all the files in the home directory where I have
> >>>>> both openmpi and amber10 installed.
> >>>>> However if I try to run:
> >>>>>
> >>>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/ sander.MPI
> >>>>> ........
> >>>>>
> >>>>> I get the following error:
> >>>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
> >>>>> -------------------------------------------------------------------
> >>>>> ------- Failed to find the following executable:
> >>>>>
> >>>>> Host:       helios.structure.uic.edu
> >>>>> Executable: -o
> >>>>>
> >>>>> Cannot continue.
> >>>>> -------------------------------------------------------------------
> >>>>> ------- [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG:
> >>>>> Not found in file rmgr_urm.c at line 462
> >>>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with
> >>>>> errno=-13
> >>>>>
> >>>>> any cues?
> >>>>>
> >>>>>
> >>>>> --
> >>>>> -Rima
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> us...@open-mpi.org
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>
> >>>>
> >>>>
> >>>> ------------------------------
> >>>>
> >>>> Message: 3
> >>>> Date: Mon, 3 Nov 2008 14:53:55 -0500
> >>>> From: "Gustavo Seabra" <gustavo.sea...@gmail.com>
> >>>> Subject: [OMPI users] Problems installing in Cygwin - Problem with
> >>>> GCC
> >>>>       3.4.4
> >>>> To: "Open MPI Users" <us...@open-mpi.org>
> >>>> Message-ID:
> >>>>       <f79359b60811031153l5591e0f8j49a7e4d9fb02e...@mail.gmail.com>
> >>>> Content-Type: text/plain; charset=ISO-8859-1
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> Here's a "progress report"... more questions in the end :-)
> >>>>
> >>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the
> >>>> following configure command:
> >>>>
> >>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
> >>>>               --with-mpi-param_check=always --with-threads=posix \
> >>>>               --enable-mpi-threads --disable-io-romio \
> >>>>               --enable-mca-no-
> >>>> build=memory_mallopt,maffinity,paffinity \
> >>>>               --enable-contrib-no-build=vt \
> >>>>               FC=g95 'FFLAGS=-O0  -fno-second-underscore' CXX=g++
> >>>>
> >>>> I then had a very weird error during compilation of
> >>>> ompi/tools/ompi_info/params.cc. (See below).
> >>>>
> >>>> The lines causing the compilation errors are:
> >>>>
> >>>> vector.tcc:307:      const size_type __len = __old_size +
> >>>> std::max(__old_size, __n);
> >>>> vector.tcc:384:      const size_type __len = __old_size +
> >>>> std::max(__old_size, __n);
> >>>> stl_bvector.h:522:  const size_type __len = size() +
> >>>> std::max(size(), __n);
> >>>> stl_bvector.h:823:  const size_type __len = size() +
> >>>> std::max(size(), __n);
> >>>>
> >>>> (Notice that those are from the standard gcc libraries.)
> >>>>
> >>>> After googling it for a while, I found that this error is
> >>>> caused because, at some point, the source code being compiled
> >>>> redefines the "max" function with a macro, so g++ cannot recognize the
> >>>> "std::max" that happens in those lines and only "sees" a (...), thus
> >>>> printing that cryptic complaint.
> >>>>
> >>>> I looked in some places in the OpenMPI code, but I couldn't find
> >>>> "max" being redefined anywhere, but I may be looking in the wrong
> >>>> places. Anyway, the only way I found of compiling OpenMPI was a
> >>>> very ugly hack: I had to go into those files and remove the "std::"
> >>>> before the "max". With that, it all compiled cleanly.
> >>>>
> >>>> I did try running the tests in the 'tests' directory (with 'make
> >>>> check'), and I didn't get any alarming message, except that in some
> >>>> cases (class, threads, peruse) it printed "All 0 tests passed". I
> >>>> got an "All (n) tests passed" (n>0) for asm and datatype.
> >>>>
> >>>> Can anybody comment on the meaning of those test results? Should I
> >>>> be alarmed with the "All 0 tests passed" messages?
> >>>>
> >>>> Finally, in the absence of big red flags (that I noticed), I went
> >>>> ahead and tried to compile my program. However, as soon as
> >>>> compilation starts, I get the following:
> >>>>
> >>>> /local/openmpi/openmpi-1.3b1/bin/mpif90 -c -O3  -fno-second-
> >>>> underscore -ffree-form  -o constants.o _constants.f
> >>>> --------------------------------------------------------------------
> >>>> ------ Unfortunately, this installation of Open MPI was not compiled
> >
> >>>> with Fortran 90 support.  As such, the mpif90 compiler is
> >>>> non-functional.
> >>>> --------------------------------------------------------------------
> >>>> ------
> >>>> make[1]: *** [constants.o] Error 1
> >>>> make[1]: Leaving directory `/home/seabra/local/amber11/src/sander'
> >>>> make: *** [parallel] Error 2
> >>>>
> >>>> Notice that I compiled OpenMPI with g95, so there *should* be
> >>>> Fortran95 support... Any ideas on what could be going wrong?
> >>>>
> >>>> Thank you very much,
> >>>> Gustavo.
> >>>>
> >>>> ======================================
> >>>> Error in the compilation of params.cc
> >>>> ======================================
> >>>> $ g++ -DHAVE_CONFIG_H -I. -I../../../opal/include
> >>>> -I../../../orte/include -I../../../ompi/include
> >>>> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa
> >>>> -DOMPI_CONFIGURE_USER="\"seabra\"" -DOMPI_CONFIGURE_HOST="\"ACS02\""
> >>>> -DOMPI_CONFIGURE_DATE="\"Sat Nov  1 20:44:32 EDT 2008\""
> >>>> -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\""
> >>>> -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O3 -DNDEBUG
> >>>> -finline-functions -fno-strict-aliasing \""
> >>>> -DOMPI_BUILD_CPPFLAGS="\"-I../../..  -D_REENTRANT\""
> >>>> -DOMPI_BUILD_CXXFLAGS="\"-O3 -DNDEBUG -finline-functions \""
> >>>> -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../..  -D_REENTRANT\""
> >>>> -DOMPI_BUILD_FFLAGS="\"-O0  -fno-second-underscore\""
> >>>> -DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\"-export-dynamic
> >>>> \"" -DOMPI_BUILD_LIBS="\"-lutil  \""
> >>>> -DOMPI_CC_ABSOLUTE="\"/usr/bin/gcc\""
> >>>> -DOMPI_CXX_ABSOLUTE="\"/usr/bin/g++\""
> >>>> -DOMPI_F77_ABSOLUTE="\"/usr/bin/g77\""
> >>>> -DOMPI_F90_ABSOLUTE="\"/usr/local/bin/g95\""
> >>>> -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../..  -D_REENTRANT  -O3
> >>>> -DNDEBUG -finline-functions  -MT param.o -MD -MP -MF $depbase.Tpo -c
> >
> >>>> -o param.o param.cc In file included from
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/
> >>>> vector:72,
> >>>>                from ../../../ompi/tools/ompi_info/ompi_info.h:24,
> >>>>                from param.cc:43:
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In
> >
> >>>> member function `void std::vector<bool,
> >>>> _Alloc>::_M_insert_range(std::_Bit_iterator, _ForwardIterator,
> >>>> _ForwardIterator, std::forward_iterator_tag)':
> >>>>
> > /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:522:
> >>>> error: expected unqualified-id before '(' token
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In
> >
> >>>> member function `void std::vector<bool,
> >>>> _Alloc>::_M_fill_insert(std::_Bit_iterator, size_t, bool)':
> >>>>
> > /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:823:
> >>>> error: expected unqualified-id before '(' token In file included
> >>>> from /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/
> >>>> vector:75,
> >>>>                from ../../../ompi/tools/ompi_info/ompi_info.h:24,
> >>>>                from param.cc:43:
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In
> >>>> member function `void std::vector<_Tp,
> >>>> _Alloc>::_M_fill_insert(__gnu_cxx::__normal_iterator<typename
> >>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, size_t, const _Tp&)':
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:307:
> >>>> error: expected unqualified-id before '(' token
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In
> >>>> member function `void std::vector<_Tp,
> >>>> _Alloc>::_M_range_insert(__gnu_cxx::__normal_iterator<typename
> >>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, _ForwardIterator,
> >>>> _ForwardIterator, std::forward_iterator_tag)':
> >>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:384:
> >>>> error: expected unqualified-id before '(' token
> >>>>
> >>>>
> >>>> --
> >>>> Gustavo Seabra
> >>>> Postdoctoral Associate
> >>>> Quantum Theory Project - University of Florida Gainesville - Florida
> >
> >>>> - USA
> >>>>
> >>>>
> >>>> ------------------------------
> >>>>
> >>>> Message: 4
> >>>> Date: Mon, 3 Nov 2008 14:54:25 -0500
> >>>> From: Jeff Squyres <jsquy...@cisco.com>
> >>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C+
> >>>> +)
> >>>> To: Open MPI Users <us...@open-mpi.org>
> >>>> Message-ID: <45698801-0857-466f-a19d-c529f72d4...@cisco.com>
> >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>>>
> >>>> Can you replicate the scenario in smaller / different cases?
> >>>>
> >>>> - write a sample plugin in C instead of C++
> >>>> - write a non-MPI Fortran application that loads your C++
> >>>> application
> >>>> - ...?
> >>>>
> >>>> In short, *MPI* shouldn't be interfering with Fortran/C++ common
> >>>> blocks.  Try taking MPI out of the picture and see if that makes the
> >
> >>>> problem go away.
> >>>>
> >>>> Those are pretty much shots in the dark, but I don't know where to
> >>>> go, either -- try random things until you find what you want.
> >>>>
> >>>>
> >>>> On Nov 3, 2008, at 3:51 AM, Rajesh Ramaya wrote:
> >>>>
> >>>>> Hello Jeff, Gustavo, Mi,
> >>>>>   Thanks for the advice. I am familiar with the difference in the
> >>>>> compiler code generation for C, C++ & FORTRAN. I even tried to look
> >
> >>>>> at some of the common block symbols. The name of the symbol remains
> >
> >>>>> the same. The only difference that I observe is in FORTRAN compiled
> >
> >>>>> *.o  0000000000515bc0 B aux7loc_  and the C++ compiled code  U
> >>>>> aux7loc_  the memory is not allocated as it has been declared as
> >>>>> extern in C++. When the executable loads the shared library it
> >>>>> finds all the undefined symbols. Atleast if it did not manage to
> >>>>> find a single symbol it prints undefined symbol error.
> >>>>> I am completely stuck up and do not know how to continue further.
> >>>>>
> >>>>> Thanks,
> >>>>> Rajesh
> >>>>>
> >>>>> From: users-boun...@open-mpi.org
> >>>>> [mailto:users-boun...@open-mpi.org]
> >>>>> On Behalf Of Mi Yan
> >>>>> Sent: samedi 1 novembre 2008 23:26
> >>>>> To: Open MPI Users
> >>>>> Cc: 'Open MPI Users'; users-boun...@open-mpi.org
> >>>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C
> >>>>> ++)
> >>>>>
> >>>>> So your tests show:
> >>>>> 1. "Shared library in FORTRAN + MPI executable in FORTRAN" works.
> >>>>> 2. "Shared library in C++ + MPI executable in FORTRAN " does not
> >>>>> work.
> >>>>>
> >>>>> It seems to me that the symbols in the C library are not really
> >>>>> recognized by the FORTRAN executable as you thought. What compilers did
> >>>>> you use to build OpenMPI?
> >>>>>
> >>>>> Different compiler has different convention to handle symbols. E.g.
> >>>>> if there is a variable "var_foo" in your FORTRAN code, some FORTRN
> >>>>> compiler will save "var_foo_" in the object file by default; if you
> >
> >>>>> want to access "var_foo" in C code, you actually need to refer
> >>>>> "var_foo_" in C code. If you define "var_foo" in a module in the
> >>>>> FORTRAN compiler, some FORTRAN compilers may append the module name
> >>>>> to "var_foo".
> >>>>> So I suggest checking the symbols in the object files generated by
> >>>>> your FORTRAN and C compilers to see the difference.
> >>>>>
> >>>>> Mi
> >>>>> "Rajesh Ramaya" <rajesh.ram...@e-xstream.com>, sent by users-boun...@open-mpi.org
> >>>>> 10/31/2008 03:07 PM
> >>>>> Please respond to: Open MPI Users <us...@open-mpi.org>
> >>>>> To: "'Open MPI Users'" <us...@open-mpi.org>, "'Jeff Squyres'" <jsquy...@cisco.com>
> >>>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)
> >>>>>
> >>>>> Hello Jeff Squyres,
> >>>>>  Thank you very much for the immediate reply. I am able to
> >>>>> successfully access the data from the common block but the values
> >>>>> are zero. In my algorithm I even update a common block but the
> >>>>> update made by the shared library is not taken in to account by the
> >
> >>>>> executable. Can you please be very specific how to make the
> >>>>> parallel algorithm aware of the data?
> >>>>> Actually I am
> >>>>> not writing any MPI code inside? It's the executable (third party
> >>>>> software)
> >>>>> who does that part. All that I am doing is to compile my code with the
> >>>>> MPI C compiler and add it to LD_LIBRARY_PATH.
> >>>>> In fact I did a simple test by creating a shared library using a
> >>>>> FORTRAN code and the update made to the common block is taken in to
> >
> >>>>> account by the executable. Is there any flag or pragma that need to
> >
> >>>>> be activated for mixed language MPI?
> >>>>> Thank you once again for the reply.
> >>>>>
> >>>>> Rajesh
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: users-boun...@open-mpi.org
> >>>>> [mailto:users-boun...@open-mpi.org]
> >>>>> On
> >>>>> Behalf Of Jeff Squyres
> >>>>> Sent: vendredi 31 octobre 2008 18:53
> >>>>> To: Open MPI Users
> >>>>> Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C
> >>>>> ++)
> >>>>>
> >>>>> On Oct 31, 2008, at 11:57 AM, Rajesh Ramaya wrote:
> >>>>>
> >>>>>>    I am completely new to MPI. I have a basic question concerning
> >>>>>> MPI and mixed language coding. I hope any of you could help me
> > out.
> >>>>>> Is it possible to access FORTRAN common blocks in C++ in a MPI
> >>>>>> compiled code. It works without MPI but as soon I switch to MPI
> >>>>>> the access of common block does not work anymore.
> >>>>>> I have a Linux MPI executable which loads a shared library at
> >>>>>> runtime and resolves all undefined symbols etc  The shared library
> >
> >>>>>> is written in C++ and the MPI executable in written in FORTRAN.
> >>>>>> Some
> >>>>>> of the input that the shared library looking for are in the
> >>>>>> Fortran common blocks. As I access those common blocks during
> >>>>>> runtime the values are not  initialized.  I would like to know if
> >>>>>> what I am doing is possible? I hope that my problem is clear...
> >>>>>
> >>>>>
> >>>>> Generally, MPI should not get in the way of sharing common blocks
> >>>>> between Fortran and C/C++.  Indeed, in Open MPI itself, we share a
> >>>>> few common blocks between Fortran and the main C Open MPI
> >>>>> implementation.
> >>>>>
> >>>>> What is the exact symptom that you are seeing?  Is the application
> >>>>> failing to resolve symbols at run-time, possibly indicating that
> >>>>> something hasn't instantiated a common block?  Or are you able to
> >>>>> successfully access the data from the common block, but it doesn't
> >>>>> have the values you expect (e.g., perhaps you're seeing all zeros)?
> >>>>>
> >>>>> If the former, you might want to check your build procedure.  You
> >>>>> *should* be able to simply replace your C++ / F90 compilers with
> >>>>> mpicxx and mpif90, respectively, and be able to build an MPI
> >>>>> version of your app.  If the latter, you might need to make your
> >>>>> parallel algorithm aware of what data is available in which MPI
> >>>>> process -- perhaps not all the data is filled in on each MPI
> > process...?
> >>>>>
> >>>>> --
> >>>>> Jeff Squyres
> >>>>> Cisco Systems
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> us...@open-mpi.org
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> us...@open-mpi.org
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> us...@open-mpi.org
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>
> >>>>
> >>>> --
> >>>> Jeff Squyres
> >>>> Cisco Systems
> >>>>
> >>>>
> >>>>
> >>>> ------------------------------
> >>>>
> >>>> Message: 5
> >>>> Date: Mon, 3 Nov 2008 15:04:47 -0500
> >>>> From: Jeff Squyres <jsquy...@cisco.com>
> >>>> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem
> >>>> with
> >>>>       GCC     3.4.4
> >>>> To: Open MPI Users <us...@open-mpi.org>
> >>>> Message-ID: <8e364b51-6726-4533-ade2-aea266380...@cisco.com>
> >>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> >>>>
> >>>> On Nov 3, 2008, at 2:53 PM, Gustavo Seabra wrote:
> >>>>
> >>>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the
> >
> >>>>> following configure command:
> >>>>>
> >>>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
> >>>>>               --with-mpi-param_check=always --with-threads=posix \
> >>>>>               --enable-mpi-threads --disable-io-romio \
> >>>>>               --enable-mca-no-
> >>>>> build=memory_mallopt,maffinity,paffinity \
> >>>>>               --enable-contrib-no-build=vt \
> >>>>>               FC=g95 'FFLAGS=-O0  -fno-second-underscore' CXX=g++
> >>>>
> >>>> For your fortran issue, the Fortran 90 interface needs the Fortran
> >>>> 77 interface.  So you need to supply an F77 as well (the output from
> >
> >>>> configure should indicate that the F90 interface was disabled
> >>>> because the F77 interface was disabled).
> >>>>
> >>>>> I then had a very weird error during compilation of
> >>>>> ompi/tools/ompi_info/params.cc. (See below).
> >>>>>
> >>>>> The lines causing the compilation errors are:
> >>>>>
> >>>>> vector.tcc:307:      const size_type __len = __old_size +
> >>>>> std::max(__old_size, __n);
> >>>>> vector.tcc:384:      const size_type __len = __old_size +
> >>>>> std::max(__old_size, __n);
> >>>>> stl_bvector.h:522:  const size_type __len = size() +
> >>>>> std::max(size(), __n);
> >>>>> stl_bvector.h:823:  const size_type __len = size() +
> >>>>> std::max(size(), __n);
> >>>>>
> >>>>> (Notice that those are from the standard gcc libraries.)
> >>>>>
> >>>>> After googling it for a while, I could find that this error is
> >>>>> caused because, at come point, the source code being compiled
> >>>>> redefined the "max" function with a macro, g++ cannot recognize the
> >
> >>>>> "std::max"
> >>>>> that
> >>>>> happens in those lines and only "sees" a (...), thus printing that
> >>>>> cryptic complaint.
> >>>>>
> >>>>> I looked in some places in the OpenMPI code, but I couldn't find
> >>>>> "max" being redefined anywhere, but I may be looking in the wrong
> >>>>> places. Anyways, the only way of found of compiling OpenMPI was a
> >>>>> very ugly hack: I have to go into those files and remove the
> >>>>> "std::"
> >>>>> before
> >>>>> the "max". With that, it all compiled cleanly.
> >>>>
> >>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use
> >>>> std::max.  What areas did you find that you needed to change?
> >>>>
> >>>>> I did try running the tests in the 'tests' directory (with 'make
> >>>>> check'), and I didn't get any alarming message, except that in some
> >
> >>>>> cases (class, threads, peruse) it printed "All 0 tests passed". I
> >>>>> got and "All (n) tests passed" (n>0) for asm and datatype.
> >>>>>
> >>>>> Can anybody comment on the meaning of those test results? Should I
> >>>>> be alarmed with the "All 0 tests passed" messages?
> >>>>
> >>>> No.  We don't really maintain the "make check" stuff too well.
> >>>>
> >>>> --
> >>>> Jeff Squyres
> >>>> Cisco Systems
> >>>>
> >>>>
> >>>>
> >>>> ------------------------------
> >>>>
> >>>> _______________________________________________
> >>>> users mailing list
> >>>> us...@open-mpi.org
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>
> >>>> End of users Digest, Vol 1055, Issue 2
> >>>> **************************************
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> -Rima
> >>> <step1>_______________________________________________
> >>> users mailing list
> >>> us...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >>
> >>
> >> ------------------------------
> >>
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >> End of users Digest, Vol 1055, Issue 4
> >> **************************************
> >>
> >
> >
> >
> > --
> > -Rima
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> >
> > ------------------------------
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > End of users Digest, Vol 1057, Issue 3
> > **************************************
> >
> 
> 
> 
> -- 
> -Rima
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 

Dr. Daniel Gruner                        dgru...@chem.utoronto.ca
Dept. of Chemistry                       daniel.gru...@utoronto.ca
University of Toronto                    phone:  (416)-978-8689
80 St. George Street                     fax:    (416)-978-5325
Toronto, ON  M5S 3H6, Canada             finger for PGP public key
