Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-19 Thread Caird, Andrew J

Glad to hear that worked for you.

Full credit goes to Brock Palen who told me about this.  It turns out we also 
have a user who wanted to do that.  And meta-credit goes to the OMPI developers 
for making a consistent and flexible set of MPI tools and libraries.
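
For the archive, here is a minimal sketch of how the workaround quoted below might look inside a Torque job script. The install prefix and the "-mca pls ^tm" flag come from Pat's message; the MY_BINARY variable, the function names, and counting the lines of $PBS_NODEFILE to get the process count are my assumptions, not something anyone in this thread posted.

```shell
#!/bin/sh
# Sketch only: wraps the mpirun invocation reported in this thread.
# MPIRUN matches the path from Pat's message; MY_BINARY is a hypothetical
# placeholder for the application to launch.
MPIRUN=${MPIRUN:-/opt/openmpi-1.2.4/bin/mpirun}
MY_BINARY=${MY_BINARY:-./my_app}

# Count the slots Torque assigned (assumes one line per slot in the node file).
count_slots() {
    echo $(( $(wc -l < "$1") ))
}

# Print the command line that excludes the TM launcher ("-mca pls ^tm"),
# which is what let -hostfile work for Pat under Open MPI 1.2.4.
# $1 = process count, $2 = hostfile, $3 = binary
build_mpirun_cmd() {
    printf '%s -mca pls ^tm -np %s -hostfile %s %s\n' "$MPIRUN" "$1" "$2" "$3"
}

# Only launch when running inside a Torque job (PBS_NODEFILE is set).
if [ -n "${PBS_NODEFILE:-}" ]; then
    NP=$(count_slots "$PBS_NODEFILE")
    echo "running: $(build_mpirun_cmd "$NP" "$PBS_NODEFILE" "$MY_BINARY")"
    "$MPIRUN" -mca pls ^tm -np "$NP" -hostfile "$PBS_NODEFILE" "$MY_BINARY"
fi
```

Outside of a Torque job the script does nothing, so it is safe to source it just to inspect what build_mpirun_cmd would print.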

--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of 
> pat.o'bry...@exxonmobil.com
> Sent: Wednesday, December 19, 2007 9:37 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> 
> Andrew,
>  That worked like a champ. Now my users can have it both ways. For the
> record, my control statements looked like the following:
> 
> /opt/openmpi-1.2.4/bin/mpirun -mca pls ^tm -np $NP -hostfile $PBS_NODEFILE $my_binary_path
> 
> My job works just fine and reports no errors. This version of OpenMPI was
> built with "--with-tm=/usr/local/pbs".
> 
>   Thanks for your help,
>Pat
> 
> 
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
> 
> 
> 
> From: "Caird, Andrew J" <acaird@umich.edu>
> Sent by: users-bounces@open-mpi.org
> Date: 12/19/07 07:59 AM
> Please respond to: Open MPI Users <users@open-mpi.org>
> To: "Open MPI Users" <us...@open-mpi.org>
> cc: <users-boun...@open-mpi.org>
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> 
> 
> 
> 
> oops, I meant -mca, not -mcs
> 
> 
> > -----Original Message-----
> > From: users-boun...@open-mpi.org
> > [mailto:users-boun...@open-mpi.org] On Behalf Of Caird, Andrew J
> > Sent: Wednesday, December 19, 2007 8:57 AM
> > To: Open MPI Users
> > Cc: users-boun...@open-mpi.org
> > Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> >
> > Does OMPI built with TM but run with:
> >-mcs pls ^tm
> >
> > give the same effect?
> >
> > --andy
> >
> >
> > > -----Original Message-----
> > > From: users-boun...@open-mpi.org
> > > [mailto:users-boun...@open-mpi.org] On Behalf Of
> > > pat.o'bry...@exxonmobil.com
> > > Sent: Wednesday, December 19, 2007 8:47 AM
> > > To: Open MPI Users
> > > Cc: Open MPI Users; users-boun...@open-mpi.org
> > > Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> > >
> > > Terry,
> > > Your suggestion worked. So long as I specifically state "--without-tm",
> > > the OpenMPI 1.2.4 build allows the use of "-hostfile". Apparently, by
> > > default, OpenMPI 1.2.4 will incorporate Torque if it exists, so it is
> > > necessary to specifically request "no Torque support".  I used the normal
> > > Torque processes to submit the job and specified "-hostfile $PBS_NODEFILE".
> > > Everything worked.
> > >Thanks for your help,
> > > Pat
> > >
> > > J.W. (Pat) O'Bryant,Jr.

Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-19 Thread Caird, Andrew J
oops, I meant -mca, not -mcs


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Caird, Andrew J
> Sent: Wednesday, December 19, 2007 8:57 AM
> To: Open MPI Users
> Cc: users-boun...@open-mpi.org
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> 
> Does OMPI built with TM but run with:
>-mcs pls ^tm
> 
> give the same effect?
> 
> --andy
>   
> 
> > -----Original Message-----
> > From: users-boun...@open-mpi.org 
> > [mailto:users-boun...@open-mpi.org] On Behalf Of 
> > pat.o'bry...@exxonmobil.com
> > Sent: Wednesday, December 19, 2007 8:47 AM
> > To: Open MPI Users
> > Cc: Open MPI Users; users-boun...@open-mpi.org
> > Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> > 
> > Terry,
> > Your suggestion worked. So long as I specifically state "--without-tm",
> > the OpenMPI 1.2.4 build allows the use of "-hostfile". Apparently, by
> > default, OpenMPI 1.2.4 will incorporate Torque if it exists, so it is
> > necessary to specifically request "no Torque support".  I used the normal
> > Torque processes to submit the job and specified "-hostfile $PBS_NODEFILE".
> > Everything worked.
> >Thanks for your help,
> > Pat
> > 
> > J.W. (Pat) O'Bryant,Jr.
> > Business Line Infrastructure
> > Technical Systems, HPC
> > Office: 713-431-7022
> > 
> > 
> > 
> > From: Terry Frankcombe <te...@chem.gu.se>
> > Sent by: users-bounces@open-mpi.org
> > Date: 12/18/07 01:45 PM
> > Please respond to: Open MPI Users <users@open-mpi.org>
> > To: Open MPI Users <us...@open-mpi.org>
> > cc:
> > Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> > 
> > 
> > 
> > 
> > On Tue, 2007-12-18 at 11:59 -0700, Ralph H Castain wrote:
> > > Hate to be a party-pooper, but the answer is "no" in OpenMPI 1.2. We don't
> > > allow the use of a hostfile in a Torque environment in that version.
> > >
> > > We have changed this for v1.3, but you'll have to wait for that release.
> > 
> > 
> > Can one not build OpenMPI without tm support and spawn remote jobs using
> > the other mechanisms, using only $PBS_NODEFILE (or a derivative of the
> > file that that points to) in the script?
> > 
> > Ciao
> > Terry
> > 
> > 
> > --
> > Dr Terry Frankcombe
> > Physical Chemistry, Department of Chemistry
> > Göteborgs Universitet
> > SE-412 96 Göteborg Sweden
> > Ph: +46 76 224 0887   Skype: terry.frankcombe
> > <te...@chem.gu.se>
> > 
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > 



Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-19 Thread Caird, Andrew J
Does OMPI built with TM but run with:
   -mcs pls ^tm

give the same effect?

--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of 
> pat.o'bry...@exxonmobil.com
> Sent: Wednesday, December 19, 2007 8:47 AM
> To: Open MPI Users
> Cc: Open MPI Users; users-boun...@open-mpi.org
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> 
> Terry,
> Your suggestion worked. So long as I specifically state "--without-tm",
> the OpenMPI 1.2.4 build allows the use of "-hostfile". Apparently, by
> default, OpenMPI 1.2.4 will incorporate Torque if it exists, so it is
> necessary to specifically request "no Torque support".  I used the normal
> Torque processes to submit the job and specified "-hostfile $PBS_NODEFILE".
> Everything worked.
>Thanks for your help,
> Pat
> 
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
> 
> 
> 
> From: Terry Frankcombe <te...@chem.gu.se>
> Sent by: users-bounces@open-mpi.org
> Date: 12/18/07 01:45 PM
> Please respond to: Open MPI Users <users@open-mpi.org>
> To: Open MPI Users <us...@open-mpi.org>
> cc:
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> 
> 
> 
> 
> On Tue, 2007-12-18 at 11:59 -0700, Ralph H Castain wrote:
> > Hate to be a party-pooper, but the answer is "no" in OpenMPI 1.2. We don't
> > allow the use of a hostfile in a Torque environment in that version.
> >
> > We have changed this for v1.3, but you'll have to wait for that release.
> 
> 
> Can one not build OpenMPI without tm support and spawn remote jobs using
> the other mechanisms, using only $PBS_NODEFILE (or a derivative of the
> file that that points to) in the script?
> 
> Ciao
> Terry
> 
> 
> --
> Dr Terry Frankcombe
> Physical Chemistry, Department of Chemistry
> Göteborgs Universitet
> SE-412 96 Göteborg Sweden
> Ph: +46 76 224 0887   Skype: terry.frankcombe
> 



Re: [OMPI users] Fwd: R npRmpi

2007-12-18 Thread Caird, Andrew J

Dr. Yu sent me a version of this intended for OpenMPI back in September.
I was just today getting around to trying it, although I noticed that it
doesn't work with R v2.6, so my plans just changed a little.

If Dr. Yu gives permission, I'll send to you what he sent to me, or
perhaps he'll post it to this list.

--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Randy Heiland
> Sent: Tuesday, December 18, 2007 4:08 PM
> To: us...@open-mpi.org
> Cc: hpa-ad...@iu.edu
> Subject: [OMPI users] Fwd: R npRmpi
> 
> The pkg in question is here:  http://www.stats.uwo.ca/faculty/yu/Rmpi/
> 
> The question is:  has anyone on this list got OpenMPI working 
> for this pkg?  Any suggestions?
> 
> thanks, Randy
> 
> 
> 
> 
> Begin forwarded message:
> 
> 
>   
>   
>   Subject: R npRmpi
> 
>   Been looking into the npRmpi problem
> 
>   I can get a segfault executing
> 
>   mpi.spawn.Rslaves()
> 
> 
>   The documentation .html files under npRmpi contains the 
> following:
> 
>   "mpi.spawn.Rslaves to spawn R slaves on selected hosts. This is
>   a LAM-MPI specific function."
> 
> 
>   lamhosts()
> 
>   sh: lamnodes: command not found
> 
>   The documentation for nearly all mpi.xxx.xxx calls send you to
>   www.lam-mpi.org for more information.
> 
>   Looks for all the world this package depends on LAM-MPI which
>   is not installed on Quarry. I don't think pointing the build
>   at an OpenMPI install will help. The .c sources will compile
>   just fine but when R goes to use them it refers to LAM-MPI
>   dependent functions and behaves  badly.
> 
> 
> 
> 



Re: [OMPI users] using both 64 and 32 bit mpi

2006-09-28 Thread Caird, Andrew J

Glenn,

If you're careful with $PATH and $LD_LIBRARY_PATH you can certainly do
this.  One thing that makes this a little easier is the 'modules'
package (http://modules.sourceforge.net/).  We use this to maintain 9
versions of OpenMPI for various reasons, along with 5 versions of LAM
and a version of MPICH.  It does a nice job of keeping everything
separate and our users have grasped it pretty quickly.
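
If you'd rather not install modules, the same idea can be sketched by hand. This is only an illustration: the install prefixes are made up, the function name is mine, and modules does more than this does (such as cleanly unloading a version again).

```shell
#!/bin/sh
# Put one MPI install at the front of PATH and LD_LIBRARY_PATH.
# $1 = install prefix (hypothetical examples: /opt/openmpi-1.1-32,
# /opt/openmpi-1.1-64). Assumes libraries live in $prefix/lib; some
# 64-bit installs use $prefix/lib64 instead.
use_mpi() {
    prefix=$1
    PATH="$prefix/bin${PATH:+:$PATH}"
    LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}

# Example, in the shell or job script that runs the 32-bit program:
#   use_mpi /opt/openmpi-1.1-32
#   which mpirun    # should now point into the 32-bit install
```

Call it in a fresh shell per job so the 32-bit and 64-bit settings never end up stacked in the same environment.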

Good luck.
--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Glenn Johnson
> Sent: Thursday, September 28, 2006 3:04 PM
> To: us...@open-mpi.org
> Subject: [OMPI users] using both 64 and 32 bit mpi
> 
> I have an 8-way AMD64 system. I built a 64 bit open-mpi-1.1 
> implementation and then compiled software to use it. That all 
> works fine.
> 
> In addition, I have a 32 bit binary program (Schrodinger 
> Jaguar) that I would like to run on this machine with mpi. 
> Schrodinger provides source code to build an mpi 
> compatibility layer. This compatibility layer allows jaguar 
> to use a different mpi implementation than that which the 
> software was compiled with. I do not want to give up the 64 
> bit open-mpi that I already have and am using.
> 
> So my questions are:
>  1. Can I build/install a 32 bit version of open-mpi even though I
> already have a 64 bit version installed?
>  2. What "tricks" might I need to do to make sure a program calls
> the correct version of mpi (32 or 64 bit)?
>  3. Would I do better considering running jaguar in a 32 
> bit chroot
> environment?
> 
> Thanks.
> 
> --
> Glenn Johnson  USDA, ARS, SRRC



Re: [OMPI users] Jumbo frames

2006-08-25 Thread Caird, Andrew J
Massimiliano,

It should work automatically, but I have seen instances where switches
or Ethernet cards can't support the full 9000 bytes per frame, and we've
had to go as low as 6000 bytes to get consistent performance.  It seems
like everyone's interpretation of what the 9000 bytes is for is a little
different.

Does it work with the default 1500-byte setting?  You might try
increasing in smaller steps to see where it stops working.
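
One way to do the stepping, assuming Linux with the iputils ping (where "-M do" sets don't-fragment) and IPv4, is to ping a neighboring node with payloads sized for each candidate MTU. The host name and the list of MTU values below are placeholders, and this only probes the path between two nodes; the interface MTU on both ends still has to be at least the value being probed.

```shell
#!/bin/sh
# IPv4 header (20 bytes) + ICMP header (8 bytes) = 28 bytes of overhead,
# so the largest unfragmented ping payload for a given MTU is MTU - 28.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Probe a host with don't-fragment pings at several frame sizes.
# $1 = host; remaining arguments = candidate MTUs.
probe_mtus() {
    host=$1
    shift
    for mtu in "$@"; do
        if ping -M do -c 3 -s "$(icmp_payload "$mtu")" "$host" >/dev/null 2>&1; then
            echo "MTU $mtu: ok"
        else
            echo "MTU $mtu: FAILED"
        fi
    done
}

# Example (hypothetical host name):
#   probe_mtus compute-0-1 1500 3000 4500 6000 7500 9000
```

The first size that reports FAILED gives a rough upper bound on what the cards and switch will actually carry.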

Good luck.
--andrew


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Massimiliano Fatica
> Sent: Friday, August 25, 2006 1:30 AM
> To: us...@open-mpi.org
> Subject: [OMPI users] Jumbo frames
> 
> Hi,
> I am trying to use Jumbo frames but mpirun will not start the job.
> I am using OpenMPI v1.1 shipped with the latest Rocks (4.2).
> Ifconfig is reporting that all the NIC on the cluster are 
> using an MTU of 9000 and the switch (HP Procurve) should be 
> able to use Jumbo frames.
> 
> Is there any special flag I need to pass to mpirun or a 
> configuration file I need to edit?
> 
> Thanks
> Massimiliano



Re: [OMPI users] TM fixes on trunk

2006-07-17 Thread Caird, Andrew J
That's excellent, thanks.

--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres 
> (jsquyres)
> Sent: Monday, July 17, 2006 2:08 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] TM fixes on trunk
> 
> For lack of a longer explanation, let's call it "internal 
> accounting errors" :-).  In an attempt to speed up the TM 
> launcher, we made some changes in Open MPI 1.1 which ended up 
> using the TM API the wrong way.
> So it was clearly a bug.  It *might* work in 1.1, but I 
> wouldn't recommend it (i.e., it's a timing issue -- sometimes 
> it might work, sometimes it might not).
> 
> More specifically -- if it's working for you, then it will 
> probably continue to work for you.  
> 
> There is a 1.1.1b2 tarball currently available 
> (http://www.open-mpi.org/software/ompi/v1.1/), and there are 
> nightly snapshots of the 1.1 branch available as well 
> (http://www.open-mpi.org/nightly/v1.1/).  
> 
> You can see a full list of the changes in the 1.1 branch in 
> the "1.1.1"
> section of NEWS:
> 
>   http://svn.open-mpi.org/svn/ompi/trunk/NEWS
>  
> 
> > -----Original Message-----
> > From: users-boun...@open-mpi.org
> > [mailto:users-boun...@open-mpi.org] On Behalf Of Caird, Andrew J
> > Sent: Monday, July 17, 2006 11:10 AM
> > To: Open MPI Users
> > Subject: Re: [OMPI users] TM fixes on trunk
> > 
> > Jeff,
> > 
> > What were the details of the problem/fixes?  
> > 
> > Is it worth us moving to the trunk or using what we have 
> until 1.1.1 
> > arrives?
> > 
> > Thanks.
> > --andy
> >   
> > 
> > > -----Original Message-----
> > > From: users-boun...@open-mpi.org
> > > [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
> > > (jsquyres)
> > > Sent: Monday, July 17, 2006 10:22 AM
> > > To: Open MPI Users
> > > Subject: [OMPI users] TM fixes on trunk
> > > 
> > > All --
> > > 
> > > Martin Schaffoner reported some TM problems to this list a little 
> > > while ago.  It took a long time for he and I to synch up, but we 
> > > finally identified and fixed the problem.  This only affects Open 
> > > MPI 1.1 installs -- it is not an issue for 1.0.x 
> installs.  The fix 
> > > has been included in both the trunk and the 1.1 branch, 
> and will be 
> > > included in the upcoming
> > > 1.1.1 release.
> > > 
> > > --
> > > Jeff Squyres
> > > Server Virtualization Business Unit
> > > Cisco Systems
> > > 
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > > 
> > 
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 



Re: [OMPI users] TM fixes on trunk

2006-07-17 Thread Caird, Andrew J
Jeff,

What were the details of the problem/fixes?  

Is it worth us moving to the trunk or using what we have until 1.1.1
arrives?

Thanks.
--andy


> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres 
> (jsquyres)
> Sent: Monday, July 17, 2006 10:22 AM
> To: Open MPI Users
> Subject: [OMPI users] TM fixes on trunk
> 
> All --
> 
> Martin Schaffoner reported some TM problems to this list a 
> little while ago.  It took a long time for he and I to synch 
> up, but we finally identified and fixed the problem.  This 
> only affects Open MPI 1.1 installs -- it is not an issue for 
> 1.0.x installs.  The fix has been included in both the trunk 
> and the 1.1 branch, and will be included in the upcoming 
> 1.1.1 release.
> 
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems



[OMPI users] OpenMPI, debugging, and Portland Group's pgdbg

2006-06-13 Thread Caird, Andrew J
Hello all,

I've read the thread "OpenMPI debugging support"
(http://www.open-mpi.org/community/lists/users/2005/11/0370.php) and it
looks like there is improved debugging support for debuggers other than
TV in the 1.1 series.

I'd like to use Portland Group's pgdbg.  It's a parallel debugger;
there's more information at http://www.pgroup.com/resources/docs.htm.

From the previous thread on this topic, it looks to me like the plan for
1.1 and forward is to support the ability to launch the debugger
"alongside" the application.  I don't know enough about either pgdbg or
OpenMPI to know if this is the best plan, but assuming that it is, is
there a way to see if it is happening?

I've tried this two ways, the first way doesn't seem to attach to
anything:


[acaird@nyx-login ~]$ ompi_info | head -2
Open MPI: 1.1a9r10177
   Open MPI SVN revision: r10177
[acaird@nyx-login ~]$ mpirun --debugger pgdbg --debug  -np 2 cpi
PGDBG 6.1-3 x86-64 (Cluster, 64 CPU)
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2005, STMicroelectronics, Inc. All Rights Reserved.
PGDBG cannot open a window; check the DISPLAY environment variable.
Entering text mode.

pgdbg> list
ERROR: No current thread.

pgdbg> quit



and I've tried running the whole thing under pgdbg:


[acaird@nyx-login ~]$ pgdbg mpirun -np 2 cpi -s pgdbgscript
  { lots of mca_* loaded by ld-linux messages }
pgserv 8726: attach : attach 8720 fails
ERROR: New Process (PID 8720, HOST localhost) ATTACH FAILED.
ERROR: New Process (PID 8720, HOST localhost) IGNORED.
ERROR: cannot read value at address 0x59BFE8.
ERROR: cannot read value at address 0x59BFF0.
ERROR: cannot read value at address 0x59BFF8.
ERROR: New Process (PID 0, HOST unknown) IGNORED.
ERROR: cannot read value at address 0x2A959BBEC8.


and it hangs right there until I kill it.  The two variables in this
scenario are:
PGRSH=ssh and the contents of pgdbgscript are:


pgienv exe force
pgienv mode process
ignore 12
run



So, the short list of questions is:

1. Has anyone done this successfully before?
2. Am I making the right assumptions about how the debugger attaches to
the processes?
3. Is this the expected behavior for this set of options to mpirun?
4. Does anyone have any suggestions for other things I might try?

Thanks a lot.
--andy