Re: [O-MPI users] HPL & HPCC: Wedged

2005-10-25 Thread Galen M. Shipman
Hi Troy, Sorry for the delay, I am now able to reproduce this behavior when I do not specify HPL_NO_DATATYPE. If I do specify HPL_NO_DATATYPE the run completes. We will be looking into this now. Thanks, Galen On Oct 21, 2005, at 5:03 PM, Troy Telford wrote: I've been trying out the

Re: [O-MPI users] OpenIB module problem/questions:

2005-11-09 Thread Galen M. Shipman
On Nov 8, 2005, at 6:10 PM, Troy Telford wrote: I decided to try OpenMPI using the 'openib' module, rather than 'mvapi'; however I'm having a bit of difficulty: The test hardware is the same as in my earlier posts, the only software difference is: Linux 2.6.14 (OpenIB 2nd gen IB

Re: [O-MPI users] 1.0rc5 is up

2005-11-11 Thread Galen M. Shipman
The bad: OpenIB frequently crashes with the error: *** [0,1,2][btl_openib_endpoint.c: 135:mca_btl_openib_endpoint_post_send] error posting send request errno says Operation now in progress[0,1,2d [0,1,3][btl_openib_endpoint.c: 135:mca_btl_openib_endpoint_post_send] error posting

Re: [O-MPI users] error creating high priority cq for mthca0

2005-12-06 Thread Galen M. Shipman
Hi Daryl, Sounds like this might be a ulimit issue; what do you get when you run ulimit -l? Also, check out: http://www.open-mpi.org/faq/?category=infiniband Thanks, Galen On Dec 6, 2005, at 10:46 AM, Daryl W. Grunau wrote: Hi, I'm running OMPI 1.1a1r8378 on 2.6.14 + recent OpenIB stack
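
The FAQ entry above is about locked-memory limits; a minimal sketch of the usual check and fix, assuming a bash shell and a system that enforces limits through /etc/security/limits.conf (exact values and file locations vary by distribution):

  ulimit -l        # locked-memory limit in kB; InfiniBand wants this large or "unlimited"
  # as root, raise the limit for all users by adding to /etc/security/limits.conf:
  #   *  soft  memlock  unlimited
  #   *  hard  memlock  unlimited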

Re: [O-MPI users] does btl_openib work ?

2006-02-02 Thread Galen M. Shipman
Hi Jean, I just noticed that you are running quad-proc nodes and are using: bench1 slots=4 max-slots=4 in your machine file and you are running the benchmark using only 2 processes via: mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 2 PMB-MPI1 By using slots=4
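
For reference, a sketch of the machinefile and a launch that uses all four slots; the hostname and paths are taken from the snippet above, the rest is illustrative:

  # /root/machines: one quad-processor node exposing four slots
  bench1 slots=4 max-slots=4
  # start one process per slot rather than only two
  mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 4 PMB-MPI1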

Re: [O-MPI users] does btl_openib work ?

2006-02-02 Thread Galen M. Shipman
isolate the problem. Thanks, Galen On Feb 2, 2006, at 7:04 PM, Jean-Christophe Hugly wrote: On Thu, 2006-02-02 at 15:19 -0700, Galen M. Shipman wrote: Is it possible for you to get a stack trace where this is hanging? You might try: mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root

Re: [O-MPI users] Open-MPI all-to-all performance

2006-02-03 Thread Galen M. Shipman
Hello Konstantin, By using coll_basic_crossover 8 you are forcing all of your benchmarks to use the basic collectives, which offer poor performance. I ran the skampi Alltoall benchmark with the tuned collectives and get the following results, which seem to scale quite well, when I have a
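
A sketch of the two invocations being contrasted, assuming the parameter was passed on the mpirun command line (the benchmark arguments are placeholders):

  # per the reply above, this forces the basic collectives, which scale poorly
  mpirun -np 16 -mca coll_basic_crossover 8 ./skampi skampi.in
  # omitting the override lets the tuned collective component be selected
  mpirun -np 16 ./skampi skampi.in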

Re: [O-MPI users] direct openib btl and latency

2006-02-10 Thread Galen M. Shipman
I've been working on the MVAPICH project for around three years. Since this thread is discussing MVAPICH, I thought I should post to this thread. Galen's description of MVAPICH is not accurate. MVAPICH uses RDMA for short messages to deliver performance benefits to the applications. However,

Re: [OMPI users] Performance of ping-pong using OpenMPI over Infiniband

2006-03-16 Thread Galen M. Shipman
Hi Jean, Take a look here: http://www.open-mpi.org/faq/?category=infiniband#ib-leave-pinned This should improve performance for micro-benchmarks and some applications. Please let me know if this doesn't solve the issue. Thanks, Galen On Mar 16, 2006, at 10:34 AM, Jean Latour wrote:
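
The FAQ entry describes the mpi_leave_pinned registration cache; a minimal sketch of enabling it at run time, with the application name as a placeholder:

  # either on the command line...
  mpirun -np 2 -mca mpi_leave_pinned 1 ./pingpong
  # ...or through the environment
  export OMPI_MCA_mpi_leave_pinned=1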

Re: [OMPI users] how can I tell for sure that I'm using mvapi

2006-04-13 Thread Galen M. Shipman
Hi Bernie, You may specify which BTLs to use at runtime using an mca parameter: mpirun -np 2 -mca btl self,mvapi ./my_app This specifies to only use self (loopback) and mvapi. You may want to also use sm (shared memory) if you have multi-core or multi-proc nodes, such as: mpirun -np 2 -mca btl
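
A sketch of the variant with shared memory added, plus a quick way to see which BTL components are available (./my_app is a placeholder):

  # self (loopback), sm (shared memory within a node), mvapi (InfiniBand via VAPI)
  mpirun -np 2 -mca btl self,sm,mvapi ./my_app
  # list the BTL components this build knows about
  ompi_info | grep btl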

Re: [OMPI users] gm bandwidth results disappointing

2006-06-13 Thread Galen M. Shipman
Hi Brock, You may wish to try running with the runtime option: -mca mpi_leave_pinned 1 This turns on registration caching and such.. - Galen On Jun 13, 2006, at 8:01 AM, Brock Palen wrote: I ran a test using openmpi-1.0.2 on OSX vs mpich-1.2.6 from Myricom and I get lacking results from
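
A sketch of a full invocation combining the GM BTL with the registration cache mentioned above; the benchmark name is a placeholder, not from the thread:

  mpirun -np 2 -mca btl gm,sm,self -mca mpi_leave_pinned 1 ./bandwidth_test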

Re: [OMPI users] Error Polling HP CQ Status on PPC64 LInux with IB

2006-06-29 Thread Galen M. Shipman
I'm currently working with Owen on this issue.. will continue my findings on list.. - Galen On Jun 29, 2006, at 7:56 AM, Jeff Squyres (jsquyres) wrote: Owen -- Sorry, we all fell [way] behind on e-mail because many of us were at an OMPI developer's meeting last week. :-( In the

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-06 Thread Galen M. Shipman
Hey Justin, Please provide us with your MCA parameters (if any); these could be in a config file, in environment variables, or on the command line. Thanks, Galen On Jul 6, 2006, at 9:22 AM, Justin Bronder wrote: As far as the nightly builds go, I'm still seeing what I believe to be this problem
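
For reference, the three places MCA parameters can live, sketched with an arbitrary parameter:

  # 1. on the command line
  mpirun -np 4 -mca pml ob1 ./xhpl
  # 2. as an environment variable (OMPI_MCA_ prefix)
  export OMPI_MCA_pml=ob1
  # 3. in a per-user config file, $HOME/.openmpi/mca-params.conf:
  #      pml = ob1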

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-06 Thread Galen M. Shipman
./xhpl Thanks for the speedy response, Justin. On 7/6/06, Galen M. Shipman <gship...@lanl.gov> wrote: Hey Justin, Please provide us with your MCA parameters (if any); these could be in a config file, in environment variables, or on the command line. Thanks, Galen On Jul 6, 2006, at 9:22 AM, Ju

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-11 Thread Galen M. Shipman
nd pasted from an OS X run. I'm about to test against 1.0.3a1r10670. Justin. On 7/6/06, Galen M. Shipman <gship...@lanl.gov> wrote: Justin, Is the OS X run showing the same resi

Re: [OMPI users] Proprieatary transport layer for openMPI...

2006-08-07 Thread Galen M. Shipman
Durga, Currently there are two options for porting an interconnect to Open MPI, one would be to use the BTL interface (Byte Transfer Layer). Another would be to use the MTL (Matching Transport Layer). The difference is that the MTL is useful for those APIs which expose matching and

Re: [OMPI users] Fault Tolerance & Behavior

2006-10-31 Thread Galen M. Shipman
Galen M. Shipman wrote: Gleb Natapov wrote: On Mon, Oct 30, 2006 at 11:45:53AM -0700, Troy Telford wrote: On Sun, 29 Oct 2006 01:34:06 -0700, Gleb Natapov <gl...@voltaire.com> wrote: If you use OB1 PML (default one) it will never recover from link down

Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-20 Thread Galen M. Shipman
m2001(120) > mpirun -np 6 -hostfile hostsfile -mca btl mx,self b_eff This does appear to be a bug, although you are using the MX BTL. Our higher performance path is the MX MTL. To use this try: mpirun -np 6 -hostfile hostsfile -mca pml cm b_eff Also, just for grins, could you try:
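
A sketch spelling out the two MX paths being compared, with the components named explicitly (hostsfile and b_eff come from the thread):

  # MX as a BTL under the default ob1 PML
  mpirun -np 6 -hostfile hostsfile -mca pml ob1 -mca btl mx,sm,self b_eff
  # MX as an MTL under the cm PML (the higher-performance path mentioned above)
  mpirun -np 6 -hostfile hostsfile -mca pml cm -mca mtl mx b_eff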

Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-21 Thread Galen M. Shipman
Lydia Heck wrote: Thank you very much. I tried mpirun -np 6 -machinefile ./myh -mca pml cm ./b_eff What was the performance (latency and bandwidth)? and to amuse you mpirun -np 6 -machinefile ./myh -mca btl mx,sm,self ./b_eff Same question here as well.. Thanks, Galen with myh

Re: [OMPI users] running with the dr pml.

2006-12-05 Thread Galen M. Shipman
Brock Palen wrote: I was asked by Myricom to run a test using the data reliability PML (dr). I ran it like so: $ mpirun --mca pml dr -np 4 ./xhpl Is this the right format for running the dr pml? This should be fine, yes. I can run HPL on our test cluster to see if something is
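
A quick way to confirm the dr PML was built and is selectable, plus the run itself; a sketch (ompi_info output format varies by version):

  # list the PML components in this build (should include dr)
  ompi_info | grep -i " pml"
  # run HPL with the data-reliability PML, as in the thread
  mpirun -mca pml dr -np 4 ./xhpl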