There seems to be a problem with MX because of a conflict between our MTL and the BTL. So I suspect that if you want it to run [right now], you should spawn fewer processes per node than the number of endpoints MX supports per node (one fewer). I'll take a look this afternoon.
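To make the "one fewer" arithmetic concrete, here is a rough sketch that caps the slots per node at one below the MX endpoint count and prints matching hostfile lines. The function names are made up for illustration, the endpoint count of 8 is the MX-10G default quoted further down in this thread, and the only Open MPI syntax assumed is the `slots=N` hostfile keyword. Treat it as a workaround sketch, not a fix for the underlying conflict.

```python
# Hypothetical helpers for illustration; not part of Open MPI or MX.
# Only the "slots=N" hostfile keyword is actual Open MPI syntax.

def max_procs_per_node(mx_endpoints: int, reserved: int = 1) -> int:
    """Number of processes to place on a node, leaving `reserved` endpoints free."""
    if mx_endpoints <= reserved:
        raise ValueError("not enough MX endpoints to run any process")
    return mx_endpoints - reserved


def hostfile_lines(hosts, mx_endpoints):
    """Build one 'host slots=N' hostfile line per host, capped below the endpoint count."""
    cap = max_procs_per_node(mx_endpoints)
    return [f"{host} slots={cap}" for host in hosts]


if __name__ == "__main__":
    # Assumption: 8 endpoints per node (the MX-10G default mentioned in this thread).
    for line in hostfile_lines(["node1", "node2"], 8):
        print(line)
```

With 8 endpoints this yields 7 slots per node, so an -np 12 run would need at least two such nodes.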

  Thanks,
    george.

On Jul 11, 2007, at 12:39 PM, Warner Yuen wrote:

The hostfile was changed around as we tried to pull out nodes that we thought might have been bad. But none were oversubscribed, if that's what you mean.

Warner Yuen
Scientific Computing Consultant
Apple Computer



On Jul 11, 2007, at 9:00 AM, users-requ...@open-mpi.org wrote:

Message: 3
Date: Wed, 11 Jul 2007 11:27:47 -0400
From: George Bosilca <bosi...@cs.utk.edu>
Subject: Re: [OMPI users] openmpi fails on mx endpoint busy
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <15c9e0ab-6c55-43d9-a40e-82cf973b0...@cs.utk.edu>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed

What's in the hostmx10g file ? How many hosts ?

   george.

On Jul 11, 2007, at 1:34 AM, Warner Yuen wrote:

I've also had someone run into the endpoint busy problem. I never
figured it out; I just increased the default endpoints on MX-10G
from 8 to 16 to make the problem go away. Here's the actual command
and error before setting the endpoints to 16. The version is
MX-1.2.1 with OMPI 1.2.3:

node1:~/taepic tae$ mpirun --hostfile hostmx10g -byslot -mca btl
self,sm,mx -np 12 test_beam_injection test_beam_injection.inp -npx
12 > out12
[node2:00834] mca_btl_mx_init: mx_open_endpoint() failed with status=20
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.7 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
