Thanks for your answer, quoted below.  Just so my other question does not
get lost, I will post it again.

I cannot get an 8-process job to run on an 8-core cluster with Open MPI
and PETSc.  I loaded mpi4py and petsc4py, and then tried to run this
Python script:

from mpi4py import MPI      # initialize MPI
from petsc4py import PETSc  # initialize PETSc; removing this import lets the job run (see below)

using

mpirun -n 8 -x PYTHONPATH python test-mpi4py.py

This hangs on my 8-core FC11 box.  Either of the following allows it
to run:

 - removing the petsc4py import statement, or

 - running it not on localhost but across two machines in the cluster:
     mpirun -n 8 -host 10.0.0.14,10.0.0.15 -x PYTHONPATH python test-mpi4py.py


It seems as though something (Open MPI? rsh?) is limiting the number
of connections per machine, and PETSc needs additional connections
that push past that limit.

What could be imposing this limit?
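
In case it helps, here is a variant of the test script (the extra prints
and the file name test-mpi4py-debug.py are just mine, added for debugging);
run with -n 8 on localhost, it should show which import the processes
stall in:

# test-mpi4py-debug.py: report how far each rank gets before the hang
import sys

from mpi4py import MPI                     # initialize MPI first
rank = MPI.COMM_WORLD.Get_rank()
print("rank %d: mpi4py imported" % rank)
sys.stdout.flush()                         # flush so output appears even if we hang next

from petsc4py import PETSc                 # the import that appears to trigger the hang
print("rank %d: petsc4py imported" % rank)
sys.stdout.flush()

run the same way as before:

mpirun -n 8 -x PYTHONPATH python test-mpi4py-debug.py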

Thanks...John Cary



Ralph Castain wrote:
In the 1.3 series and beyond, you have to specifically tell us the name of any hostfile, including the default one for your system. So, in this example, you would want to set:

OMPI_MCA_orte_default_hostfile=absolute-path-to-openmpi-default-hostfile

in your environment, or just add:

-mca orte_default_hostfile path-to-openmpi-default-hostfile

on your cmd line.  Check out "man orte_hosts" for a full explanation of how these are used, as the behavior has changed since 1.2.
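
Concretely, assuming the hostfile you want is the stock one under the etc/
directory of your install prefix (the prefix appears in the ompi_info output
below) and a bash-style shell, that would look something like:

export OMPI_MCA_orte_default_hostfile=/usr/local/openmpi-1.3.2-nodlopen/etc/openmpi-default-hostfile
mpirun -n 8 -x PYTHONPATH python test-mpi4py.py

or directly on the command line:

mpirun -n 8 -mca orte_default_hostfile /usr/local/openmpi-1.3.2-nodlopen/etc/openmpi-default-hostfile -x PYTHONPATH python test-mpi4py.py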

Ralph


On Jul 11, 2009, at 7:21 AM, John R. Cary wrote:

The original problem was that I could not get an 8-process job to
run on an 8-core cluster.  I loaded mpi4py and petsc4py, and then
tried to run the Python script:

from mpi4py import MPI
from petsc4py import PETSc

using

mpirun -n 8 -x PYTHONPATH python test-mpi4py.py

This hangs on my 8-core FC11 box.  Either of the following allows it
to run:

 - removing the petsc4py import statement, or

 - running it not on localhost but across two machines in the cluster:
    mpirun -n 8 -host 10.0.0.14,10.0.0.15 -x PYTHONPATH python test-mpi4py.py


Curiously, putting

10.0.0.12 slots=4
10.0.0.13 slots=4
10.0.0.14 slots=4
10.0.0.15 slots=4


in openmpi-default-hostfile does not seem to affect anything.

Any idea why?

FYI, I am running over rsh.  The output of ompi_info is appended.

It seems as though something (Open MPI? rsh?) is limiting the number
of connections per machine, and PETSc needs additional connections
that exceed that limit.  What could be imposing this limit?

Thanks....John Cary









$ ompi_info
               Package: Open MPI c...@iter.txcorp.com Distribution
              Open MPI: 1.3.2
 Open MPI SVN revision: r21054
 Open MPI release date: Apr 21, 2009
              Open RTE: 1.3.2
 Open RTE SVN revision: r21054
 Open RTE release date: Apr 21, 2009
                  OPAL: 1.3.2
     OPAL SVN revision: r21054
     OPAL release date: Apr 21, 2009
          Ident string: 1.3.2
                Prefix: /usr/local/openmpi-1.3.2-nodlopen
Configured architecture: x86_64-unknown-linux-gnu
        Configure host: iter.txcorp.com
         Configured by: cary
         Configured on: Fri Jul 10 07:12:06 MDT 2009
        Configure host: iter.txcorp.com
              Built by: cary
              Built on: Fri Jul 10 07:42:03 MDT 2009
            Built host: iter.txcorp.com
            C bindings: yes
          C++ bindings: yes
    Fortran77 bindings: yes (all)
    Fortran90 bindings: yes
Fortran90 bindings size: small
            C compiler: gcc
   C compiler absolute: /usr/lib64/ccache/gcc
          C++ compiler: g++
 C++ compiler absolute: /usr/lib64/ccache/g++
    Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/bin/gfortran
    Fortran90 compiler: gfortran
Fortran90 compiler abs: /usr/bin/gfortran
           C profiling: yes
         C++ profiling: yes
   Fortran77 profiling: yes
   Fortran90 profiling: yes
        C++ exceptions: no
        Thread support: posix (mpi: no, progress: no)
         Sparse Groups: no
Internal debug support: no
   MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
       libltdl support: no
 Heterogeneous support: no
mpirun default --prefix: no
       MPI I/O support: yes
     MPI_WTIME support: gettimeofday
Symbol visibility support: yes
 FT Checkpoint support: no  (checkpoint thread: no)
         MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.2)
            MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.2)
         MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.2)
              MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.2)
             MCA carto: file (MCA v2.0, API v2.0, Component v1.3.2)
         MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.2)
             MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.2)
       MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.2)
       MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.2)
               MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.2)
            MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.2)
         MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.2)
         MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: self (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.2)
              MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.2)
                MCA io: romio (MCA v2.0, API v2.0, Component v1.3.2)
             MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.2)
             MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.2)
             MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.2)
               MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.2)
               MCA pml: v (MCA v2.0, API v2.0, Component v1.3.2)
               MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.2)
            MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.2)
               MCA btl: self (MCA v2.0, API v2.0, Component v1.3.2)
               MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.2)
              MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.2)
               MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.2)
               MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.2)
               MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.2)
               MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.2)
               MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.2)
               MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.2)
              MCA odls: default (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ras: tm (MCA v2.0, API v2.0, Component v1.3.2)
             MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.2)
              MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.2)
             MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.2)
               MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.2)
            MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.2)
            MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.2)
            MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.2)
               MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.2)
               MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA plm: tm (MCA v2.0, API v2.0, Component v1.3.2)
             MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.2)
            MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ess: env (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.2)
           MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.2)
           MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.2)





