Re: [OMPI users] orted: command not found

2007-01-03 Thread Ralph H Castain
Hi Jose

Sorry for entering the discussion late. From tracing the email thread, I
gather the following:

1. you have installed Open MPI 1.1.2 on two 686 boxes

2. you created a hostfile on one of the nodes and ran mpirun from that
node. You gave us a prefix indicating where we should find the Open MPI
executables on each node

3. you were getting an error message indicating that mpirun was unable to
find your executable

4. you didn't encounter this problem when running on a cluster

If I have those facts correct, then the problem is simple. My guess is that
the cluster you were using has a shared file system - hence, the remote
nodes "see" your executable in the same relative location across the
cluster.

In your simple setup with the two boxes, it sounds like you don't have a
shared file system. When mpirun attempts to locate the executable on
bernie-3, it can't find the file because it doesn't exist on that node's
file system. Once you copied the file over to bernie-3, mpirun could find
it, so everything worked.

We hope to add file pre-positioning at some point in the future for systems
such as yours. However, that is some time away due to priorities. For now,
Open MPI requires that your executable (and the Open MPI executables and
libraries) be available on each node you are trying to use.
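
A minimal shell sketch of that requirement, assuming the password-less ssh
setup described elsewhere in this thread. The helper names is_runnable and
stage_commands are hypothetical and not part of Open MPI; stage_commands only
prints the ssh and scp commands it would use, so nothing is copied until you
run them yourself.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given path is a regular,
# executable file on the machine where this runs.
is_runnable() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Hypothetical helper: print (do not run) the commands that would place an
# executable at the same absolute path on a remote host.
# Assumes the path and host name contain no spaces.
stage_commands() {
    exe=$1
    host=$2
    dir=$(dirname "$exe")
    printf 'ssh %s mkdir -p %s\n' "$host" "$dir"
    printf 'scp -p %s %s:%s\n' "$exe" "$host" "$exe"
}
```

For instance, stage_commands /home/bernie/proyecto/prueba.bin bernie-3 prints
one mkdir and one scp command for review. The same check as is_runnable can be
made on the remote side with ssh bernie-3 'test -x /home/bernie/proyecto/prueba.bin'.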

Hope that helps to explain the problem.
Ralph


On 1/2/07 2:03 PM, "jcolmena...@ula.ve"  wrote:

> I had configured the hostfile located at
> ~prefix/etc/openmpi-default-hostfile.
> 
> I copied the file to bernie-3, and it worked...
> 
> Now, at the cluster I worked on at the Universidad de Los Andes
> (Venezuela), all I had to do was compile and run my applications; that
> is, I never copied any file to any other machine. (I decided to install
> MPI on three machines I was able to put together as a personal project.)
> Now I had to. I'm sorry if it was obvious and made you guys lose some
> time, but why didn't I have to copy any files on the cluster, when now I
> must?
> 
> Thanks for your patience!
> 
> Jose
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
I had configured the hostfile located at
~prefix/etc/openmpi-default-hostfile.

I copied the file to bernie-3, and it worked...

Now, at the cluster I worked on at the Universidad de Los Andes
(Venezuela), all I had to do was compile and run my applications; that
is, I never copied any file to any other machine. (I decided to install
MPI on three machines I was able to put together as a personal project.)
Now I had to. I'm sorry if it was obvious and made you guys lose some
time, but why didn't I have to copy any files on the cluster, when now I
must?

Thanks for your patience!

Jose



Re: [OMPI users] orted: command not found

2007-01-02 Thread Gurhan Ozen

On 1/2/07, Gurhan Ozen  wrote:

> On 1/2/07, jcolmena...@ula.ve  wrote:
> > > First you should make sure that PATH and LD_LIBRARY_PATH are defined
> > > in the section of your .bashrc file that gets parsed for
> > > non-interactive sessions. Run "mpirun -np 1 printenv" and check if
> > > PATH and LD_LIBRARY_PATH have the values you expect.
> >
> > in fact they do:
> >
> > bernie@bernie-1:~/proyecto$ mpirun -np 1 printenv
> > SHELL=/bin/bash
> > SSH_CLIENT=192.168.1.142 4109 22
> > USER=bernie
> > LD_LIBRARY_PATH=/usr/local/openmpi/lib:/usr/local/openmpi/lib:
> > MAIL=/var/mail/bernie
> > PATH=/usr/local/openmpi/bin:/usr/local/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games
> > PWD=/home/bernie
> > LANG=en_US.UTF-8
> > HISTCONTROL=ignoredups
> > SHLVL=1
> > HOME=/home/bernie
> > MPI_DIR=/usr/local/openmpi
> > LOGNAME=bernie
> > SSH_CONNECTION=192.168.1.142 4109 192.168.1.113 22
> > LESSOPEN=| /usr/bin/lesspipe %s
> > LESSCLOSE=/usr/bin/lesspipe %s %s
> > _=/usr/local/openmpi/bin/orted
> > OMPI_MCA_universe=bernie@bernie-1:default-universe
> > OMPI_MCA_ns_nds=env
> > OMPI_MCA_ns_nds_vpid_start=0
> > OMPI_MCA_ns_nds_num_procs=1
> > OMPI_MCA_mpi_paffinity_processor=0
> > OMPI_MCA_ns_replica_uri=0.0.0;tcp://192.168.1.142:4775
> > OMPI_MCA_gpr_replica_uri=0.0.0;tcp://192.168.1.142:4775
> > OMPI_MCA_orte_base_nodename=192.168.1.113
> > OMPI_MCA_ns_nds_cellid=0
> > OMPI_MCA_ns_nds_jobid=1
> > OMPI_MCA_ns_nds_vpid=0
> >
> >
> > > For your second question you should give the path to your prueba.bin
> > > executable. I'd do something like
> > > "mpirun --prefix /usr/local/openmpi -np 2 ./prueba.bin".
> > > The reason is that usually "." is not in the PATH.
> > >
> >
> > bernie@bernie-1:~/proyecto$ mpirun --prefix /usr/local/openmpi -np 2
> > ./prueba.bin
> > --
> > Failed to find or execute the following executable:
> >
> > Host:   bernie-3
> > Executable: ./prueba.bin
> >
> > Cannot continue.
> > --
> >
> > and the file IS there:
> >
> > bernie@bernie-1:~/proyecto$ ls prueba*
> > prueba.bin  prueba.f90  prueba.f90~
> >


   Wait a minute... you are running mpirun from bernie-1 without
providing any hostfile or hostnames, so both processes should be
running on bernie-1, yet the error says it can't find the executable
on bernie-3. Why is this? Make sure that the file exists on bernie-3
and is executable.

  gurhan


> >
> > I must be missing something pretty silly, but have been looking around for
> > days to no avail!
> >
>
>    What are the permissions on the file? Is it an executable file?
>
>    gurhan
>
> > Jose
> >
> > thanks



Re: [OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
It is executable:

bernie@bernie-1:~/proyecto$ ls -l prueba.bin
-rwxr-xr-x 1 bernie bernie 9619 2007-01-02 12:18 prueba.bin




Re: [OMPI users] orted: command not found

2007-01-02 Thread Gurhan Ozen

On 1/2/07, jcolmena...@ula.ve  wrote:

> > First you should make sure that PATH and LD_LIBRARY_PATH are defined
> > in the section of your .bashrc file that gets parsed for
> > non-interactive sessions. Run "mpirun -np 1 printenv" and check if
> > PATH and LD_LIBRARY_PATH have the values you expect.
>
> in fact they do:
>
> bernie@bernie-1:~/proyecto$ mpirun -np 1 printenv
> SHELL=/bin/bash
> SSH_CLIENT=192.168.1.142 4109 22
> USER=bernie
> LD_LIBRARY_PATH=/usr/local/openmpi/lib:/usr/local/openmpi/lib:
> MAIL=/var/mail/bernie
> PATH=/usr/local/openmpi/bin:/usr/local/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games
> PWD=/home/bernie
> LANG=en_US.UTF-8
> HISTCONTROL=ignoredups
> SHLVL=1
> HOME=/home/bernie
> MPI_DIR=/usr/local/openmpi
> LOGNAME=bernie
> SSH_CONNECTION=192.168.1.142 4109 192.168.1.113 22
> LESSOPEN=| /usr/bin/lesspipe %s
> LESSCLOSE=/usr/bin/lesspipe %s %s
> _=/usr/local/openmpi/bin/orted
> OMPI_MCA_universe=bernie@bernie-1:default-universe
> OMPI_MCA_ns_nds=env
> OMPI_MCA_ns_nds_vpid_start=0
> OMPI_MCA_ns_nds_num_procs=1
> OMPI_MCA_mpi_paffinity_processor=0
> OMPI_MCA_ns_replica_uri=0.0.0;tcp://192.168.1.142:4775
> OMPI_MCA_gpr_replica_uri=0.0.0;tcp://192.168.1.142:4775
> OMPI_MCA_orte_base_nodename=192.168.1.113
> OMPI_MCA_ns_nds_cellid=0
> OMPI_MCA_ns_nds_jobid=1
> OMPI_MCA_ns_nds_vpid=0
>
>
> > For your second question you should give the path to your prueba.bin
> > executable. I'd do something like
> > "mpirun --prefix /usr/local/openmpi -np 2 ./prueba.bin".
> > The reason is that usually "." is not in the PATH.
> >
>
> bernie@bernie-1:~/proyecto$ mpirun --prefix /usr/local/openmpi -np 2
> ./prueba.bin
> --
> Failed to find or execute the following executable:
>
> Host:   bernie-3
> Executable: ./prueba.bin
>
> Cannot continue.
> --
>
> and the file IS there:
>
> bernie@bernie-1:~/proyecto$ ls prueba*
> prueba.bin  prueba.f90  prueba.f90~
>
>
> I must be missing something pretty silly, but have been looking around for
> days to no avail!
>


  What are the permissions on the file? Is it an executable file?

  gurhan


> Jose
>
> thanks





[OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
I installed Open MPI 1.1.2 on two 686 boxes running Ubuntu 6.10.
I followed the instructions given in the FAQ. Nevertheless, I get the
following message:

[bernie-1:05053] ERROR: A daemon on node 192.168.1.113 failed to start as
expected.
[bernie-1:05053] ERROR: There may be more information available from
[bernie-1:05053] ERROR: the remote shell (see above).
[bernie-1:05053] ERROR: The daemon exited unexpectedly with status 127.

Now, I've been browsing the web, including the mailing lists, and it
appears that the cause should be that I have not declared the variables

export PATH="/usr/local/openmpi/bin:${PATH}"
export LD_LIBRARY_PATH="/usr/local/openmpi/lib:${LD_LIBRARY_PATH}"

on the node, which I have. I have even created all the possible folders
proposed in the FAQ for remote logins, although I'm using bash.

If I do ssh user@remote_node, I can connect without being asked for a
password, and if I type mpif90, I get "gfortran: no input files", which
should mean that PATH and LD_LIBRARY_PATH are indeed being updated on
the remote login.
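
An interactive ssh login and a command passed to ssh may not read the same
startup files, so the PATH seen when typing mpif90 by hand is not necessarily
the one a remotely launched command sees. A small sketch for checking this,
assuming the install prefix /usr/local/openmpi used in this message;
path_has_dir is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $2 appears as a whole entry in
# the colon-separated list $1 (for example, the value of PATH).
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use against the PATH of a non-interactive remote shell
# (host name taken from this thread; uncomment to run):
#   remote_path=$(ssh bernie-3 'echo $PATH')
#   path_has_dir "$remote_path" /usr/local/openmpi/bin || echo "orted not on PATH"
```

Wrapping the list in colons before matching keeps a directory such as
/usr/local/openmpi/bin2 from being mistaken for /usr/local/openmpi/bin.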

But, if I do:

bash$  mpirun --prefix /usr/local/openmpi -np 2 prueba.bin

the result is:

--
Failed to find the following executable:

Host:   bernie-3
Executable: prueba.bin

Cannot continue.
--
mpirun noticed that job rank 0 with PID 0 on node "192.168.1.113" exited
on signal 4.

I've been looking around, but have not been able to find out what
signal 4 means.
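
Both numbers in the messages above can be looked up from a shell. The sketch
below assumes a Linux system with a POSIX-style sh; the function names are
hypothetical. Exit status 127 is what a shell reports when the command it was
asked to run cannot be found, which would be consistent with orted not being on
the remote PATH, and signal 4 is conventionally SIGILL (illegal instruction).

```shell
#!/bin/sh
# Hypothetical helper: run a command that does not exist in a child shell
# and print the exit status that shell reports.
missing_command_status() {
    sh -c 'no_such_command_for_demo' >/dev/null 2>&1
    echo $?
}

# Hypothetical helper: print the name of a signal given its number.
signal_name() {
    kill -l "$1"
}

missing_command_status
signal_name 4
```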

Just in case: I was running an example program which runs fine on my
university cluster. Nevertheless, I decided to run an even simpler one,
which I include, since the error may be there (I definitely hope
not!...)

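program test

  use mpi

  implicit none

  ! sizze: number of processes in MPI_COMM_WORLD; myid: rank of this process
  integer :: myid,sizze,ierr

  ! Start MPI, then query the communicator size and this process's rank
  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,sizze,ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,myid,ierr)

  print *,"I'm using ",sizze," processors"
  print *,"of which I'm the number ",myid

  ! Shut down MPI before the program ends
  call MPI_FINALIZE(ierr)

end program test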


This is the first time I have installed, and used, any parallel programming
program or library, and I'm doing it as a personal project for a graduate
course, so any help will be greatly appreciated!

Best regards

Jose Colmenares