Re: [OMPI users] automatically creating a machinefile

2012-07-04 Thread Dominik Goeddeke
I have no idea about Rocks, but with PBS and SLURM I always do this directly in 
the job submission script. Below is an admittedly spaghetti-code example 
that does exactly that -- assuming the right bits are (un)commented -- 
for PBS and SLURM with Open MPI and MPICH2, for one particular machine 
that I have been toying around with lately ...
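
Distilled to its essence, the PBS case boils down to something like this 
(untested sketch; the binary name is a placeholder and the master/slave 
slot weighting of the full script is left out):

sort -u "$PBS_NODEFILE" > hosts.txt          # unique nodes of this job
head -n 1 hosts.txt  > hostfile.txt          # first node runs the master
tail -n +2 hosts.txt >> hostfile.txt         # remaining nodes run the slaves
mpirun -np $(wc -l < hostfile.txt) --hostfile hostfile.txt ./my_app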


Dominik

#!/bin/bash

### PBS
#PBS -N feast
#PBS -l nodes=25:ppn=2
#PBS -q batch
#PBS -l walltime=2:00:00
# job should not rerun if it fails
#PBS -r n

### SLURM
# @ job_name = feaststrong1
# @ initialdir = .
# @ output = feaststrong1_%j.out
# @ error = feaststrong1_%j.err
# @ total_tasks = 50
# @ cpus_per_task = 1
# @ wall_clock_limit = 2:00:00
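## For reference: stock SLURM (sbatch) equivalents of the "# @" directives
## above -- assumption: this machine's SLURM front-end wants the "# @"
## syntax, a plain installation would take these instead:
##   #SBATCH --job-name=feaststrong1
##   #SBATCH --output=feaststrong1_%j.out
##   #SBATCH --error=feaststrong1_%j.err
##   #SBATCH --ntasks=50
##   #SBATCH --cpus-per-task=1
##   #SBATCH --time=2:00:00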

# modules
module purge
module load gcc/4.6.2
module load openmpi/1.5.4
#module load mpich2/1.4.1

# cd into wdir
cd $HOME/feast/feast/feast/applications/poisson_coproc


# PBS with MPICH2
# create machine files to isolate the master process
#cat $PBS_NODEFILE > nodes.txt
## extract slaves
#sort -u  nodes.txt > temp.txt
#lines=`wc -l temp.txt | awk '{print $1}'`
#((lines=$lines - 1))
#tail -n $lines temp.txt > slavetemp.txt
#cat slavetemp.txt | awk '{print $0 ":2"}' > slaves.txt
## extract master
#head -n 1 temp.txt > mastertemp.txt
#cat mastertemp.txt | awk '{print $0 ":1"}' > master.txt
## merge into one dual nodefile
#cat master.txt > dual.hostfile
#cat slaves.txt >> dual.hostfile
## same for single hostfile
#tail -n $lines temp.txt > slavetemp.txt
#cat slavetemp.txt | awk '{print $0 ":1"}' > slaves.txt
## extract master
#head -n 1 temp.txt > mastertemp.txt
#cat mastertemp.txt | awk '{print $0 ":1"}' > master.txt
## merge into one single nodefile
#cat master.txt > single.hostfile
#cat slaves.txt >> single.hostfile
## and clean up
#rm -f slavetemp.txt mastertemp.txt master.txt slaves.txt temp.txt nodes.txt
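## For illustration (hypothetical node names): if $PBS_NODEFILE contains the
## nodes n01 n02 n03, the dual.hostfile built above ends up in MPICH2
## "host:procs" format as
##   n01:1
##   n02:2
##   n03:2
## i.e. one slot on the master node, two on each slave; single.hostfile is
## the same with ":1" everywhere.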

# 4 nodes
#mpiexec -n 7 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np007.dat

#mkdir arm-strongscaling-series1-L8-nodes04
#mv feastlog.* arm-strongscaling-series1-L8-nodes04

# 7 nodes
#mpiexec -n 13 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np013.dat

#mkdir arm-strongscaling-series1-L8-nodes07
#mv feastlog.* arm-strongscaling-series1-L8-nodes07

# 13 nodes
#mpiexec -n 25 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np025.dat

#mkdir arm-strongscaling-series1-L8-nodes13
#mv feastlog.* arm-strongscaling-series1-L8-nodes13

# 25 nodes
#mpiexec -n 49 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np049.dat

#mkdir arm-strongscaling-series1-L8-nodes25
#mv feastlog.* arm-strongscaling-series1-L8-nodes25


## SLURM

# figure out which nodes we got
srun /bin/hostname | sort > availhosts3.txt

lines=`wc -l availhosts3.txt | awk '{print $1}'`
((lines=$lines - 2))
tail -n $lines availhosts3.txt > slaves3.txt
head -n 1 availhosts3.txt > master3.txt
cat master3.txt > hostfile3.txt
cat slaves3.txt >> hostfile3.txt
# DGDG: SLURM -m arbitrary not supported by OpenMPI
#export SLURM_HOSTFILE=./hostfile3.txt
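## Note on hostfile3.txt: "srun /bin/hostname" prints one line per task, so
## each 2-task node appears twice; as far as I know Open MPI counts each
## occurrence of a hostname in a hostfile as one slot. Dropping the first two
## lines and re-adding the master once therefore gives the master node 1 slot
## and every slave node 2 slots -- same layout as the ":1"/":2" MPICH2
## hostfiles above.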


# 4 nodes
#mpirun -np 7 --hostfile hostfile3.txt ./trace.sh ./feastgpu-ompi master.dat.strongscaling.m6.L8.np007.dat
mpirun -np 7 --hostfile hostfile3.txt ./feastgpu-ompi master.dat.strongscaling.m6.L8.np007.dat
#mpiexec -n 7 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np007.dat
#srun -n 7 -m arbitrary ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np007.dat

mkdir arm-strongscaling-series1-L8-nodes04
mv feastlog.* arm-strongscaling-series1-L8-nodes04

# 7 nodes
#mpirun -np 13 --hostfile hostfile3.txt ./trace.sh ./feastgpu-ompi master.dat.strongscaling.m6.L8.np013.dat
mpirun -np 13 --hostfile hostfile3.txt ./feastgpu-ompi master.dat.strongscaling.m6.L8.np013.dat
#mpiexec -n 13 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np013.dat
#srun -n 13 -m arbitrary ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np013.dat

mkdir arm-strongscaling-series1-L8-nodes07
mv feastlog.* arm-strongscaling-series1-L8-nodes07

# 13 nodes
#mpirun -np 25 --hostfile hostfile3.txt ./trace.sh ./feastgpu-ompi master.dat.strongscaling.m6.L8.np025.dat
mpirun -np 25 --hostfile hostfile3.txt ./feastgpu-ompi master.dat.strongscaling.m6.L8.np025.dat
#mpiexec -n 25 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np025.dat
#srun -n 25 -m arbitrary ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np025.dat

mkdir arm-strongscaling-series1-L8-nodes13
mv feastlog.* arm-strongscaling-series1-L8-nodes13

# 25 nodes
#mpirun -np 49 --hostfile hostfile3.txt ./trace.sh ./feastgpu-ompi master.dat.strongscaling.m6.L8.np049.dat
mpirun -np 49 --hostfile hostfile3.txt ./feastgpu-ompi master.dat.strongscaling.m6.L8.np049.dat
#mpiexec -n 49 -f dual.hostfile ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np049.dat
#srun -n 49 -m arbitrary ./feastgpu-mpich2 master.dat.strongscaling.m6.L8.np049.dat

mkdir arm-strongscaling-series1-L8-nodes25
mv feastlog.* arm-strongscaling-series1-L8-nodes25


Re: [OMPI users] Getting MPI to access processes on a 2nd computer.

2012-07-04 Thread Shiqing Fan

Hi,

Open MPI uses WMI to launch remote processes on Windows, so WMI has to be 
configured correctly. The README.WINDOWS file points to two links that 
explain how to set it up:


http://msdn.microsoft.com/en-us/library/aa393266(VS.85).aspx
http://community.spiceworks.com/topic/578

To test whether it works, you can use the following command:
wmic /node:remote_node_ip process call create notepad.exe

Then log onto the other Windows machine and check in Task Manager whether 
the notepad.exe process has been created (don't forget to kill it afterwards).


If that works, this command will also work:
mpirun -np 2 -host host1,host2 notepad.exe

Please try the two test commands above; if they both work, your application 
should also work. Just let me know if you have any questions or trouble 
with that.



Shiqing

On 2012-07-03 8:53 PM, vimalmat...@eaton.com wrote:


Hi,

I'm trying to run an MPI code using processes on a remote machine.

I've connected the 2 machines with a crossover cable and they are 
communicating with each other (I'm getting ping replies and I can 
access each other's drives).

When I run mpiexec --host system_name MPI_Test.exe, I get the 
following error:


C:\OpenMPI\openmpi-1.6\build\Debug>mpiexec -host SOUMIWHP4500449 MPI_Test.exe

connecting to SOUMIWHP4500449
username:C9995799
password:**
Save Credential?(Y/N) N
[SOUMIWHP5003567:01728] Could not connect to namespace cimv2 on node SOUMIWHP4500449. Error code =-2147023174
--------------------------------------------------------------------------
mpiexec was unable to start the specified application as it encountered an error.
More information may be available above.
--------------------------------------------------------------------------
[SOUMIWHP5003567:01728] [[38316,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ..\..\..\openmpi-1.6\orte\mca\rml\oob\rml_oob_send.c at line 145
[SOUMIWHP5003567:01728] [[38316,0],0] attempted to send to [[38316,0],1]: tag 1
[SOUMIWHP5003567:01728] [[38316,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ..\..\..\openmpi-1.6\orte\orted\orted_comm.c at line 126

Could anyone tell me what I'm missing?

I've configured MPI with VS Express 2010 and I'm able to run MPI 
programs on one system.

On the other computer, I copied MPI_Test.exe to the same location as 
on the calling computer.


Thanks,
Vimal






--
Shiqing Fan
High Performance Computing Center Stuttgart (HLRS)
Tel: ++49(0)711-685-87234  Nobelstrasse 19
Fax: ++49(0)711-685-65832  70569 Stuttgart
http://www.hlrs.de/organization/people/shiqing-fan/
email: f...@hlrs.de