Hello DongInn

On Wednesday, 23.01.2008, at 09:39 -0500, DongInn Kim wrote:
> Hi Dominik,
> 
> Have you double checked to see if the mpi names(lam-XXX and openmpi-XXX) for 
> switcher exist on your client nodes?
> http://www.mail-archive.com/[email protected]/msg08516.html

Yes, the modules are on the client.

oscarnode1:/opt/env-switcher/share/env-switcher/mpi # ls -l
total 12
-rw-r--r-- 1 root root 2532 2008-01-23 13:39 lam-7.1.2
-rw-r--r-- 1 root root  592 2008-01-23 13:39 mpich-ch_p4-gcc-1.2.7
-rw-r--r-- 1 root root 2295 2008-01-23 13:39 openmpi-1.2.5

oscarnode2:/opt/env-switcher/share/env-switcher/mpi # ls -l
total 12
-rw-r--r-- 1 root root 2532 2008-01-23 13:39 lam-7.1.2
-rw-r--r-- 1 root root  592 2008-01-23 13:39 mpich-ch_p4-gcc-1.2.7
-rw-r--r-- 1 root root 2295 2008-01-23 13:39 openmpi-1.2.5

oscarnode1:/opt # switcher mpi --list
openmpi-1.2.5
lam-7.1.2
mpich-ch_p4-gcc-1.2.7
oscarnode1:/opt #
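
The same check can be scripted so that every node is compared against the expected module names. This is only a minimal sketch: `missing_modules` is a hypothetical helper (not part of OSCAR or env-switcher), and the usage lines assume passwordless ssh to the nodes. The node names, module names, and path are the ones shown above.

```shell
#!/bin/sh
# missing_modules EXPECTED ACTUAL
# Prints every name from the whitespace-separated EXPECTED list that does
# not appear in the whitespace-separated ACTUAL list, one name per line.
# Hypothetical helper for illustration; runs in a subshell so its
# variables do not leak into the caller.
missing_modules() (
    expected="$1"
    actual="$2"
    for name in $expected; do
        # -x: whole-line match, -F: fixed string, -q: no output
        if ! printf '%s\n' $actual | grep -qxF -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
)

# Usage sketch (assumes ssh access to the client nodes):
# expected="lam-7.1.2 mpich-ch_p4-gcc-1.2.7 openmpi-1.2.5"
# for node in oscarnode1 oscarnode2; do
#     actual=$(ssh "$node" ls /opt/env-switcher/share/env-switcher/mpi)
#     echo "$node missing: $(missing_modules "$expected" "$actual")"
# done
```

An empty "missing" list for every node means the switcher module files are consistent across the cluster.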


> Can you post what option did you give to rebuild lam?

The first time I used rpmbuild -ba --define "config_options CC=gcc4 CXX=g++4
FC=gfortran"

That was wrong because TM support wasn't built in. I took a closer
look at the spec file and found the line with the compile option for TM
support.

This time I used rpmbuild -ba --define "config_options CC=gcc CXX=g++
FC=gfortran --with-tm=/opt/pbs" lam-7.1.2-oscar.spec as a test, and the
LAM error is gone.
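
To avoid forgetting the TM option on a later rebuild, the command line can be generated by a small helper. This is a rough sketch only: `build_lam_cmd` is a hypothetical name, and it simply prints the rpmbuild invocation quoted above (the `config_options` macro, the compiler settings, and `--with-tm` are taken from this mail) so it can be reviewed before it is run.

```shell
#!/bin/sh
# build_lam_cmd TM_PREFIX SPEC
# Prints (does not execute) an rpmbuild command line that passes
# --with-tm=TM_PREFIX through the config_options macro.
# Hypothetical helper for illustration; compiler names are the ones
# used in the working rebuild described above.
build_lam_cmd() (
    tm_prefix="$1"
    spec="$2"
    opts="CC=gcc CXX=g++ FC=gfortran --with-tm=${tm_prefix}"
    printf 'rpmbuild -ba --define "config_options %s" %s\n' "$opts" "$spec"
)

# Usage sketch:
# build_lam_cmd /opt/pbs lam-7.1.2-oscar.spec           # print for review
# eval "$(build_lam_cmd /opt/pbs lam-7.1.2-oscar.spec)" # actually run it
```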

Thank you for your help with the command line, though. I'm rather
busy today: I'm doing development, setting up a new cluster, and running
some other cluster tests all at once.

> Of course, once your openmpi and lam are rebuilt/installed on your head node, 
> I believe that they are also newly installed to the client nodes, right?

Yes, I always did a complete ./install_cluster eth1 with a new image for
the clients after cleaning up some leftovers (/opt/oscar/tmp,
mysql, /tftpboot, ...).

Now the LAM error is gone but the OpenMPI error is still there. Any idea
about this "pls:tm: failed to poll for a spawned proc ..." error?


Performing root tests...
Maui service check:maui                                        [PASSED]
TORQUE node check                                              [PASSED]
TORQUE service check:pbs_server                                [PASSED]
/home mounts                                                   [PASSED]

Preparing user tests...
Performing user tests...
SSH ping test                                                  [PASSED]
SSH server->node                                               [PASSED]
SSH node->server                                               [PASSED]
LAM/MPI (via TORQUE)                                           [PASSED]
PVM (via TORQUE)                                               [PASSED]
MPICH (via TORQUE)                                             [PASSED]
Ganglia setup test                                             [PASSED]
Ganglia node count test                                        [PASSED]
TORQUE default queue definition                                [PASSED]
TORQUE Shell Test                                              [PASSED]
Open MPI (via TORQUE)                                          [FAILED]

Run APItests...

Running Installation tests for pvm
[PASS]       2008-01-23 17:11:37   pvmd-path-ls.apt
[PASS]       2008-01-23 17:11:37   envvar-pvm_arch.apt
[PASS]       2008-01-23 17:11:37   envvar-pvm_root.apt
[PASS]       2008-01-23 17:11:37   envvar.apb
[PASS]       2008-01-23 17:11:37   pvmd-path-which.apt
[PASS]       2008-01-23 17:11:37   modulecmd-path-ls.apt
[PASS]       2008-01-23 17:11:37   pvm-module-list.apt
[PASS]       2008-01-23 17:11:37   pvm-module-show-pvm_rsh.apt
[PASS]       2008-01-23 17:11:37   pvm-module-show-pvm_arch.apt
[PASS]       2008-01-23 17:11:37   pvm-module-show-pvm_root.apt
[PASS]       2008-01-23 17:11:37   pvm-module-show.apb
[PASS]       2008-01-23 17:11:37   pvm-module.apb
[PASS]       2008-01-23 17:11:37   install_tests.apb

There is 1 failed/skipped test (see above).
Please check for .err and .out files in /home/oscartst/<package>.

...Hit <ENTER> to close this window...


----------------------------------------------------------------------



sles10oscar:/home/oscartst/openmpi # cat openmpitest.err
[oscarnode2:04413] pls:tm: failed to poll for a spawned proc, return
status = 17002
[oscarnode2:04413] [0,0,0] ORTE_ERROR_LOG: In errno in file rmgr_urm.c
at line 462
[oscarnode2:04413] mpiexec: spawn failed with errno=-11
sles10oscar:/home/oscartst/openmpi # cat openmpitest.out
Running Open MPI test
Open MPI appears to have TM suppport.  Yippee!

--> MPI C bindings test:

TEST FAILED!
Commands: cp cpi.c /tmp/openmpi-test && cd /tmp/openmpi-test && mpicc
cpi.c -o openmpi-cpi && cp openmpi-cpi /home/oscartst/openmpi &&
cd /home/oscartst/openmpi && mpiexec
-machinefile /var/spool/pbs/aux//131.sles10oscar -n 2 openmpi-cpi
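
One thing that can help narrow this down is to look at which hosts, and how many slots per host, TORQUE handed to the job, and compare that with the "-n 2" in the mpiexec call above. A minimal sketch, assuming the usual PBS node file layout of one hostname per line per slot; `slots_per_host` is a hypothetical helper name:

```shell
#!/bin/sh
# slots_per_host NODEFILE
# Reads a PBS-style node file (one hostname per line, repeated once per
# slot) and prints "<host> <slots>" for each host, sorted by hostname.
# Hypothetical helper for illustration.
slots_per_host() (
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
)

# Usage sketch, inside a running TORQUE job:
# slots_per_host "$PBS_NODEFILE"
```

If the summed slot count is lower than the number of processes requested from mpiexec, or a listed host is not reachable from the node running mpiexec, that would be a first place to look.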




-- 
Mit freundlichen Grüßen / Best regards

Dominik Schips

Tel.: +49 (0)21 61 - 46 43-112
Fax:  +49 (0)21 61 - 46 43-100

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz


_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
