Hi Dominik,

Have you double-checked whether the MPI names (lam-XXX and openmpi-XXX) for switcher exist on your client nodes?
http://www.mail-archive.com/[email protected]/msg08516.html

Can you post the options you used to rebuild LAM? Also, once your openmpi and lam packages are rebuilt and installed on the head node, I assume they were also newly installed on the client nodes, right?

Regards,

- DongInn

Dominik Schips wrote:
> Hello,
>
> I have OSCAR 5.0 on SLES10SP1 (x86_64) but still get 2 errors at the
> last step if I check the cluster. The logs are below
>
> The last and biggest change I made was the switch from OpenMPI 1.1.1 to
> OpenMPI 1.2.5 now. After the RPM build I changed the configuration that
> OSCAR can use the new package.
>
> The build system is also the testing system. So it isn't a clean (fresh)
> SLES10SP1 and OSCAR 5.0. I think it is always a problem to get package
> dependency problems and other stuff if it isn't a clean system, correct?
>
> I have built OpenMPI from the official OpenMPI src rpm. It has tm
> support.
>
> # ompi_info | grep tm
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component
> v1.2.5)
> MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.5)
> MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.5)
>
> About the LAM problem I read
> (http://www.mail-archive.com/[email protected]/msg04656.html)
> that it could be a problem if the package was built on a host without torque.
> But this problem is very old and from OSCAR 4.2. I tried it with torque but
> it didn't help me to solve it.
>
> There were no errors at the lam and openmpi RPM build.
> And in every log I had a look at I can not find more information about what
> can cause these 2 problems.
>
> The PVM and MPICH tests are passed so it couldn't be a possible switcher
> problem I think.
>
> For every help I would be glad. If you need more information or a
> logfile just let me know.
>
>
>
> Performing root tests...
> Maui service check:maui [PASSED]
> TORQUE node check [PASSED]
> TORQUE service check:pbs_server [PASSED]
> /home mounts [PASSED]
>
> Preparing user tests...
> Performing user tests...
> SSH ping test [PASSED]
> SSH server->node [PASSED]
> SSH node->server [PASSED]
> LAM/MPI (via TORQUE) [FAILED]
> PVM (via TORQUE) [PASSED]
> MPICH (via TORQUE) [PASSED]
> Ganglia setup test [PASSED]
> Ganglia node count test [PASSED]
> TORQUE default queue definition [PASSED]
> TORQUE Shell Test [PASSED]
> Open MPI (via TORQUE) [FAILED]
> qdel: Request invalid for state of job 106.sles10oscar
>
> Run APItests...
>
> Running Installation tests for pvm
> [PASS] 2008-01-23 09:54:49 pvmd-path-ls.apt
> [PASS] 2008-01-23 09:54:49 envvar-pvm_arch.apt
> [PASS] 2008-01-23 09:54:49 envvar-pvm_root.apt
> [PASS] 2008-01-23 09:54:49 envvar.apb
> [PASS] 2008-01-23 09:54:49 pvmd-path-which.apt
> [PASS] 2008-01-23 09:54:49 modulecmd-path-ls.apt
> [PASS] 2008-01-23 09:54:49 pvm-module-list.apt
> [PASS] 2008-01-23 09:54:49 pvm-module-show-pvm_rsh.apt
> [PASS] 2008-01-23 09:54:49 pvm-module-show-pvm_arch.apt
> [PASS] 2008-01-23 09:54:49 pvm-module-show-pvm_root.apt
> [PASS] 2008-01-23 09:54:49 pvm-module-show.apb
> [PASS] 2008-01-23 09:54:49 pvm-module.apb
> [PASS] 2008-01-23 09:54:49 install_tests.apb
>
> There are 2 failed/skipped tests (see above).
> Please check for .err and .out files in /home/oscartst/<package>.
>
> ...Hit <ENTER> to close this window...
>
> --------------------------------------------------------------------------------
>
> sles10oscar:/home/oscartst/lam # cat lamtest.out
> Running LAM/MPI test
> sles10oscar:/home/oscartst/lam # cat lamtest.err
>
> ERROR: LAM/MPI does not appear to have the tm boot SSI module!
> This test script will now abort.
>
> sles10oscar:/home/oscartst/lam #
>
> --------------------------------------------------------------------------------
>
> sles10oscar:/home/oscartst/openmpi # cat openmpitest.err
> [oscarnode2:08400] pls:tm: failed to poll for a spawned proc, return
> status = 17002
> [oscarnode2:08400] [0,0,0] ORTE_ERROR_LOG: In errno in file rmgr_urm.c
> at line 462
> [oscarnode2:08400] mpiexec: spawn failed with errno=-11
> sles10oscar:/home/oscartst/openmpi # cat openmpitest.out
> Running Open MPI test
> Open MPI appears to have TM suppport. Yippee!
>
> --> MPI C bindings test:
>
> TEST FAILED!
> Commands: cp cpi.c /tmp/openmpi-test && cd /tmp/openmpi-test && mpicc
> cpi.c -o openmpi-cpi && cp openmpi-cpi /home/oscartst/openmpi &&
> cd /home/oscartst/openmpi && mpiexec
> -machinefile /var/spool/pbs/aux//106.sles10oscar -n 2 openmpi-cpi
> sles10oscar:/home/oscartst/openmpi #
>

_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
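
A side note on the `ompi_info | grep tm` check quoted above: a plain substring match also picks up `ptmalloc2`, so it can look like more tm components are present than really are. Below is a minimal sketch of a stricter check that only counts components named exactly `tm`. It assumes the output lines look like the three quoted in the message (with the wrapped line joined); the function name `frameworks_with_component` and the `SAMPLE` text are only illustrative and not part of OSCAR or Open MPI.

```python
import re

# Matches lines such as "MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.5)".
# Assumption: the line layout is the one shown in the quoted ompi_info output.
MCA_LINE = re.compile(r"^\s*MCA\s+(?P<framework>\S+):\s+(?P<component>\S+)\s+\(")


def frameworks_with_component(ompi_info_text, component="tm"):
    """Return the MCA frameworks that list a component with exactly this name."""
    found = []
    for line in ompi_info_text.splitlines():
        match = MCA_LINE.match(line)
        if match and match.group("component") == component:
            found.append(match.group("framework"))
    return found


# Sample text copied from the quoted message (hypothetical stand-in for a
# real `ompi_info` run on the node being checked).
SAMPLE = """\
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.5)
"""

if __name__ == "__main__":
    # Only the frameworks whose component is literally "tm" are reported.
    print(frameworks_with_component(SAMPLE))
```

Running the same check against the output captured on a client node and on the head node would be one way to compare what each side actually has installed; this only inspects text and does not say anything about why the TORQUE spawn fails.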
