Hi, all of the tests were successful on my cluster network except the openmpi one.
By the way, I tried to run lamboot hosts.txt on the server node, but these
errors showed up:
LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University
ERROR: LAM/MPI unexpectedly received the following on stderr:
bash: hboot: command not found
-----------------------------------------------------------------------------
LAM failed to execute a LAM binary on the remote node "hpc-linux01".
Since LAM was already able to determine your remote shell as "hboot",
it is probable that this is not an authentication problem.
*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.
LAM tried to use the remote agent command "/usr/bin/ssh"
to invoke the following command:
/usr/bin/ssh hpc-linux01 -n hboot -t -c lam-conf.lamd -s -I '"-H 10.0.0.1 -P 33280 -n 1 -o 0"'
This can indicate several things. You should check the following:
- The LAM binaries are in your $PATH
- You can run the LAM binaries
- The $PATH variable is set properly before your
.cshrc/.profile exits
Try to invoke the command listed above manually at a Unix prompt.
You will need to configure your local setup such that you will *not*
be prompted for a password to invoke this command on the remote node.
No output should be printed from the remote node before the output of
the command is displayed.
When you can get this command to execute successfully by hand, LAM
will probably be able to function properly.
-----------------------------------------------------------------------------
ERROR: LAM/MPI unexpectedly received the following on stderr:
bash: tkill: command not found
-----------------------------------------------------------------------------
LAM failed to execute a LAM binary on the remote node "hpc-linux01".
Since LAM was already able to determine your remote shell as "tkill",
it is probable that this is not an authentication problem.
*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.
LAM tried to use the remote agent command "/usr/bin/ssh"
to invoke the following command:
/usr/bin/ssh hpc-linux01 -n tkill
This can indicate several things. You should check the following:
- The LAM binaries are in your $PATH
- You can run the LAM binaries
- The $PATH variable is set properly before your
.cshrc/.profile exits
Try to invoke the command listed above manually at a Unix prompt.
You will need to configure your local setup such that you will *not*
be prompted for a password to invoke this command on the remote node.
No output should be printed from the remote node before the output of
the command is displayed.
When you can get this command to execute successfully by hand, LAM
will probably be able to function properly.
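The checks suggested in the message above could be scripted roughly as follows. This is only a sketch: the function name check_remote_bins and the RSH override variable are made up for illustration, and it assumes password-less ssh to the node already works. It asks the remote node's non-interactive shell (the kind lamboot gets over ssh) whether each LAM binary resolves in $PATH.

```shell
#!/bin/sh
# Sketch only: check whether LAM binaries resolve in the PATH that a
# non-interactive remote shell sees, since that is the PATH lamboot gets.
# RSH is a hypothetical override; the log above shows /usr/bin/ssh as the agent.
RSH=${RSH:-/usr/bin/ssh}

# check_remote_bins NODE BIN...  (hypothetical helper name)
# Prints one line per binary; returns 1 if any binary was not found.
check_remote_bins() {
    node=$1
    shift
    rc=0
    for bin in "$@"; do
        # "command -v" succeeds only if the remote shell can resolve the name
        if $RSH -n "$node" "command -v $bin" >/dev/null 2>&1; then
            echo "$node: $bin found"
        else
            echo "$node: $bin NOT found in non-interactive PATH"
            rc=1
        fi
    done
    return $rc
}

# Example call, using the node and binaries named in the log:
#   check_remote_bins hpc-linux01 hboot tkill
```

If a binary is reported as not found here but is found in a normal login session on the node, the PATH is probably being set too late (or only for interactive shells) in the remote shell startup files, which matches the third item in the checklist above.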
What is the problem, and what should I do?
--
A.Nazemian
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users