On Tue, 24 Dec 2002, Daniel Escobar wrote:

> I've set up a cluster(12 nodes) with Red Hat 7.3 and Oscar 2.1.
>
> When I submit a job to the system (using qsub -l ..),
>
> qsub -l nodes=4:ppn=1:all script.sh
>
> #### output of switcher mpi --show
> user:default=mpich-1.2.4
> system:exists=true
>
> ####  output of my_stderr.txt (summary)
> *** Oops -- I cannot open the LAM help file.
> *** I tried looking for it in the following places:
> [snipped]
>
> #### output of my_stdout.txt
> [daniel@troquil daniel]$ cat my_stdout.txt
> Launchnode is nodo9.xxx.yyy.cl
> pbsdsh: task 0 exit status 215
> [snipped]

This seems to be inconsistent: switcher shows that you should have MPICH
defined as your MPI, but you are getting a LAM/MPI error message, and
you're also getting pbsdsh output.  These are three different things (at
least in OSCAR) -- I'm not quite sure how you're getting all of these to
happen.
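
One way to narrow this down, as a rough sketch only, is to have the job
itself report which MPI it sees.  The helper below is hypothetical (the
name report_mpi_env is made up and is not part of OSCAR or switcher); it
assumes the MPI launcher is called mpirun and simply prints the node name
and the first mpirun found on the PATH.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of OSCAR or switcher.
# Prints the node name and the first mpirun found on the PATH.
report_mpi_env() {
    echo "host: $(uname -n)"
    if mpirun_path=$(command -v mpirun); then
        echo "mpirun: $mpirun_path"
    else
        echo "mpirun: not found"
    fi
}

report_mpi_env
```

If the reported path points into a LAM/MPI directory while switcher says
MPICH, the job's environment is probably not the one switcher set up.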

Some questions:

1. What are the contents of your script.sh file?
2. I'm assuming that you're submitting this as a regular user (not root),
   and that that user's $HOME is under /home, and is therefore NFS mounted
   on all compute nodes.  Is this correct?
3. What is the output of "switcher mpi --show" on all nodes (as that
   user)?
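
For question 3, something along these lines could collect the output from
every node.  This is only a sketch under assumptions: show_on_nodes is a
made-up helper, the node names are whatever you pass on the command line
(only nodo9 appears in your output, so the rest are yours to fill in), and
it assumes that user can ssh to the compute nodes without a password.  The
REMOTE_SH variable is just a hook for substituting a different remote
shell.

```shell
#!/bin/sh
# Hypothetical helper: run "switcher mpi --show" on each named node and
# label the output with the node name.  Assumes passwordless ssh works.
REMOTE_SH=${REMOTE_SH:-ssh}

show_on_nodes() {
    for node in "$@"; do
        echo "== $node =="
        $REMOTE_SH "$node" switcher mpi --show ||
            echo "(command failed on $node)"
    done
}

# Run only when node names are given as arguments.
if [ "$#" -gt 0 ]; then
    show_on_nodes "$@"
fi
```

Saved under a name of your choosing and run as that user with the node
names as arguments, it prints one labeled section per node, so any node
whose default differs from the others stands out.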

-- 
{+} Jeff Squyres
{+} [EMAIL PROTECTED]
{+} http://www.lam-mpi.org/


_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users