Thanks, Brian! I think it works now. So I suppose I should leave the lamboot command inside every script I run and not worry about running lamboot outside the PBS environment at all.
I do have a few other questions that I hope you can answer. I have two slave nodes and four processors, and PBS recognized all of them at setup. Using the mpirun -np 2 option, I see that the process runs on two processors of the same node. How do I make the job run on both slave nodes and on all four processors? I have tried a variety of options, but none of them seems to work. I am pretty sure it is fairly easy, but I can't seem to get it. All I need is a way to control the number of processors the job runs on. Thanks.

Senthil

"Brian W. Barrett" wrote:
>
> This is an easy one (although probably not obvious until you stop and
> think about it). When using PBS, you are never supposed to do anything
> "outside of PBS" - you should use PBS for everything, including starting
> up the LAM runtime environment (running lamboot). This is actually pretty
> easy - in your case, all you need to do is have a PBS script that looks
> something like this:
>
> ----
> #PBS -S /bin/sh
> #PBS -l nodes=1
> #PBS -q normal
> #PBS -N tielema
> #PBS -j oe
> echo "I ran on `hostname`"
>
> # bring up LAM environment on allocated nodes
> # (which are listed in the file named by $PBS_NODEFILE)
> lamboot $PBS_NODEFILE
>
> # use mpirun to run my MPI binary with all allocated nodes
> mpirun C ls
>
> # shut down LAM environment
> lamhalt
> ----
>
> You were getting the "no lamd running" error message because LAM tries to
> be smart under PBS. If set up a certain way, PBS can allocate two separate
> jobs to the same node. If LAM weren't smart, the two jobs would collide -
> so we have to play some tricks to avoid colliding - hence the error
> message that you saw. Since your lamboot was not under the same PBS job,
> it was completely invisible to the mpirun running under PBS.
>
> > Hope to get this "computer" problem fixed ASAP so that I can worry about
> > "science"
>
> That's what we're here for. Hopefully, this should be the end of it :).
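[Editor's note on the follow-up question about using all four processors: a minimal sketch of one common approach, assuming the PBS installation accepts the nodes=N:ppn=M resource syntax and that LAM's "C" specifier starts one process per CPU listed in the boot schema. The binary name ./my_mpi_program is a placeholder, not from the original thread.]

```shell
#PBS -S /bin/sh
#PBS -l nodes=2:ppn=2
#PBS -q normal
#PBS -j oe

# $PBS_NODEFILE lists each allocated node once per requested processor,
# so with nodes=2:ppn=2 it describes 4 CPUs across both slave nodes.
lamboot $PBS_NODEFILE

# "C" asks mpirun to start one process per CPU in the boot schema,
# i.e. 4 processes spread over both nodes. An explicit count such as
# "mpirun -np 4 ./my_mpi_program" should behave equivalently here.
mpirun C ./my_mpi_program

# shut down the LAM environment when the job finishes
lamhalt
```

The key change from the single-node case is the resource request: nodes=2:ppn=2 asks PBS for two processors on each of two nodes, and lamboot then boots LAM on everything PBS allocated, so mpirun can span both nodes instead of packing onto one.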
>
> Brian

_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
