This question is related to OpenMPI 2.0.1 compiled with GCC 4.8.2 on
RHEL 6.8 using Torque 6.0.2 with Moab 9.0.2. To be clear, I am an
administrator, not a coder, and I suspect this is expected behaviour,
but I have been asked by a client to explain why it happens.

Using Torque, the following command returns the hostname of the first
node only, regardless of how the nodes/cores are split up:

mpirun -np 20 echo "Hello from $HOSTNAME"

(The behaviour is the same with "echo $(hostname)".)

The Torque script looks like this:

#PBS -V
#PBS -N test-job
#PBS -l nodes=2:ppn=16
#PBS -e ERROR
#PBS -o OUTPUT


cd $PBS_O_WORKDIR
date
cat $PBS_NODEFILE

mpirun -np 32 echo "Hello from $HOSTNAME"

If the echo statement is replaced with "hostname" then a proper
response is received from all nodes.
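
One likely explanation, offered as a sketch and not as a confirmed
diagnosis: inside double quotes, $HOSTNAME and $(hostname) are expanded
by the shell that runs the job script, on the first node, before mpirun
is ever started. Every rank is then handed the same finished string. The
script below imitates this without a cluster; fake_mpirun and the node
names node01/node02 are hypothetical stand-ins, not part of Open MPI.

```shell
#!/bin/sh
# fake_mpirun is a hypothetical stand-in for mpirun: it runs the given
# command once per pretend node, with HOSTNAME set in the child's environment.
fake_mpirun() {
    for node in node01 node02; do
        env HOSTNAME="$node" "$@"
    done
}

# Pretend the job script itself is running on the first node.
HOSTNAME=node01

# Double quotes: this shell expands $HOSTNAME before the launcher runs,
# so every child receives the already-expanded text.
early=$(fake_mpirun echo "Hello from $HOSTNAME")

# Single quotes plus sh -c: the text reaches each child unexpanded, and
# the child's own shell expands $HOSTNAME from its own environment.
late=$(fake_mpirun sh -c 'echo "Hello from $HOSTNAME"')

printf 'expanded by the launching shell:\n%s\n' "$early"
printf 'expanded in each child:\n%s\n' "$late"
```

If this is indeed the cause, a command along the lines of
mpirun -np 32 sh -c 'echo "Hello from $(hostname)"' should defer the
expansion to each rank. I have not tested that on this cluster.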

While I know there are better ways to test OpenMPI's functionality,
such as compiling and running the programs in examples/, this is the
method a specific client chose. I was testing with both the examples
and a Torque job script that called just "hostname", without echo,
while the client was using the script above. It took some doing to
work out why he thought it wasn't working when all of my tests
succeeded. Once I had figured that out, he wanted an explanation that
is beyond my current knowledge. Any help towards explaining the
behaviour would be greatly appreciated.

-- 
Regards,

Mark L. Potter
Senior Consultant
PCPC Direct, Ltd.
O: 713-344-0952 
M: 713-965-4133
S: mpot...@pcpcdirect.com