Re: [OMPI users] MPI Behaviour Question

2016-10-12 Thread Mark Potter
After the responses I did more testing. Even $(hostname) and `hostname`
get expanded on the first node. A script using echo with any of them
(from the environment variable to the backticks) works. From my limited
testing, I'm guessing all shell expansion on the CLI happens on the
first node. That explanation makes sense and fits the results. It's easy
enough to explain as well!
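
A minimal sketch of the kind of wrapper script described above. The file name hello.sh is made up for illustration, and the mpirun line appears only as a comment because it needs a job allocation. The quoted here-document delimiter keeps $(hostname) literal in the file, so it is expanded only when the script itself runs.

```shell
#!/bin/sh
# Write a small wrapper script (hypothetical name: hello.sh).
# Quoting 'EOF' stops this shell from expanding $(hostname) now.
cat > hello.sh <<'EOF'
#!/bin/sh
echo "Hello from $(hostname)"
EOF
chmod +x hello.sh

# Inside a Torque job this would be launched as:
#   mpirun ./hello.sh
# so each rank starts its own shell and expands $(hostname) locally.

# Local dry run of the wrapper, without mpirun:
sh ./hello.sh
```

Because every rank runs the script in its own shell, the command substitution is evaluated on whichever node that rank landed on.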

On Tue, 2016-10-11 at 22:17 +0900, Gilles Gouaillardet wrote:
> Mark,
> 
> My understanding is that shell meta expansion occurs once on the
> first node, so from an Open MPI point of view, you really invoke
> 
> mpirun echo node0
> 
> I suspect
> 
> mpirun echo 'Hello from $(hostname)'
> 
> is what you want to do. I do not know about
> 
> mpirun echo 'Hello from $HOSTNAME'
> 
> $HOSTNAME might be passed by the first node to all tasks, and hence
> might not have the value you expect on all the nodes. Feel free to run
> 
> mpirun env | grep ^HOSTNAME=
> 
> to check if the HOSTNAME variable is set to what you expect.
> 
> /* I am AFK, so I cannot check that right now ... */
> 
> 
> Cheers,
> 
> Gilles
> 
> Mark Potter  wrote:
> > 
> > This question is related to OpenMPI 2.0.1 compiled with GCC 4.8.2 on
> > RHEL 6.8 using Torque 6.0.2 with Moab 9.0.2. To be clear, I am an
> > administrator and not a coder and I suspect this is expected behavior
> > but I have been asked by a client to explain why this is happening.
> > 
> > Using Torque, the following command returns the hostname of the first
> > node only, regardless of how the nodes/cores are split up:
> > 
> > mpirun -np 20 echo "Hello from $HOSTNAME"
> > 
> > (the behaviour is the same with "echo $(hostname)")
> > 
> > The Torque script looks like this:
> > 
> > #PBS -V
> > #PBS -N test-job
> > #PBS -l nodes=2:ppn=16
> > #PBS -e ERROR
> > #PBS -o OUTPUT
> > 
> > 
> > cd $PBS_O_WORKDIR
> > date
> > cat $PBS_NODEFILE
> > 
> > mpirun -np 32 echo "Hello from $HOSTNAME"
> > 
> > If the echo statement is replaced with "hostname" then a proper
> > response is received from all nodes.
> > 
> > While I know there are better ways to test OpenMPI's functionality,
> > like compiling and using the programs in examples/, this is the method
> > a specific client chose. I was using both the examples and a Torque job
> > script calling just "hostname" as a command and not using echo and the
> > client was using the script above. It took some doing to figure out why
> > he thought it wasn't working and all my tests were successful and when
> > I figured it out, he wanted an explanation that's beyond my current
> > knowledge. Any help towards explaining the behaviour would be greatly
> > appreciated.
> > 
-- 
Regards,

Mark L. Potter
Senior Consultant
PCPC Direct, Ltd.
O: 713-344-0952 
M: 713-965-4133
S: mpot...@pcpcdirect.com
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI Behaviour Question

2016-10-11 Thread Reuti
Hi,

> On 11.10.2016 at 14:56, Mark Potter wrote:
> 
> This question is related to OpenMPI 2.0.1 compiled with GCC 4.8.2 on
> RHEL 6.8 using Torque 6.0.2 with Moab 9.0.2. To be clear, I am an
> administrator and not a coder and I suspect this is expected behavior
> but I have been asked by a client to explain why this is happening.
> 
> Using Torque, the following command returns the hostname of the first
> node only, regardless of how the nodes/cores are split up:
> 
> mpirun -np 20 echo "Hello from $HOSTNAME"

The shell expands $HOSTNAME and passes the result as an argument before
`mpirun` even starts. Instead, the variable has to be evaluated on the nodes:

$ mpirun bash -c "echo \$HOSTNAME"
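
To see what the launcher actually receives in each case, the following sketch may help. It does not call mpirun at all: show_args is a hypothetical stand-in that only prints the argument vector it was handed, and node0 is an assumed host name.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for mpirun: it only prints the
# arguments it receives, one bracketed argument per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Assumed value for illustration; on a cluster this would be the node
# on which the job script itself is running.
HOSTNAME=node0

echo "double quotes:"
show_args echo "Hello from $HOSTNAME"

echo "single quotes:"
show_args echo 'Hello from $HOSTNAME'

echo "escaped inside bash -c:"
show_args bash -c "echo \$HOSTNAME"
```

With double quotes the stand-in already sees "Hello from node0". With single quotes it sees the literal text, which a directly started echo would print unchanged because no shell is left to expand it. Only the bash -c form hands an unexpanded $HOSTNAME to a shell that runs on the target node.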


> (the behaviour is the same with "echo $(hostname)")
> 
> The Torque script looks like this:
> 
> #PBS -V
> #PBS -N test-job
> #PBS -l nodes=2:ppn=16
> #PBS -e ERROR
> #PBS -o OUTPUT
> 
> 
> cd $PBS_O_WORKDIR
> date
> cat $PBS_NODEFILE
> 
> mpirun -np 32 echo "Hello from $HOSTNAME"
> 
> If the echo statement is replaced with "hostname" then a proper
> response is received from all nodes.
> 
> While I know there are better ways to test OpenMPI's functionality,
> like compiling and using the programs in examples/, this is the method
> a specific client chose.

There are small "Hello world" programs like here:

http://mpitutorial.com/tutorials/mpi-hello-world/

to test whether e.g. the libraries are found at runtime by the application(s).

-- Reuti


> I was using both the examples and a Torque job
> script calling just "hostname" as a command and not using echo and the
> client was using the script above. It took some doing to figure out why
> he thought it wasn't working and all my tests were successful and when
> I figured it out, he wanted an explanation that's beyond my current
> knowledge. Any help towards explaining the behaviour would be greatly
> appreciated.
> 


Re: [OMPI users] MPI Behaviour Question

2016-10-11 Thread Gilles Gouaillardet
Mark,

My understanding is that shell meta expansion occurs once on the first node, so
from an Open MPI point of view, you really invoke

mpirun echo node0

I suspect

mpirun echo 'Hello from $(hostname)'

is what you want to do. I do not know about

mpirun echo 'Hello from $HOSTNAME'

$HOSTNAME might be passed by the first node to all tasks, and hence might not
have the value you expect on all the nodes. Feel free to run

mpirun env | grep ^HOSTNAME=

to check if the HOSTNAME variable is set to what you expect.
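
The concern about an inherited HOSTNAME can be reproduced locally without a cluster. This is only a sketch, not an Open MPI test: it assumes the environment of the first node is forwarded to the tasks (whether that happens for HOSTNAME depends on the setup), and node0 is a made-up value.

```shell
#!/bin/sh
# Assumed value for illustration: pretend the first node exported this.
HOSTNAME=node0
export HOSTNAME

# A child process inherits the exported value unchanged ...
inherited=$(sh -c 'echo "$HOSTNAME"')

# ... while uname -n asks the kernel for the name of the machine the
# process is actually running on.
actual=$(uname -n)

echo "inherited HOSTNAME: $inherited"
echo "actual node name:   $actual"
```

If the two lines differ, the variable is reporting where the environment came from rather than where the process runs, which is why calling hostname (or uname -n) on each node is the more reliable check.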

/* I am AFK, so I cannot check that right now ... */


Cheers,

Gilles

Mark Potter  wrote:
>This question is related to OpenMPI 2.0.1 compiled with GCC 4.8.2 on
>RHEL 6.8 using Torque 6.0.2 with Moab 9.0.2. To be clear, I am an
>administrator and not a coder and I suspect this is expected behavior
>but I have been asked by a client to explain why this is happening.
>
>Using Torque, the following command returns the hostname of the first
>node only, regardless of how the nodes/cores are split up:
>
>mpirun -np 20 echo "Hello from $HOSTNAME"
>
>(the behaviour is the same with "echo $(hostname)")
>
>The Torque script looks like this:
>
>#PBS -V
>#PBS -N test-job
>#PBS -l nodes=2:ppn=16
>#PBS -e ERROR
>#PBS -o OUTPUT
>
>
>cd $PBS_O_WORKDIR
>date
>cat $PBS_NODEFILE
>
>mpirun -np 32 echo "Hello from $HOSTNAME"
>
>If the echo statement is replaced with "hostname" then a proper
>response is received from all nodes.
>
>While I know there are better ways to test OpenMPI's functionality,
>like compiling and using the programs in examples/, this is the method
>a specific client chose. I was using both the examples and a Torque job
>script calling just "hostname" as a command and not using echo and the
>client was using the script above. It took some doing to figure out why
>he thought it wasn't working and all my tests were successful and when
>I figured it out, he wanted an explanation that's beyond my current
>knowledge. Any help towards explaining the behaviour would be greatly
>appreciated.
>