Communication between most instances will use some form of Ethernet, so unless carefully set up it will have rather high latency; thread-MPI is quite well optimized for a single node. Perhaps check performance on a single V100 GPU coupled with a decent CPU and compare that to the p3dn.24xlarge to determine the best place to run. The usual GROMACS pipeline involves several steps, so measuring just one part may not be representative of the typical workflow.
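For example, a single-node, single-GPU baseline could be run roughly like this (a sketch, assuming a non-MPI build installed as gmx so thread-MPI is available; the thread counts and step numbers are guesses to tune per instance):

gmx mdrun -s step6.1_equilibration.tpr -ntmpi 1 -ntomp 8 -gpu_id 0 \
    -nsteps 10000 -resetstep 5000 -deffnm bench_1gpu

Here -ntmpi 1 -ntomp 8 runs one rank with eight OpenMP threads on GPU 0, and -resetstep 5000 resets the cycle counters mid-run so startup cost is excluded from the reported ns/day.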

On 3/9/19 12:40 AM, Carlos Rivas wrote:
Benson,
When I was testing on a single machine, performance was moving by leaps and 
bounds, like this:

-- 2 hours on a c5.2xlarge
-- 68 minutes on a p2.xlarge
-- 18 minutes on a p3.2xlarge
-- 7 minutes on a p3dn.24xlarge

It's only when I switched to using clusters that things went downhill and I 
haven't been able to beat the above numbers by throwing more CPUs and GPUs at 
it.

CJ


-----Original Message-----
From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se 
<gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Benson Muite
Sent: Friday, March 8, 2019 4:19 PM
To: gromacs.org_gmx-users@maillist.sys.kth.se
Subject: Re: [gmx-users] gromacs performance

You seem to be using a relatively large number of GPUs. You may want to check your 
input data (many cases will not scale well, but ensemble runs can be quite 
common). Perhaps check the speedup in going from 1 to 2 to 4 GPUs on one node.
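A rough sketch of such a scan (assuming a thread-MPI build named gmx and a benchmark input bench.tpr; with n ranks the -gpu_id string lists one GPU ID per rank):

for n in 1 2 4 8; do
    ids=$(printf '%d' $(seq 0 $((n - 1))))   # "0", "01", "0123", "01234567"
    gmx mdrun -s bench.tpr -ntmpi $n -ntomp 8 -gpu_id $ids \
        -nsteps 10000 -resetstep 5000 -deffnm bench_${n}gpu
done

Comparing the ns/day reported in each bench_${n}gpu.log should show where scaling flattens out for this system (adjust -ntomp so ranks times threads matches the cores available).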

On 3/9/19 12:11 AM, Carlos Rivas wrote:
Hey guys,
Anybody running GROMACS on AWS?

I have a strong IT background, but zero understanding of GROMACS or
OpenMPI (even less using SGE on AWS). Just trying to help some PhD folks 
with their work.

When I run GROMACS using thread-MPI on a single, very large node on AWS, things 
work fairly fast.
However, when I switch from thread-MPI to OpenMPI, even though everything is 
detected properly, the performance is horrible.

This is what I am submitting to SGE:

ubuntu@ip-10-10-5-81:/shared/charmm-gui/gromacs$ cat sge.sh
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -e out.err
#$ -o out.out
#$ -pe mpi 256

cd /shared/charmm-gui/gromacs
touch start.txt
/bin/bash /shared/charmm-gui/gromacs/run_eq.bash
touch end.txt
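(The "#$ -pe mpi 256" line is what requests 256 slots from SGE, which is presumably where the 256 MPI ranks reported below come from.)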

and this is my test script, provided by one of the doctors:

ubuntu@ip-10-10-5-81:/shared/charmm-gui/gromacs$ cat run_eq.bash
#!/bin/bash
export GMXMPI="/usr/bin/mpirun --mca btl ^openib /shared/gromacs/5.1.5/bin/gmx_mpi"

export MDRUN="mdrun -ntomp 2 -npme 32"

export GMX="/shared/gromacs/5.1.5/bin/gmx_mpi"

for comm in min eq; do
    if [ $comm == min ]; then
        echo ${comm}
        $GMX grompp -f step6.0_minimization.mdp -o step6.0_minimization.tpr -c step5_charmm2gmx.pdb -p topol.top
        $GMXMPI $MDRUN -deffnm step6.0_minimization
    fi

    if [ $comm == eq ]; then
        for step in `seq 1 6`; do
            echo $step
            if [ $step -eq 1 ]; then
                echo ${step}
                $GMX grompp -f step6.${step}_equilibration.mdp -o step6.${step}_equilibration.tpr -c step6.0_minimization.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
                $GMXMPI $MDRUN -deffnm step6.${step}_equilibration
            fi
            if [ $step -gt 1 ]; then
                old=`expr $step - 1`
                echo $old
                $GMX grompp -f step6.${step}_equilibration.mdp -o step6.${step}_equilibration.tpr -c step6.${old}_equilibration.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
                $GMXMPI $MDRUN -deffnm step6.${step}_equilibration
            fi
        done
    fi
done
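(With those variables expanded, each step therefore launches as

/usr/bin/mpirun --mca btl ^openib /shared/gromacs/5.1.5/bin/gmx_mpi mdrun -ntomp 2 -npme 32 -deffnm step6.0_minimization

which matches the command line echoed in the log below.)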




During the output I see this and get really excited, expecting blazing 
speeds, and yet it's much worse than a single node:

Command line:
    gmx_mpi mdrun -ntomp 2 -npme 32 -deffnm step6.0_minimization


Back Off! I just backed up step6.0_minimization.log to
./#step6.0_minimization.log.6#

Running on 4 nodes with total 128 cores, 256 logical cores, 32 compatible GPUs
    Cores per node:           32
    Logical cores per node:   64
    Compatible GPUs per node:  8
    All nodes have identical type(s) of GPUs
Hardware detected on host ip-10-10-5-89 (the node of MPI rank 0):
    CPU info:
      Vendor: GenuineIntel
      Brand:  Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
      SIMD instructions most likely to fit this hardware: AVX2_256
      SIMD instructions selected at GROMACS compile time: AVX2_256
    GPU info:
      Number of GPUs detected: 8
      #0: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #1: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #2: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #3: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #4: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #5: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #6: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
      #7: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible

Reading file step6.0_minimization.tpr, VERSION 5.1.5 (single precision)
Using 256 MPI processes
Using 2 OpenMP threads per MPI process

On host ip-10-10-5-89 8 compatible GPUs are present, with IDs 0,1,2,3,4,5,6,7
On host ip-10-10-5-89 8 GPUs auto-selected for this run.
Mapping of GPU IDs to the 56 PP ranks in this node:
0,0,0,0,0,0,0,1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,5,5,5,5,5,5,5,6,6,6,6,6,6,6,7,7,7,7,7,7,7
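If I read that right: 256 ranks total, of which -npme 32 gives 32 dedicated PME ranks (8 per node), leaving 224 PP ranks, i.e. 56 per node, so 7 PP ranks end up sharing each of the 8 GPUs.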



Any suggestions? Greatly appreciate the help.


Carlos J. Rivas
Senior AWS Solutions Architect - Migration Specialist
