Anna Marabotti wrote:
Dear all,
we installed GROMACS v. 3.3.2 on a cluster of 20 dual-processor nodes running
CentOS 4 x86_64, following the instructions on the GROMACS web site. We
compiled it in single precision, parallel version, using the --enable-mpi
option (LAM MPI was already present on the cluster). After the final
installation, I ran some calculations and found that the parallel scalability
of the software is very bad: when I use 3 or more processors, the performance
is the same as (or even worse than) with a single processor.
As benchmarks, I used two systems: the spider toxin peptide (from the GROMACS
tutorial) in a cubic box of 0.7 nm filled with 2948 water molecules (model:
spc216) (total: ~3200 atoms), and a dimeric protein of 718 aa in a cubic box
(box edge 0.7 nm) filled with 18425 water molecules (model: spc216) and 6 Na+
ions (total: ~62700 atoms). In both cases I used the GROMOS96 force field
(G43a1). I used the following commands:

There's a parallel benchmark system available from the GROMACS web page. I'd recommend using it, so that we can be confident there are no problems with the simulation system.

grompp_mpi -np "nofprocessors" -f pr.mdp -c systemmini.gro -o systempr.tpr -p system.top
mpirun -np "nofprocessors" mdrun_mpi -s systempr.tpr -o systempr.trr -c systempr.gro -e systempr.edr -g systempr.log

See man grompp about the use of -shuffle, which might help.
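
A sketch of what that might look like with your commands above, assuming your 3.3.2 grompp_mpi accepts the -shuffle and -sort options (check grompp -h on your build):

grompp_mpi -np "nofprocessors" -shuffle -sort -f pr.mdp -c systemmini.gro -o systempr.tpr -p system.top
mpirun -np "nofprocessors" mdrun_mpi -s systempr.tpr -o systempr.trr -c systempr.gro -e systempr.edr -g systempr.log

If I remember right, -shuffle also writes a deshuf.ndx index file, which you will need later (e.g. with trjconv -n deshuf.ndx) to restore the original atom order in the output.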

I paste here the results of the performances of position-restrained MD:

spider toxin peptide:

np      time            Mnbf/s  GFlops  ns/day  hours/ns
1       1m 51s 237      10.797  2.425   7.806   3.074
2       0m 59s 614      19.561  4.402   14.664  1.694
3       2m 11s 622      9.261   2.082   6.698   3.583
4       1m 58s 722      10.283  2.315   7.448   3.222
5       1m 40s 580      11.813  2.659   8.554   2.806
6       1m 50s 830      10.859  2.442   7.855   3.056
7       1m 49s 232      10.878  2.442   7.855   3.056
8       2m 2s 292       6.190   1.392   4.477   5.361
9       2m 5s 778       9.533   2.150   6.912   3.472
10      2m 22s 540      8.349   1.879   6.042   3.972

These runs are probably too short to give a good idea of scaling, because they could still be dominated by setup time. I would simulate for at least about 10 minutes to get a reliable measurement. However, merely increasing the run time won't fix the load-imbalance problem introduced by the absence of -shuffle and/or -sort.
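
For example (a sketch only, assuming a 2 fs timestep and taking the ~8 ns/day you measured for the peptide), in pr.mdp something like

dt      = 0.002   ; 2 fs timestep (assumed)
nsteps  = 50000   ; 100 ps, i.e. roughly 15-20 minutes at ~8 ns/day

would keep each benchmark running long enough that startup costs no longer dominate the timings.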

You can also look at the bottom of the .log file to get an idea of how (un)balanced the work is across the processes.

dimeric protein:

np      time            Mnbf/s  GFlops  ns/day  hours/ns
1       13m 51s 391     14.041  2.179   1.043   23.016
2       7m 45s 321      25.011  3.883   1.858   12.917
3       15m 26s 621     12.586  1.954   0.935   25.667
4       15m 15s 3       12.749  1.978   0.946   25.361
5       12m 35s 274     15.481  2.401   1.149   20.889
6       13m 23s 750     14.517  2.255   1.079   22.250
7       12m 25s 659     15.669  2.435   1.164   20.611
8       13m 0s 434      14.977  2.325   1.112   21.583
9       12m 12s 601     15.949  2.475   1.184   20.278
10      13m 1s 724      14.962  2.322   1.111   21.611

I saw on the GROMACS mailing list that this could be due to a problem with
communication between the nodes, but it seems to me that nobody has obtained
such bad results before. Does anybody have suggestions - apart from waiting
for the GROMACS 4.0 release ;-) - about further checks to do on the system, or
a different compilation/installation to try?

See above.

Mark