I wonder whether my finding that -np 108 with -ntomp 2 works best comes from
using -multi 6 with 8-CPU nodes. That level of parallelism may be
necessary to trigger automatic separation of PP and PME ranks. I'm not
sure whether I tried -np 54 with -ntomp 4, which would probably also do it. I
compared mostly on 196 CPUs, then found that going up to 216 was better than
196 with -ntomp 2, and that pure MPI (-ntomp 1) was considerably worse for both.
Would people recommend going back to 196, which allows 4 whole nodes per
replica, and playing with -npme and -ntomp_pme?
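
To make the arithmetic behind these settings explicit, here is a minimal sketch. The helper name replica_layout is hypothetical, and it assumes that the CPUs are shared equally between the -multi replicas and that every node has 8 CPUs, as described above.

```python
from fractions import Fraction


def replica_layout(total_cpus, ntomp, replicas, cpus_per_node):
    """Illustrative helper: split a -multi run evenly across replicas.

    Returns (total MPI ranks, ranks per replica, nodes per replica).
    Assumes every replica gets the same share of CPUs.
    """
    if total_cpus % (ntomp * replicas):
        raise ValueError("total_cpus must divide evenly into replicas x ntomp")
    ranks_total = total_cpus // ntomp
    ranks_per_replica = ranks_total // replicas
    # Kept as an exact fraction so partial nodes are visible.
    nodes_per_replica = Fraction(total_cpus, replicas * cpus_per_node)
    return ranks_total, ranks_per_replica, nodes_per_replica


for ntomp in (2, 4):
    ranks, per_replica, nodes = replica_layout(216, ntomp, 6, 8)
    print(f"-np {ranks} -ntomp {ntomp}: {per_replica} ranks and "
          f"{float(nodes)} nodes per replica")
```

Under these assumptions, 216 CPUs give each replica 4.5 nodes, so some nodes are shared between two replicas.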
> Hi Thanh Le,
> Assuming all the nodes are the same (9 nodes with 12 CPUs each), you could
> try the following:
> mpirun -np 9 --map-by node mdrun -ntomp 12 ...
> mpirun -np 18 mdrun -ntomp 6 ...
> mpirun -np 54 mdrun -ntomp 2 ...
> Which of these works best will depend on your setup.
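
The three commands above are instances of one rule: the number of MPI ranks times the number of OpenMP threads per rank equals the total CPU count. A small sketch that lists every such split follows. The function name hybrid_splits is hypothetical, and it assumes identical nodes and that a rank never spans two nodes.

```python
def hybrid_splits(nodes, cpus_per_node):
    """Return (np, ntomp) pairs that fill every CPU with whole ranks per node."""
    splits = []
    for ntomp in range(1, cpus_per_node + 1):
        # Only thread counts that divide the node evenly keep ranks on one node.
        if cpus_per_node % ntomp == 0:
            ranks_per_node = cpus_per_node // ntomp
            splits.append((nodes * ranks_per_node, ntomp))
    return splits


# 9 nodes with 12 CPUs each, as assumed in the message above.
for np_ranks, ntomp in hybrid_splits(9, 12):
    print(f"mpirun -np {np_ranks} mdrun -ntomp {ntomp} ...")
```

This only enumerates candidates; which one is fastest still has to be measured on the actual system.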
> Using the whole cluster for one job may not be the most efficient way. I
> found on our cluster that once I reach 216 CPUs (equivalent settings from
> the queuing system to -np 108 and -ntomp 2), I can't do better by adding
> more nodes (where presumably communication becomes an issue). In addition
> to running -multi or -multidir jobs, which takes the load off
> communication a bit, it may also be worth having separate jobs and using
> -pin on and -pinoffset.
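
As an illustration of the separate-jobs idea, the sketch below computes one -pinoffset value per job so that jobs sharing a node are pinned to disjoint logical CPUs. The helper name pin_offsets is hypothetical. The sketch assumes one thread per logical CPU, which is why the printed commands also set -pinstride 1; whether that stride suits a given node (for example one with two hardware threads per core) is an assumption to check, not a recommendation.

```python
def pin_offsets(cpus_per_node, threads_per_job):
    """Return one -pinoffset value per job so jobs occupy disjoint logical CPUs."""
    if cpus_per_node % threads_per_job:
        raise ValueError("threads_per_job must divide cpus_per_node")
    # Job k starts at logical CPU k * threads_per_job.
    return list(range(0, cpus_per_node, threads_per_job))


# Two 6-thread jobs on one 12-CPU node (illustrative values).
for offset in pin_offsets(12, 6):
    print(f"mdrun -ntomp 6 -pin on -pinoffset {offset} -pinstride 1 ...")
```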
> Best wishes
>> Hi everyone,
>> I have a question concerning running gromacs in parallel. I have read
>> but I still don't quite understand how to run it efficiently.
>> My gromacs version is 4.5.4
>> The cluster I am using has CPUs total: 108 and 4 hosts up.
>> The node I am using:
>> Architecture: x86_64
>> CPU op-mode(s): 32-bit, 64-bit
>> Byte Order: Little Endian
>> CPU(s): 12
>> On-line CPU(s) list: 0-11
>> Thread(s) per core: 2
>> Core(s) per socket: 6
>> Socket(s): 1
>> NUMA node(s): 1
>> Vendor ID: AuthenticAMD
>> CPU family: 21
>> Model: 2
>> Stepping: 0
>> CPU MHz: 1400.000
>> BogoMIPS: 5200.57
>> Virtualization: AMD-V
>> L1d cache: 16K
>> L1i cache: 64K
>> L2 cache: 2048K
>> L3 cache: 6144K
>> NUMA node0 CPU(s): 0-11
>> MPI is already installed. I also have permission to use the cluster as
>> much as I can.
>> My question is: how should I write my mdrun command to utilize all
>> possible cores and nodes?
>> Thanh Le
Gromacs Users mailing list
* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a
mail to gmx-users-requ...@gromacs.org.