Hello! I am new to GROMACS, and especially to parallel runs. I am having problems running my system with domain decomposition. I apologize in advance for the long mail.
My system consists of a membrane protein embedded in a mixed lipid bilayer (POPE/POPG). I tried to run it on 8 nodes, but the simulation crashed with "A charge group moved too far between two domain decomposition steps" and showed a high percentage of load imbalance between nodes. I then tested the same run on a single node and it worked. Next I tried different numbers of nodes, changing the DD grid (using the mdrun -dd option), and it seems that the more nodes I use, the lower the performance in ns/day, although the load imbalance % is highly variable. From these tests I found that the best setup for my system is 6 nodes with a DD grid of 3 x 2 x 1 (vol 0.80 imb F 1%).

---- from the log file:
Initializing Domain Decomposition on 6 nodes
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 0.703 nm, LJ-14, atoms 21579 21587
  multi-body bonded interactions: 2.038 nm, Proper Dih., atoms 20434 20405
Minimum cell size due to bonded interactions: 2.241 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.820 nm
Estimated maximum distance required for P-LINCS: 0.820 nm
Domain decomposition grid 3 x 2 x 1, separate PME nodes 0
Domain decomposition nodeid 0, coordinates 0 0 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Will do PME sum in reciprocal space.
...
Making 2D domain decomposition grid 3 x 2 x 1, home cell index 0 0 0
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0: Protein_POPE_POPG
  1: SOL_NA+_CL-
There are: 82711 Atoms
Charge group distribution at step 2000000: 5512 5533 5668 5694 5535 5626
Grid: 7 x 10 x 14 cells
Initial temperature: 309.243 K
--------

With this setup I finally managed to equilibrate my system by going through a series of restrained runs.
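As a side check on the log excerpt above, here is a minimal Python sketch that only does arithmetic on numbers copied from that log. It is not a GROMACS tool. The function name `count_imbalance` and the "max over mean" definition of imbalance are assumptions made for illustration, and this count-based figure is not the same quantity as the force load imbalance ("imb F") that mdrun reports.

```python
# Values copied from the log excerpt above; nothing here calls GROMACS.
charge_groups = [5512, 5533, 5668, 5694, 5535, 5626]  # per-domain counts at step 2000000
max_multibody_nm = 2.038  # "multi-body bonded interactions" distance
min_cell_nm = 2.241       # "Minimum cell size due to bonded interactions"


def count_imbalance(counts):
    """Return max/mean - 1: how far the fullest domain exceeds the average.

    Illustrative definition only (an assumption), not mdrun's "imb F".
    """
    mean = sum(counts) / len(counts)
    return max(counts) / mean - 1.0


imbalance = count_imbalance(charge_groups)

# Ratio of the logged minimum cell size to the longest bonded distance.
margin = min_cell_nm / max_multibody_nm

print(f"charge-group count imbalance: {imbalance:.1%}")
print(f"min cell size / longest bonded distance: {margin:.3f}")
```

The charge groups come out spread almost evenly over the six domains (under 2% by this measure). The minimum cell size is about 1.10 times the longest multi-body bonded distance, which suggests that the 2.038 nm proper dihedral between atoms 20434 and 20405 is what sets the smallest allowed domain size. That reading is inferred from the numbers, not stated in the log.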
Surprisingly, after 6.5 ns of unrestrained simulation (step 3289500), the run crashed with:

"Fatal error: A charge group moved too far between two domain decomposition steps
This usually means that your system is not well equilibrated"

It seems strange that it crashes only after 3289500 steps of an unrestrained run. I am now running a short simulation on a single processor, starting shortly before the crash step, and, as I suspected, it is running smoothly. My guess is that something is going wrong with the domain decomposition of such a non-homogeneous system, and that the charged lipids complicate it further, but I have no idea how to solve or improve this. I am using gromacs-4.0.5. Any suggestions are welcome.

--
Irene Farabella
Wellcome Trust PhD student
Institute of Structural and Molecular Biology
Department of Biological Sciences
Birkbeck University of London
Malet Street
London WC1E 7HX
Telephone +44 (0)20 7631 6815