Re: [gmx-users] Help with a failing test - gromacs 2019.4 - Test 42
Hi Szilard,

I just found time to rerun the tests and your suggestion worked. Many thanks for this.

Regards,

T.

On Wed, 16 Oct 2019 at 13:43, Szilárd Páll wrote:

> To ensure the tests pass I suggest trying to force only one device to be
> used in make check, e.g. CUDA_VISIBLE_DEVICES=0 make check; alternatively
> you can run the regressiontests manually.
Re: [gmx-users] Help with a failing test - gromacs 2019.4 - Test 42
Hi,

The issue is an internal error triggered by the domain decomposition not liking the 14 cores in your CPU, which led to a rank count of 14 with a large prime factor (7). To ensure the tests pass, I suggest forcing only one device to be used in make check, e.g. CUDA_VISIBLE_DEVICES=0 make check; alternatively, you can run the regressiontests manually.

Cheers,
--
Szilárd

On Thu, Oct 10, 2019 at 6:01 PM Raymond Arter wrote:

> When performing a "make check" on Gromacs 2019.4, I'm getting test 42
> failing.
[gmx-users] Help with a failing test - gromacs 2019.4 - Test 42
Hi,

When performing a "make check" on Gromacs 2019.4, I'm getting test 42 failing.
It gives the error:

    Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8

And the mdrun.out and md.log of swap_x reports:

    The number of ranks you selected (14) contains a large prime factor 7.

I've included the necessary parts of the logs below. Any help would be appreciated
since I haven't come across this error before.

Regards,

T.

CentOS Linux release 7.6.1810 (Core)
CPU: Intel Xeon Gold 6132
Tesla V100
Cuda: 10.1
Driver: 418.40.04

Output of "make check"

42/46 Test #42: regressiontests/complex .***Failed 145.88 sec

GROMACS: gmx mdrun, version 2019.4
Executable: /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
Data prefix: /gromacs/2019.4/gromacs-2019.4 (source tree)
Working dir: /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4
Command line:
gmx mdrun -h

Thanx for Using GROMACS - Have a Nice Day

Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.

Abnormal return value for ' gmx mdrun-nb cpu -notunepme >mdrun.out 2>&1' was 1
Retrying mdrun with better settings...
Re-running orientation-restraints using CPU-based PME
Re-running pull_geometry_angle using CPU-based PME
Re-running pull_geometry_angle-axis using CPU-based PME
Re-running pull_geometry_dihedral using CPU-based PME

Abnormal return value for ' gmx mdrun -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_x for swap_x

Abnormal return value for ' gmx mdrun -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_y for swap_y

Abnormal return value for ' gmx mdrun -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_z for swap_z
3 out of 55 complex tests FAILED

From the following directory:

/gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x

and I get the same errors for swap_y and swap_z

== mdrun.out ==

GROMACS: gmx mdrun, version 2019.4
Executable: /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
Data prefix: /gromacs/2019.4/gromacs-2019.4 (source tree)
Working dir: /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x
Command line:
gmx mdrun -notunepme

Reading file topol.tpr, VERSION 2019.4 (single precision)
Changing nstlist from 10 to 50, rlist from 1.011 to 1.137

---
Program: gmx mdrun, version 2019.4
Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
MPI rank:0 (out of 14)

Fatal error:
The number of ranks you selected (14) contains a large prime factor 7. In most
cases this will lead to bad performance. Choose a number with smaller prime
factors or set the decomposition (option -dd) manually.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
---

== md.out ==

Changing nstlist from 10 to 50, rlist from 1.011 to 1.137

Initializing Domain Decomposition on 14 ranks
Dynamic load balancing: locked
Minimum cell size due to atom displacement: 0.692 nm
Initial maximum distances in bonded interactions:
two-body bonded interactions: 0.403 nm, Exclusion, atoms 184 187
multi-body bonded interactions: 0.403 nm, Ryckaert-Bell., atoms 184 187
Minimum cell size due to bonded interactions: 0.443 nm
Maximum distance for 3 constraints, at 120 deg. angles, all-trans: 0.459 nm
Estimated maximum distance required for P-LINCS: 0.459 nm

---
Program: gmx mdrun, version 2019.4
Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
MPI rank:0 (out of 14)

Fatal error:
The number of ranks you selected (14) contains a large prime factor 7. In most
cases this will lead to bad performance.
Choose a number with smaller prime factors or set the decomposition (option -dd) manually.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
---
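
For readers who want to check the arithmetic in the fatal error above, here is a minimal Python sketch that computes the largest prime factor of a rank count. It is not GROMACS code: the helper name `largest_prime_factor` is hypothetical, and since the thread does not state the threshold mdrun applies before rejecting a rank count, the sketch only reports the factor. It compares the two counts that appear in the logs, 14 (rejected) and 8 (the retry value).

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n using trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    factor = 2
    # Divide out each factor completely before moving to the next candidate.
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Whatever remains above 1 is itself prime and larger than any factor found.
    if n > 1:
        largest = n
    return largest


if __name__ == "__main__":
    # Rank counts taken from the logs in this thread.
    for ranks in (14, 8):
        print(f"{ranks} ranks: largest prime factor {largest_prime_factor(ranks)}")
```

For 14 ranks the result is 7, matching the error message; for 8 ranks it is 2, which is consistent with the test harness retrying with 8.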