Can you please file an issue on redmine.gromacs.org and attach the inputs
that reproduce the behavior described?
--
Szilárd
On Wed, Dec 4, 2019, 21:35 Chenou Zhang wrote:
> We did test that.
> Our cluster has total 11 GPU nodes and I ran 20 tests over all of them. 7
> out of the 20 tests did
We did test that.
Our cluster has 11 GPU nodes in total, and I ran 20 tests across all of them. 7
of the 20 tests showed the potential-energy jump issue, and those ran on 5
different nodes.
So I tend to believe this issue can happen on any of those nodes.
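With that many runs, the jump can also be flagged automatically by pulling the "Potential" column out of each md.log. A minimal sketch, assuming mdrun's fixed-width, right-aligned energy-table layout; the 15-character field width and both helper names are assumptions, not anything taken from the logs in this thread:

```python
FIELD_WIDTH = 15  # assumed width of mdrun's right-aligned energy-table fields

def potential_energies(log_text, width=FIELD_WIDTH):
    """Extract 'Potential' values from the energy tables in mdrun log text."""
    values = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        start = line.find("Potential")
        if start < 0 or i + 1 >= len(lines):
            continue
        end = start + len("Potential")      # right edge of the header field
        field = lines[i + 1][end - width:end].strip()
        try:
            values.append(float(field))
        except ValueError:
            pass                            # next line was not a value row
    return values

def first_jump(energies, threshold=1e5):
    """Return (index, before, after) for the first jump above threshold."""
    for i, (a, b) in enumerate(zip(energies, energies[1:])):
        if abs(b - a) > threshold:
            return i, a, b
    return None
```

Running this over the md.log of each of the 20 tests would show directly which of the 11 nodes produced a jump.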
On Wed, Dec 4, 2019 at 1:14 PM
The fact that you are observing errors, that the energies are off by so
much, and that it reproduces with multiple inputs suggests that this may not
be a code issue. Did all of the runs that failed use the same hardware? Have
you excluded the possibility that one of those GeForce cards is flaky?
--
We tried the same gmx settings in 2019.4 with different protein systems, and
we got the same weird potential-energy jump within 1000 steps.
```
           Step           Time
              0        0.00000

   Energies (kJ/mol)
           Bond            U-B    Proper Dih.  Improper Dih.      CMAP Dih.
```
Hi,
I've run 30 tests with the -notunepme option. I got the following error
from one of them (still the same *cudaStreamSynchronize failed* error):
```
DD  step 1422999  vol min/aver 0.639  load imb.: force  1.1%  pme mesh/force 1.079

           Step           Time
        1423000
```
For the error:
```
step 4400: timed with pme grid 96 96 60, coulomb cutoff 1.446: 467.9 M-cycles
step 4600: timed with pme grid 96 96 64, coulomb cutoff 1.372: 451.4 M-cycles

/var/spool/slurmd/job2321134/slurm_script: line 44: 29866 Segmentation fault      gmx mdrun -v -s $TPR -deffnm
```
Hi,
What driver version is reported in the respective log files? Does the error
persist if mdrun -notunepme is used?
Mark
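Across many log files, the driver-version check can also be scripted. A small sketch; the exact wording of the driver line varies between GROMACS and CUDA versions, so the match below is deliberately loose, and the helper name is an assumption:

```python
def driver_mentions(log_text):
    """Collect log lines that mention a driver (loose match, since the
    exact phrasing differs between GROMACS/CUDA versions)."""
    return [line.strip() for line in log_text.splitlines()
            if "driver" in line.lower()]
```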
On Mon., 2 Dec. 2019, 21:18 Chenou Zhang, wrote:
> Hi Gromacs developers,
>
> I'm currently running gromacs 2019.4 on our university's HPC cluster. To
> fully utilize the