Dear SIESTA Community and Developers,

I have located a small but important bug in the verlet2 subroutine implemented in the SIESTA code, which affects its performance when running in parallel mode. The verlet2 subroutine is part of the dynamics.f file and is employed to run standard microcanonical MD simulations.

I realized of the error while performing MD simulations of the finite
temperature properties of small atomic clusters. At first, it just seemed
that some runs were finished with a Cholesky-type error (a recurrent subject in this list). But after a few days I clearly saw that the error only appears in the verlet option (microcanonical ensemble) and not in any of the other MD options (Nose, Parrinello-Rahman, anneal). Moreover, it only appears when a verlet run is restarted (no problem if the initial velocities are generated from a MB distribution) and only in parallel runs (the same simulation, in serial mode, did not stop with an error message). I could also see that the density matrix does not achieve self-consistency in the MD steps previous to the Cholesky error message (I think this has been mentioned in this list previously). Finally, the energy conservation is poor (a clear signature that something is going wrong with Verlet) when restarting an MD run in parallel mode.

As I found no solution to this problem in the e-mail archive, I decided to
track the error myself. The solution is quite simple once the error was
located; the location itself took longer. The error is in the variable
old_dt. This is used only to calculate the velocities in the first time step (the ones read from the XV file are one time step behind, so they are not the current velocities for the first time step). old_dt is read by Node zero from the VERLET_RESTART file, but then is not broadcasted to the other nodes. Broadcasting is necessary, as old_dt is used afterwards by all nodes in order to calculate the atomic velocities. To correct the error, simply change

          call broadcast(accold(1:3,1:natoms))
          call broadcast(vold(1:3,1:natoms))

by

          call broadcast(old_dt)
          call broadcast(accold(1:3,1:natoms))
          call broadcast(vold(1:3,1:natoms))

right after the VERLET_RESTART file is read. Then parallel and serial versions give the same results, there is no drift in total energy, no self-consistency problems, and, finally, no Cholesky error!

I am not sure if this will solve all the Cholesky errors reported in this list (one must be aware that there may be "physical" reasons for errors of this type, such as two atoms becoming too close), but hopefully will elliminate many of them, so it is important that you fix this error in your versions of the SIESTA code.

Hope this is useful to all of you. Best wishes,

Dr. Andres Aguado
Department of Theoretical Physics
University of Valladolid
47011 Valladolid
SPAIN

Reply via email to