Dear Andres: Thank you so much for such a good contribution!
We will immediately try and test your bug-fix and probably incorporate it to the next SIESTA distribution. Again Gracias! Yours, The SIESTA Team -- Jose A. Torres, Ph.D. Manager of the SIESTA Software On Thu, Jun 11, 2009 at 5:11 PM, Andres Aguado<agu...@metodos.fam.cie.uva.es> wrote: > Dear SIESTA Community and Developers, > > I have located a small but important bug in the verlet2 subroutine > implemented in the SIESTA code, which affects its performance when running > in parallel mode. The verlet2 subroutine is part of the dynamics.f file and > is employed to run standard microcanonical MD simulations. > > I realized of the error while performing MD simulations of the finite > temperature properties of small atomic clusters. At first, it just seemed > that some runs were finished with a Cholesky-type error (a recurrent subject > in this list). But after a few days I clearly saw that the error only > appears in the verlet option (microcanonical ensemble) and not in any of the > other MD options (Nose, Parrinello-Rahman, anneal). Moreover, it only > appears when a verlet run is restarted (no problem if the initial velocities > are generated from a MB distribution) and only in parallel runs (the same > simulation, in serial mode, did not stop with an error message). I could > also see that the density matrix does not achieve self-consistency in the MD > steps previous to the Cholesky error message (I think this has been > mentioned in this list previously). Finally, the energy conservation is poor > (a clear signature that something is going wrong with Verlet) when > restarting an MD run in parallel mode. > > As I found no solution to this problem in the e-mail archive, I decided to > track the error myself. The solution is quite simple once the error was > located; the location itself took longer. The error is in the variable > old_dt. This is used only to calculate the velocities in the first time step > (the ones read from the XV file are one time step behind, so they are not > the current velocities for the first time step). old_dt is read by Node zero > from the VERLET_RESTART file, but then is not broadcasted to the other > nodes. Broadcasting is necessary, as old_dt is used afterwards by all nodes > in order to calculate the atomic velocities. To correct the error, simply > change > > call broadcast(accold(1:3,1:natoms)) > call broadcast(vold(1:3,1:natoms)) > > by > > call broadcast(old_dt) > call broadcast(accold(1:3,1:natoms)) > call broadcast(vold(1:3,1:natoms)) > > right after the VERLET_RESTART file is read. Then parallel and serial > versions give the same results, there is no drift in total energy, no > self-consistency problems, and, finally, no Cholesky error! > > I am not sure if this will solve all the Cholesky errors reported in this > list (one must be aware that there may be "physical" reasons for errors of > this type, such as two atoms becoming too close), but hopefully will > elliminate many of them, so it is important that you fix this error in your > versions of the SIESTA code. > > Hope this is useful to all of you. Best wishes, > > Dr. Andres Aguado > Department of Theoretical Physics > University of Valladolid > 47011 Valladolid > SPAIN >