gianluca santarossa wrote:

I think you are right. I guess this is the best solution, after all.
The drawback is that I need to do a pilot run for each system I want to simulate, and for each number of processors I choose, too. Oh, no! It smells like a benchmark!!!! :P

Well, in the real world, your simulations are likely to be sufficiently similar that you can keep a spreadsheet of calculation rates as a function of the number of particles and interpolate readily...
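If you would rather keep that table in a plain text file than in a spreadsheet, the interpolation can be sketched in a few lines of shell. This is only a rough sketch: interp_rate is a made-up helper name, and it assumes a file with two whitespace-separated columns (number of particles, measured rate) sorted by ascending particle count.

```shell
#!/bin/sh
# interp_rate FILE N
# Linearly interpolate the rate (column 2) at N particles (column 1).
# Assumes FILE is sorted by ascending particle count. Hypothetical helper.
interp_rate() {
    awk -v x="$2" '
        NF < 2 || $1 ~ /^#/ { next }        # skip blank lines and comments
        have_prev && x >= px && x <= $1 {
            # x lies between the previous row and this one
            print py + (x - px) * ($2 - py) / ($1 - px)
            found = 1
            exit
        }
        { px = $1; py = $2; have_prev = 1 } # remember this row for the next one
        END { if (!found) exit 1 }          # x is outside the tabulated range
    ' "$1"
}
```

Calling something like `interp_rate rates.txt 25000` (file name invented here) prints the estimated rate in whatever unit the pilot runs were recorded in, and returns a non-zero status when the requested size falls outside the table, so it never extrapolates silently.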

That's a reasonable start, but the nature of buffered output is such that you can't guarantee that ener.edr and traj.trr are at the same point. What you need to do is get GROMACS to exit gracefully having flushed its buffers. My PBS setup sends a SIGHUP that GROMACS 3.3.1 reads and does an appropriate end-of-last-step flush and a pirouette to finish :-) I suggest passing the SIGHUP, delaying as long as you can afford and only then copying the files back. This will work better on average. It's probably overkill if you implement the first solution.

I don't know how to do that... Can you help me? (At least, I can learn something new about scripting...) If I'm right, trap is executed after its command finishes. So I cannot send a SIGHUP signal from the trap.

I was theorizing that sending a SIGHUP would be possible... you need to catch the signal to send your output back, but you need to send one to the child process to get the buffers flushed. Actually, man mdrun suggests that GROMACS doesn't listen for SIGHUP at all, so ignore me.

On the other hand, I have no control over the signals sent by the queue. From the FAQ of the cluster: "To give the application a chance to exit gracefully, LSF first sends a “friendly” signal (SIGUSR2) to all processes of a job when its time limit is about to expire. If the job is still running after a short grace period, LSF sends increasingly “unfriendly” signals (SIGINT, SIGTERM and SIGKILL). The last one effectively kills the job."

man mdrun suggests that unless you were able to copy output back after the SIGTERM, you'd struggle.

A plan B would be to intercept the SIGUSR2 with your script and send a SIGTERM to the simulation... if that's possible.
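For what it's worth, plan B might look roughly like the sketch below. It is untested against a real LSF queue and makes a few assumptions: that mdrun stops cleanly on SIGTERM (as the man page discussion above suggests), that the grace period is long enough for the last step to finish, and that the job script itself receives the SIGUSR2. The helper name forward_usr2_as_term and the destination directory are made up. Because the FAQ says the signal goes to all processes of the job, the child is started with SIGUSR2 ignored, so only the script reacts to it.

```shell
#!/bin/sh
# Sketch only: forward_usr2_as_term is a made-up helper name, not part of
# GROMACS or LSF.

forward_usr2_as_term() {
    forwarded=0
    # Install the handler before the child starts.
    # $child is expanded when the trap fires, not when it is defined.
    trap 'forwarded=1; kill -TERM "$child" 2>/dev/null' USR2

    # Run the command in the background with SIGUSR2 ignored: an ignored
    # signal stays ignored across exec, so only this script reacts to it.
    ( trap '' USR2; exec "$@" ) &
    child=$!

    wait "$child"
    status=$?
    # A trapped signal makes wait return early, so wait again until the
    # child has really exited and we have its own exit status.
    while [ "$forwarded" -eq 1 ]; do
        forwarded=0
        wait "$child"
        status=$?
    done

    trap - USR2
    return "$status"
}

# Hypothetical use in a job script (file names as in the thread, destination
# directory invented for illustration):
#   forward_usr2_as_term mdrun -s topol.tpr
#   cp ener.edr traj.trr "$HOME/results/"
```

The point of running the simulation in the background and using wait is that the shell can only run a trap once the foreground command has finished, which is exactly the problem raised earlier in the thread. With wait, the trap fires straight away, and the copy only happens after the simulation has exited.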

Mark
