We're using the job completion script to insert tracking records into a database for the jobs run on our machine. Every once in a while I see a wrong start time recorded for cancelled jobs.
Here are some examples:

    jobid=11721 jobstate=cancelled submit=1300487648 start=1331416200 end=1300487677
    jobid=11722 jobstate=cancelled submit=1300550901 start=1332074346 end=1300550922
    jobid=11723 jobstate=cancelled submit=1300582908 start=1332112169 end=1300582920

In each case the start timestamp is later than the end timestamp, by roughly a year.

I haven't been able to work out what sequence of events triggers this, so I can't reproduce it. We have a longish slurmctld-prolog script, and I think someone is srun'ing a job and cancelling it before or during the prolog. The jobs don't show an assigned node in the output, so perhaps our users are srun'ing and cancelling with rather speedy fingers, before slurm has a chance to do anything.

Can anyone confirm this as a bug?
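In the meantime, a check along these lines could flag the affected records before they are inserted into the database. This is only a sketch in Python: it assumes the records are available as the space-separated key=value text from the examples in this post, and `parse_record` and `has_bogus_start` are made-up helper names, not anything slurm provides.

```python
def parse_record(line):
    """Split 'key=value key=value ...' into a dict; all-digit values become ints."""
    record = {}
    for field in line.split():
        key, _, value = field.partition("=")
        record[key] = int(value) if value.isdigit() else value
    return record


def has_bogus_start(record):
    """A job cannot start after it has ended, so start > end marks a bad record."""
    return record["start"] > record["end"]


# The three records quoted in this post.
records = [
    "jobid=11721 jobstate=cancelled submit=1300487648 start=1331416200 end=1300487677",
    "jobid=11722 jobstate=cancelled submit=1300550901 start=1332074346 end=1300550922",
    "jobid=11723 jobstate=cancelled submit=1300582908 start=1332112169 end=1300582920",
]

suspect = [r["jobid"] for r in map(parse_record, records) if has_bogus_start(r)]
print(suspect)  # [11721, 11722, 11723]
```

Comparing start against end rather than against submit is deliberate: a job that waits a long time in the queue legitimately has start well after submit, but start after end should never happen.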
