That state file is normally not deleted until the job is completed and the two-phase commit is done. The other reason GRAM might delete it is if the job expires (after it hits an end state and hasn't been touched in 4 hours). Is there a possibility of something else "cleaning" out that directory? Do those files exist?
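If it helps with checking that, here is a rough Python sketch that pulls the `path=` field out of a job manager log line and reports whether the file is still on disk. The function names are made up for illustration, and the parsing only assumes the `key=value` / `key="quoted value"` layout seen in the log line you posted; it is not part of GRAM.

```python
import os
import re

# One key=value pair; the value is either "quoted text" or a bare token.
# This layout is assumed from the quoted log line, not from a GRAM spec.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_gram_log_line(line):
    """Return the key=value fields of one log line as a dict of strings."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def state_file_status(line):
    """Return (path, exists) for the state file named in a log line.

    Returns (None, False) if the line has no path= field.
    """
    path = parse_gram_log_line(line).get("path")
    if path is None:
        return None, False
    return path, os.path.exists(path)


if __name__ == "__main__":
    sample = (
        "ts=2013-09-18T19:20:31.006776Z id=14670 "
        "event=gram.state_file_read.end level=ERROR "
        "gramid=/16361930530915519966/6437524403105335712/ "
        "path=/var/lib/globus/gram_job_state/mbin029/16966e4/loadleveler/"
        "job.16361930530915519966.6437524403105335712 "
        'msg="Error checking file status" status=-121 errno=2 '
        'reason="No such file or directory"'
    )
    path, exists = state_file_status(sample)
    print(path, "exists" if exists else "is missing")
```

Running it over the ERROR lines shortly after they are logged, and again later, would show whether the files were never written or were removed afterwards.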
It's possible to increase the logging level as described here:
http://www.globus.org/toolkit/docs/5.2/5.2.4/gram5/admin/#idp7912160
which might give some info about what the job manager thinks is going on.

Joe

On Sep 18, 2013, at 3:33 PM, Markus Binsteiner <[email protected]> wrote:

> Hi.
>
> We are experiencing a major problem with losing job states: after a
> while (an hour or so) every job we submit via globus ends up in an
> unknown state. I'm not quite sure where to start looking, the logs say:
>
> ts=2013-09-18T19:20:31.006776Z id=14670 event=gram.state_file_read.end
> level=ERROR gramid=/16361930530915519966/6437524403105335712/
> path=/var/lib/globus/gram_job_state/mbin029/16966e4/loadleveler/job.16361930530915519966.6437524403105335712
> msg="Error checking file status" status=-121 errno=2 reason="No such file or
> directory"
>
> every time another status is lost. We are using jglobus (1.8.x),
> two-phase commit and we poll the LRM (LoadLeveler -- not using scheduler
> event generator).
>
> Any idea what could cause those files to be deleted?
>
> Best,
> Markus
