Hi. We are experiencing a major problem with losing job states: after a while (an hour or so), every job we submit via Globus ends up in an unknown state. I'm not quite sure where to start looking. The logs say:
ts=2013-09-18T19:20:31.006776Z id=14670 event=gram.state_file_read.end level=ERROR gramid=/16361930530915519966/6437524403105335712/ path=/var/lib/globus/gram_job_state/mbin029/16966e4/loadleveler/job.16361930530915519966.6437524403105335712 msg="Error checking file status" status=-121 errno=2 reason="No such file or directory"

This appears every time another status is lost.

We are using jglobus (1.8.x) with two-phase commit, and we poll the LRM (LoadLeveler; we are not using the scheduler event generator).

Any idea what could cause those files to be deleted?

Best,
Markus

---------------------------------
Markus Binsteiner
Systems & Software developer
NeSI / Centre for eResearch / the University of Auckland
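One way to start narrowing this down is to pull every failed state-file read out of the GRAM log and see when the files go missing and which paths are affected. The following is only a minimal sketch, assuming the log consists of space-separated key=value pairs with optionally quoted values, as in the line quoted above. The function names `parse_gram_log_line` and `missing_state_files` are made up for illustration and are not part of Globus.

```python
import shlex

# The log line quoted above, used here as sample input.
SAMPLE = (
    'ts=2013-09-18T19:20:31.006776Z id=14670 event=gram.state_file_read.end '
    'level=ERROR gramid=/16361930530915519966/6437524403105335712/ '
    'path=/var/lib/globus/gram_job_state/mbin029/16966e4/loadleveler/'
    'job.16361930530915519966.6437524403105335712 '
    'msg="Error checking file status" status=-121 errno=2 '
    'reason="No such file or directory"'
)


def parse_gram_log_line(line):
    """Split one key=value log line into a dict (hypothetical helper).

    shlex.split keeps quoted values such as msg="..." together as one token.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def missing_state_files(lines):
    """Yield (timestamp, path) for each state-file read that failed with errno 2."""
    for line in lines:
        fields = parse_gram_log_line(line)
        if (
            fields.get("event") == "gram.state_file_read.end"
            and fields.get("level") == "ERROR"
            and fields.get("errno") == "2"  # ENOENT: no such file or directory
        ):
            yield fields.get("ts"), fields.get("path")


if __name__ == "__main__":
    for ts, path in missing_state_files([SAMPLE]):
        print(ts, path)
```

Passing the lines of the actual log file (its location depends on the installation) instead of `[SAMPLE]` would list the timestamps and paths of all lost state files. These could then be compared against cron jobs or cleanup scripts that touch the state directory.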
