Joseph Bester wrote:
On Oct 20, 2008, at 7:40 PM, John Sanabria wrote:
Hi,

I'm developing a platform for executing jobs using traditional Globus commands such as 'globus-job-run' and 'globus-job-submit'. When a user chooses asynchronous execution, the platform periodically queries the remote resource for the job status using the 'globus-job-status' command.
I'm executing tasks lasting more than 5 days.
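
For reference, the polling described above might look roughly like the following Python sketch. It assumes 'globus-job-status' prints a single state word such as PENDING, ACTIVE, DONE or FAILED for a given job contact; the function names, the polling interval and the choice of terminal states are illustrative assumptions, not part of the platform described here.

```python
import subprocess
import time

# Assumption: these are the states after which polling should stop.
TERMINAL_STATES = {"DONE", "FAILED"}


def query_status(contact, run=subprocess.run):
    """Run 'globus-job-status <contact>' and return the printed state."""
    result = run(
        ["globus-job-status", contact],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def poll_until_terminal(contact, interval_s=300, query=query_status,
                        sleep=time.sleep):
    """Poll the job state until it is terminal; return every state seen."""
    history = []
    while True:
        status = query(contact)
        history.append(status)
        if status in TERMINAL_STATES:
            return history
        sleep(interval_s)
```

Keeping the full history of returned states, rather than only the last one, makes it easier to see when a long-running job flipped to 'DONE'.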
I have noticed that approximately one day or less after I start the execution, the 'globus-job-status' command returns 'DONE' even though the job hasn't finished. Is this normal behavior? I read the paper 'The GridWay Framework for Adaptive Scheduling and Execution on Grids' and I found this:

"The job manager is probed periodically at each polling. If the job manager does not respond, the GRAM gatekeeper is probed. If the gatekeeper responds, a new job manager is started to resume watching over the job. If the gatekeeper fails to respond..."

According to that, I think this behavior is not abnormal, but I don't know how to query the GRAM gatekeeper or what message to send to request that it start a new job manager to watch the job.

I appreciate your comments, advice and pointers to documentation about this topic.

Cheers,


I wonder if the proxy you have delegated to GRAM is expiring after a day? Are you creating a proxy with a lifetime long enough to last for the whole job?
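
One way to check this is to compare the remaining proxy lifetime with the expected job duration. The Python sketch below assumes 'grid-proxy-info -timeleft' prints the remaining lifetime of the local proxy in seconds; the helper names and the 12-hour safety margin are illustrative assumptions. It inspects only the local proxy, which may differ from the one already delegated to GRAM.

```python
import subprocess


def proxy_seconds_left(run=subprocess.run):
    """Return the local proxy's remaining lifetime in seconds (0 if unknown)."""
    result = run(
        ["grid-proxy-info", "-timeleft"],
        capture_output=True,
        text=True,
        check=False,
    )
    try:
        return max(int(result.stdout.strip()), 0)
    except ValueError:
        # No proxy found or unexpected output: treat as no time left.
        return 0


def proxy_covers_job(job_days, seconds_left, margin_hours=12):
    """True if the proxy outlives the job plus a safety margin."""
    needed = job_days * 86400 + margin_hours * 3600
    return seconds_left >= needed
```

For a five-day job, proxy_covers_job(5, proxy_seconds_left()) is true only if at least five and a half days remain on the proxy.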

Joe


Hi Joe,

Yes, I did; the credential will expire in June 2009.

John
