On Aug 13, 2008, at 10:08 PM, Omer Jilani wrote:
Hi all,
I'm having a problem with the completion of my jobs.
They get submitted all right, but when it's time for them to finish and
write the results back to the output files, they never write the results
and just keep on going.
The status is shown as active, but the gram_job_mngr log files start
to report the following section repeatedly.
The log file grows to megabytes in size and never stops. When I
call cancel on the jobs, they cancel just fine.
Any idea what's going wrong? I'm really stuck here. Any help is
highly appreciated.
Following is the section that keeps repeating.
All that error means is that a Condor-specific tweak to skip polling
via the Perl scripts is failing and that the normal poll is occurring.
The bits
Thu Aug 14 02:52:30 2008 JM_SCRIPT: polling job 741945
Thu Aug 14 02:52:30 2008 JM_SCRIPT: Using poll cache of age 15
Thu Aug 14 02:52:30 2008 JM_SCRIPT: Job found: 741945 | r
8/14 02:52:30 JMI: while return_buf = GRAM_SCRIPT_JOB_STATE = 2
show that the lcgsge.pm module is polling and finding the job to be still
active. That module is from a third party, so I'm not sure we can
help too much as to why it's failing to give completed job state
messages.
Joe
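
For anyone reading a large job manager log like this one, a small script can pull out the poll results instead of scrolling through megabytes of repeats. The following is only a sketch under stated assumptions: the two regular expressions are modeled on the log lines quoted above, the function name summarize_polls is hypothetical, and the numeric state table is assumed to follow the GRAM protocol job state constants (verify it against your own Globus installation before relying on it).

```python
import re

# ASSUMPTION: numeric codes taken to match the GRAM protocol job state
# constants; check them against your Globus installation.
GRAM_STATES = {
    1: "PENDING",
    2: "ACTIVE",
    4: "FAILED",
    8: "DONE",
    16: "SUSPENDED",
    32: "UNSUBMITTED",
    64: "STAGE_IN",
    128: "STAGE_OUT",
}

# Patterns modeled on the log lines quoted in this thread.
FOUND_RE = re.compile(r"Job found:\s*(\d+)\s*\|\s*(\S+)")
STATE_RE = re.compile(r"GRAM_SCRIPT_JOB_STATE\s*=\s*(\d+)")


def summarize_polls(lines):
    """Return (source, id_or_code, state) tuples for each poll result line."""
    results = []
    for line in lines:
        match = FOUND_RE.search(line)
        if match:
            # Scheduler-side view: job id and the scheduler's state letter.
            results.append(("scheduler", match.group(1), match.group(2)))
            continue
        match = STATE_RE.search(line)
        if match:
            # Job manager view: numeric state returned by the poll script.
            code = int(match.group(1))
            results.append(("gram", code, GRAM_STATES.get(code, "UNKNOWN")))
    return results


sample = [
    "Thu Aug 14 02:52:30 2008 JM_SCRIPT: polling job 741945",
    "Thu Aug 14 02:52:30 2008 JM_SCRIPT: Using poll cache of age 15",
    "Thu Aug 14 02:52:30 2008 JM_SCRIPT: Job found: 741945 | r",
    "8/14 02:52:30 JMI: while return_buf = GRAM_SCRIPT_JOB_STATE = 2",
]

for entry in summarize_polls(sample):
    print(entry)
```

If the scheduler-side state never changes from r while the real job has already exited, the problem is likely in how the poll script queries the batch system rather than in the job manager itself.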