"Dicker, Edwin" <[email protected]> writes:

> Hi All,
>
> One of my customers is running sge 6.5u2 and when jobs are submitted
> they are stuck in a 't' state showing with qstat -f. However they do
> seem to have finished ( all output is as expected )

That sounds like a bug, assuming the qmaster is still communicating with
the execd.

> I've been looking around and do not find a lot of info about jobs in
> (t)ransfer state and what it means.  How should one debug those jobs
> and find out why they stick in that state? Is there more info about
> what this state means and what sge is doing at that time?

I don't know the exact circumstances in which you'll see it (maybe
someone else can say), but it basically means the qmaster has started
sending the job to the execd and the job hasn't been started there yet.

I've seen it happen, for instance, if the execd crashes or the user
isn't properly defined on the exec host (missing passwd entry there).
First check the host's GE messages file and its syslog.
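
As a rough sketch of those two checks, something like the following could
be run on the exec host. The function names are made up for illustration,
and the path assumes the default spool layout
($SGE_ROOT/$SGE_CELL/spool/<host>/messages); if execd_spool_dir was changed
in the cluster configuration (see `qconf -sconf`), the file will be
elsewhere.

```shell
#!/bin/sh
# Sketch only. Both helper names are hypothetical, not part of Grid Engine.

# Print where the execd messages file would be under the default spool
# layout. Assumption: execd_spool_dir was not changed from the default.
execd_messages_path() {
    host=$1
    printf '%s/%s/spool/%s/messages\n' \
        "${SGE_ROOT:?SGE_ROOT is not set}" "${SGE_CELL:-default}" "$host"
}

# Succeed if the given user resolves on the machine where this runs,
# i.e. the passwd/NSS lookup works for that name.
user_is_known() {
    id "$1" >/dev/null 2>&1
}

# Possible usage on the exec host (the spool directory is normally named
# after the unqualified host name, so adjust the argument if needed):
#   tail -n 50 "$(execd_messages_path "$(hostname)")"
#   user_is_known someuser || echo "someuser is not defined on this host"
```

If the user lookup fails on the exec host but works on the submit host,
that would match the missing passwd entry case above.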
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users