On 13.01.2014 at 19:02, Julien Nicoulaud wrote:

> They are not scheduled; this is the output of "qstat -j" on the queued, 
> waiting job. If I cancel and resubmit it, it is scheduled.
> 
> 
> 2014/1/13 Reuti <[email protected]>
> On 12.01.2014 at 18:50, Julien Nicoulaud wrote:
> 
> > Hi all,
> >
> > My qmaster host was rebooted abruptly a while ago, and I have observed 
> > strange behaviour since then. Some jobs "randomly" get stuck in the queue 
> > with this kind of "qstat -j" output:
> >
> >     > project: myproject
> >     > ...
> >     > (no project) does not have the required project to run in queue foo.q

This is strange. First of all, I noticed that this message is always output, 
even under normal conditions when a project is attached:

project:                    foo
scheduling info:            (no project) does not have the correct project to run in cluster queue "all.q"
                            (no project) does not have the correct project to run in cluster queue "extra.q"
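
To check several stuck jobs for this contradiction (a project is listed, yet the scheduling info says "(no project)"), the saved "qstat -j" text could be parsed roughly as below. This is only a sketch: the function names are made up for illustration, and the field layout is assumed to match the excerpt above, which may differ between Grid Engine versions.

```python
import re

# Sample text copied from the excerpt above, including the mail line wrapping.
SAMPLE = """\
project:                    foo
scheduling info:            (no project) does not have the correct project to
run in cluster queue "all.q"
                            (no project) does not have the correct project to
run in cluster queue "extra.q"
"""

# A field starts at column 0 as "name:" followed by whitespace and a value.
FIELD_RE = re.compile(r"^([A-Za-z_][\w ]*?):\s+(.*)$")


def parse_qstat_j(text):
    """Collect 'name: value' fields; any other non-empty line continues the last field."""
    fields = {}
    key = None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            key = match.group(1).strip()
            fields.setdefault(key, []).append(match.group(2).strip())
        elif key is not None and line.strip():
            fields[key].append(line.strip())
    return fields


def contradictory_project_info(fields):
    """True if a project is listed but a scheduling message starts with '(no project)'."""
    has_project = bool(fields.get("project"))
    messages = fields.get("scheduling info", [])
    return has_project and any(m.startswith("(no project)") for m in messages)


if __name__ == "__main__":
    print(contradictory_project_info(parse_qstat_j(SAMPLE)))
```

The text for a real job would have to be captured first, e.g. by redirecting the "qstat -j" output to a file and reading that file in.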

Then I tried to find out where this message is generated. It is evidently the 
message "MSG_SCHEDD_INFO_HASNOPRJ_S", which is referenced in 
`sge_get_schedd_text`. And guess what: `sge_get_schedd_text` is never called at 
all - at least I can't see where.
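
For anyone who wants to repeat that search, a small script along these lines could list every line in a source checkout that mentions an identifier. It is a minimal sketch: the function name `find_references`, the checkout path, and the choice of file extensions are assumptions for illustration, not part of Grid Engine.

```python
import os
import re


def find_references(root, identifier, extensions=(".c", ".h")):
    """Return (path, line number, line text) for each whole-word match under root."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at files with the given extensions (assumed: C sources and headers).
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "path/to/source" is a placeholder for a local source checkout.
    for path, lineno, text in find_references("path/to/source", "sge_get_schedd_text"):
        print(f"{path}:{lineno}: {text}")
```

The hits still have to be read by hand to tell the definition and declarations apart from actual call sites, and calls made through macros or function pointers would not show up this way.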

Maybe what you are facing is not related to the hard reset but is a different 
issue.

-- Reuti


> And this happens during the job execution? I would understand if it 
> prevented any jobs from being scheduled at all. Where do you get this output?
> 
> -- Reuti
> 
> 
> > The job is launched with the right project, and the queue is also configured 
> > correctly. Any idea where to look? Is there some cleanup I can do?
> >
> > Regards,
> > Julien
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
> 
> 


