I'm seeing messages like the following in the Grid Engine messages file
(SoGE 8.1.8):
02/16/2016 08:20:07|worker|nfs-2|E|invalid task number 0 for job 496841 in "ORT_ptickets" order
02/16/2016 08:20:07|worker|nfs-2|W|Skipping remaining 352 orders
02/16/2016 08:20:07|worker|nfs-2|E|reinitialization of "scheduler"

They persist across scheduling cycles, but can generally be cleared by
applying a qhold to the job in question followed by a qrls.
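
For reference, the workaround can be sketched roughly as below. This is only
an illustration, not a fix for the underlying cause. The helper names
bad_job_ids and cycle_hold are made up for this example, and the messages
file path in the usage comment assumes a default qmaster spool layout, which
may differ on your installation. It relies on qhold and qrls operating on the
user hold by default.

```shell
#!/bin/sh
# Hypothetical helper: read a messages file on stdin and print the unique
# job IDs named in "invalid task number" error lines.
bad_job_ids() {
    sed -n 's/.*|E|invalid task number [0-9][0-9]* for job \([0-9][0-9]*\) in.*/\1/p' |
        sort -un
}

# Hypothetical helper: place a hold on each given job, then release it.
# The release is only attempted if the hold succeeded.
cycle_hold() {
    for job_id in "$@"; do
        qhold "$job_id" && qrls "$job_id"
    done
}

# Example usage (the path is an assumption, adjust to your spool directory):
#   cycle_hold $(bad_job_ids < "$SGE_ROOT/$SGE_CELL/spool/qmaster/messages")
```

Since clearing one job often just moves the error to another job from the
same user, it may be worth re-running the extraction after each scheduling
cycle rather than assuming a single pass is enough.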

They seem[1] to prevent subsequent jobs from being scheduled. Quite often,
once I've cleared one job, another job submitted by the same user at about
the same time starts showing up in the logs with the same messages.

Has anyone seen something similar, or does anyone know what causes it?

[1] 'Seem' because this is happening on our production cluster, and my
priority has been to clear the blockage rather than to investigate in detail.

