I have a public queue *intel_all.q* and a private queue *namd.q* configured with
*subordinate_list intel_all.q=1*
Some nodes of namd.q are also included in intel_all.q, which has
*suspend_method /storage/Scripts/job_resubmit.sh $job_id*
$ cat /storage/Scripts/job_resubmit.sh
#!/bin/sh
/storage/SGE/bin/lx24-amd64/qresub $1
/storage/SGE/bin/lx24-amd64/qdel $1
As soon as even one job is submitted to the private queue, the public jobs
have to be resubmitted and killed.
Sometimes this doesn't work and they end up in state S (suspended):
sge143    lx24-amd64    24    45.65    47.3G    30.4G    48.0G    0.0
    namd.q         BIP    24/24
    intel_all.q    BIP    *23/24 S*
        5219266 0.50511 SemanticEx alexla *S* 12/29/2011 16:34:08 intel_all.q@sge143 1
Despite that state, the jobs are still actually running and using the node's resources (CPU and memory).
How can I solve this problem?
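To narrow down why the script sometimes has no effect, a more defensive variant of job_resubmit.sh could log every call and check the exit status of each command. This is only a sketch under assumptions: the SGE_BIN and LOG variables, the log location, and the function name resubmit_and_delete are hypothetical and not part of the setup above. Unlike the original script, it deletes the job only if qresub succeeded.

```shell
#!/bin/sh
# Sketch of a more defensive job_resubmit.sh.
# SGE_BIN and LOG are assumptions; adjust them to the local installation.
SGE_BIN="${SGE_BIN:-/storage/SGE/bin/lx24-amd64}"
LOG="${LOG:-/tmp/job_resubmit.log}"

# Hypothetical helper: resubmit a copy of the job, then delete the original.
# Returns 0 on success, 1 if qresub failed, 2 if qdel failed.
resubmit_and_delete() {
    job_id="$1"
    echo "$(date) suspend_method called for job $job_id" >> "$LOG"
    # Only delete the original if the copy was accepted.
    if "$SGE_BIN/qresub" "$job_id" >> "$LOG" 2>&1; then
        if "$SGE_BIN/qdel" "$job_id" >> "$LOG" 2>&1; then
            return 0
        fi
        echo "qdel failed for job $job_id" >> "$LOG"
        return 2
    fi
    echo "qresub failed for job $job_id" >> "$LOG"
    return 1
}

# Run only when a job id is passed, as in "job_resubmit.sh $job_id".
if [ "$#" -gt 0 ]; then
    resubmit_and_delete "$1"
fi
```

If the log shows no entry for a job that ended up in state S, the script was not called for it at all. If it shows a failed qresub or qdel, the captured error output should say why.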
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users