On 03.04.2013 at 08:23, Joseph Farran wrote:

> Howdy.
>
> Using GE 8.1.2. I have two jobs which suspended correctly via a Grid Engine
> subordinate queue.
>
> I am however trying to force the scheduler to resume (un-suspend) the
> suspended jobs with no success:
>
> $ qstat | grep compute-14-18
> 288279 0.50000 MakeSummar juser S 04/02/2013 16:00:43
> [email protected] 1 69788
> 288279 0.50000 MakeSummar juser S 04/02/2013 16:00:43
> [email protected] 1 69827
> 289206 0.33333 augustus_s muser r 04/02/2013 18:24:16
> [email protected] 32
> 289278 0.33333 monti_augu muser r 04/02/2013 21:08:48
> [email protected] 32

Is this defined as slotwise subordination or in the traditional way? In the latter case there will be an additional S at the end of each cluster queue in `qstat -f`. Suspension by subordination and suspension by `qmod` are different things.

> $ qmod -usj 288279.69788 -f
> root - forced enabling of job-array task 288279.69788
>
> $ qstat | grep compute-14-18
> 288279 0.50000 MakeSummar juser S 04/02/2013 16:00:43
> [email protected] 1 69788
> 288279 0.50000 MakeSummar juser S 04/02/2013 16:00:43
> [email protected] 1 69827
> 289206 0.33333 augustus_s muser r 04/02/2013 18:24:16
> [email protected] 32
> 289278 0.33333 monti_augu muser r 04/02/2013 21:08:48
> [email protected] 32
>
> Is there a way to force these two jobs ( 288279.69788 and 288279.69827 )
> to resume?

You could ignore the output and send a `kill -cont` on the node to the complete process group of the jobs. The output may still be wrong then, but at least the jobs may continue.

IIRC you had the opposite problem in the past: jobs were shown as suspended but continued to run anyway according to the process listing.

-- Reuti

> Thanks,
> Joseph
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
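
For reference, the `kill -cont` workaround mentioned above could look roughly like the following when run on the execution node. This is only a sketch under assumptions: the helper name `resume_pgroup` is made up for illustration, you have to look up the PID of one of the job's processes yourself (e.g. with `ps` on the node), and you need to be root or the job owner. It does not touch Grid Engine's own bookkeeping, so `qstat` may keep showing the job as suspended.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): send SIGCONT to the whole
# process group that a given PID belongs to.
resume_pgroup() {
    pid=$1
    # "ps -o pgid=" prints the process group ID without a header line
    pgid=$(ps -o pgid= -p "$pid" 2>/dev/null | tr -d ' ')
    if [ -z "$pgid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # A negative PID argument makes kill signal every process in the group;
    # "--" keeps the leading minus from being parsed as an option
    kill -s CONT -- "-$pgid"
}
```

Calling `resume_pgroup <pid>` with the PID of any process of the suspended task should then continue all processes in that group. Whether the scheduler suspends them again afterwards depends on how the subordination is configured.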
