Hi Reuti,

Thanks for the quick reply. Yes, the jobs do get suspended if the Q instance gets suspended (I made a mistake in checking this), but there seemed to be no way to kill them as a last resort.
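
For the last-resort case, Reuti's suggestion of a custom suspend method could be sketched roughly as follows. This is only a sketch under assumptions: the script path is a placeholder, kill_job is a made-up helper name, and the suspend_method line in the comment relies on the $job_pid pseudo-variable as I understand it from queue_conf(5). It signals only the single PID it is given, so a job that forks children would need more than this.

```shell
#!/bin/sh
# Hypothetical suspend method: terminate the job instead of stopping it.
# Assumed queue setting (the path is a placeholder, not from this thread):
#   suspend_method   /path/to/kill_on_suspend.sh $job_pid

# kill_job PID [GRACE_SECONDS]
# Send SIGTERM, wait up to GRACE_SECONDS (default 10), then send SIGKILL
# if the process is still there.
kill_job() {
    pid="$1"
    grace="${2:-10}"

    # Accept only a plain positive integer as PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: kill_job PID [GRACE_SECONDS]" >&2
            return 2
            ;;
    esac

    # Ask politely first; if the signal cannot be delivered, the
    # process is already gone (or not ours) and there is nothing to do.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll once per second until the process exits or the grace
    # period runs out.
    waited=0
    while [ "$waited" -lt "$grace" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        waited=$((waited + 1))
    done

    # Last resort: force the kill if it is still around.
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
    fi
    return 0
}

# Act only when called with arguments, as the queue would call it.
if [ "$#" -ge 1 ]; then
    kill_job "$@"
fi
```

With something like this set as the suspend method, suspending the Q instance would end the jobs rather than stop them, which is the blunt option compared with the checkpointing route.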
I'll check the checkpointing link; that sounds like a way to handle the problem more elegantly.

hjm

On Tuesday 07 February 2012 12:47:36 Reuti wrote:
> On 07.02.2012 at 20:58, Harry Mangalam wrote:
> > I run a cluster that is a mostly peaceful mix of open
> > (universally available, under SGE 6.2) and condo nodes
> > (generally open, except when the owners want them under their
> > control and available only to them. I've assigned a node Q to
> > the owner who can disable/enable & suspend/resume the QUEUES
> > according to the docs.
>
> You mean the jobs are not suspended, despite the fact that the
> queue instance got suspended they are running in?
>
> You could use a custom suspend method to kill the jobs instead
> suspending. Or maybe better: attach in an JSV a checkpointing
> environment. This way the jobs would stay at the top of the queue,
> if the checkpointing environment is setup to reschedule on
> suspend. You are using the checkpointing facility only for the
> removal of the jobs from the node, i.e. for a migration.
>
> http://arc.liv.ac.uk/SGE/howto/checkpointing.html
>
> -- Reuti
>
> > Is there a mechanism to allow the Q owner to suspend or even
> > kill running JOBS or is that forbidden to the Q owner and only
> > available to the admin?
> >
> > ie in the following extract, argardne is the owner of the Q on
> > node claw9, but he can't kill/suspend jobs running there - he
> > can only operate on Qs.
> >
> > $ qconf -sq claws
> > qname                 claws
> > hostlist              @execlaws
> > seq_no                0
> > load_thresholds       np_load_avg=1.1
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              0
> > min_cpu_interval      00:05:00
> > processors            1-4,[claw1.bduc=1-2],[claw5.bduc=1-8],[claw7.bduc=1-8], \
> >                       [claw8.bduc=1-16],[claw9.bduc=1-48]
> > qtype                 BATCH,[claw1.bduc=BATCH],[claw5.bduc=BATCH], \
> >                       [claw9.bduc=BATCH],[claw8.bduc=BATCH],[claw7.bduc=BATCH]
> > ...
> > owner_list            NONE,[claw9.bduc=argardne]
> > user_lists            arusers,[claw9.bduc=arusers],[claw5.bduc=arusers], \
> >                       [claw7.bduc=arusers],[claw1.bduc=arusers]

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--
Citzens United: Democracy on meth - Walter Egan
