qdel didn't work for me. It failed with a message about the job being in an invalid state for that operation.
All the jobs involved were on a heavily loaded system (8 CPUs, all at 99% usage). I suspect the heavy load delayed communication, which in turn caused some sort of message timeout.
Prentice
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wed 3/1/2006 12:45 PM
To: Bisbal, Prentice; [email protected]
Subject: RE: [Mauiusers] completed jobs still shown in queue
We see the same behavior periodically. We are running torque-1.2.0p1 and maui-3.2.6p11. Not only is this an annoyance, but it also prevents maui from scheduling jobs on those nodes. Most of the time you can qdel them.
Stewart
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Bisbal, Prentice
Sent: Wednesday, March 01, 2006 10:03 AM
To: [email protected]
Subject: [Mauiusers] completed jobs still shown in queue
I have 4 simple jobs stuck in my queue. The jobs ran to completion, but they are still shown as being in the queue:
$ showq
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
3183 pxxxxxx Running 1 00:46:01 Wed Mar 1 09:44:58
3184 pxxxxxx Running 1 00:46:04 Wed Mar 1 09:45:01
3185 pxxxxxx Running 1 00:46:04 Wed Mar 1 09:45:01
3186 pxxxxxx Running 1 00:46:04 Wed Mar 1 09:45:01
4 Active Jobs 4 of 22 Processors Active (18.18%)
1 of 7 Nodes Active (14.29%)
A tracejob shows that these jobs completed and exited without any errors:
$ tracejob 3186
Job: 3186.hw-emperor.lexpharma.com
03/01/2006 09:43:38 S enqueuing into batch, state 1 hop 1
03/01/2006 09:43:38 S Job Queued at request of
[EMAIL PROTECTED] owner =
[EMAIL PROTECTED], job name =
PBS_TEST.87, queue = batch
03/01/2006 09:45:02 S Job Modified at request of
[EMAIL PROTECTED]
03/01/2006 09:45:02 S Job Run at request of [EMAIL PROTECTED]
03/01/2006 09:45:33 S Exit_status=0 resources_used.cpupercent=0
resources_used.cput=00:00:00 resources_used.mem=5408kb
resources_used.vmem=9280kb
resources_used.walltime=00:00:30
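The mismatch above (showq reports the job as Running while tracejob has already logged an Exit_status) can be checked mechanically. What follows is only a minimal sketch, assuming the plain-text layouts shown in this message. The function names are hypothetical, the code only parses text you have already captured, and it does not call showq, tracejob, or qdel itself.

```python
import re


def running_job_ids(showq_text):
    """Return the job IDs that showq output lists in the Running state."""
    ids = []
    for line in showq_text.splitlines():
        fields = line.split()
        # Assumed row layout: <id> <user> <state> <procs> <remaining> <start time>
        if len(fields) >= 3 and fields[0].isdigit() and fields[2] == "Running":
            ids.append(fields[0])
    return ids


def exit_status(tracejob_text):
    """Return the Exit_status found in tracejob output, or None if there is none."""
    match = re.search(r"Exit_status=(-?\d+)", tracejob_text)
    return int(match.group(1)) if match else None


def stale_jobs(showq_text, tracejob_by_id):
    """List jobs showq still shows as Running although tracejob logged an exit.

    tracejob_by_id maps a job ID string to the captured tracejob text for it.
    """
    return [
        job_id
        for job_id in running_job_ids(showq_text)
        if exit_status(tracejob_by_id.get(job_id, "")) is not None
    ]
```

Given the showq listing and the tracejob output for 3186 above, stale_jobs would flag 3186, since it appears as Running but has Exit_status=0 recorded. Jobs with no captured tracejob text are not flagged.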
Any idea why these jobs are still shown in the queue? What is the best way to get rid of them?
Prentice
