Title: RE: [Mauiusers] completed jobs still shown in queue

No - I don't have any epilogue scripts configured. The script I was running was very simple:

$ more pbs_test.sh
#!/bin/bash
echo "Hello from $(uname -n)"
sleep 20
printenv | egrep "PBS_NODENUM|PBS_VNODENUM|PBS_TASKNUM|PBS_O_HOST" | sort
echo " "
exit 0


Prentice



-----Original Message-----
From: Matney Sr, Kenneth D. [mailto:[EMAIL PROTECTED]]
Sent: Wed 3/1/2006 2:16 PM
To: Bisbal, Prentice
Subject: RE: [Mauiusers] completed jobs still shown in queue

Is it possible that MOM was running an epilog on behalf of
the job in this time interval?  For example, an epilog that
removes scratch areas that are NFS mounted to all of
your compute nodes might cause a delay between when
PBS records an exit status for the job and the job is marked
complete at the server.

Just curious.  -- Ken Matney, Sr.

________________________________

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Bisbal, Prentice
Sent: Wednesday, March 01, 2006 1:30 PM
To: [EMAIL PROTECTED]; [email protected]
Subject: RE: [Mauiusers] completed jobs still shown in queue



qdel didn't work for me - something about the job being in an invalid
state for that operation.

All the jobs involved were on a system that was very loaded (8 cpus, all
at 99% usage). I suspect the heavy loading of the system caused delays
in communication, which in turn caused some sort of message timeout.

Prentice



-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]
Sent: Wed 3/1/2006 12:45 PM
To: Bisbal, Prentice; [email protected]
Subject: RE: [Mauiusers] completed jobs still shown in queue

We see the same behavior periodically.  We are running torque-1.2.0p1 and
maui-3.2.6p11.  Not only is this an annoyance, but it also prevents maui
from scheduling jobs on those nodes.  Most of the time you can qdel
them.

                Stewart

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]On Behalf Of Bisbal, Prentice
Sent: Wednesday, March 01, 2006 10:03 AM
To: [email protected]
Subject: [Mauiusers] completed jobs still shown in queue



I have 4 simple jobs stuck in my queue. The jobs ran to completion, but
they are still shown as being in the queue:


$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

3183                pxxxxxx    Running     1    00:46:01  Wed Mar  1 09:44:58
3184                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01
3185                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01
3186                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01

     4 Active Jobs       4 of   22 Processors Active (18.18%)
                         1 of    7 Nodes Active      (14.29%)

A tracejob shows that these jobs completed and exited w/o any errors:

$  tracejob 3186

Job: 3186.hw-emperor.lexpharma.com

03/01/2006 09:43:38  S    enqueuing into batch, state 1 hop 1
03/01/2006 09:43:38  S    Job Queued at request of
                          [EMAIL PROTECTED] owner =
                          [EMAIL PROTECTED], job name =
                          PBS_TEST.87, queue = batch
03/01/2006 09:45:02  S    Job Modified at request of
                          [EMAIL PROTECTED]
03/01/2006 09:45:02  S    Job Run at request of
                          [EMAIL PROTECTED]
03/01/2006 09:45:33  S    Exit_status=0 resources_used.cpupercent=0
                          resources_used.cput=00:00:00
                          resources_used.mem=5408kb
                          resources_used.vmem=9280kb
                          resources_used.walltime=00:00:30

Any idea why these jobs are still shown in the queue? What is the best
way to get rid of them?
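
The manual check above (showq says Running, tracejob shows an exit status) could be scripted along these lines. This is only a rough sketch: the function names running_ids, has_exit_status and report_stale_jobs are made up for illustration, and it assumes showq prints one job per line in the column order shown above and that tracejob is run by a user who can read the server logs.

```shell
#!/bin/bash
# Sketch only: cross-check Maui's view of the queue against the TORQUE job log.

# Print the IDs of jobs that showq text (read from stdin) lists as Running.
# Assumes the column layout seen in this thread: ID, user, state, ...
running_ids() {
    awk '$3 == "Running" { print $1 }'
}

# Succeed if tracejob text (read from stdin) contains a recorded exit status.
has_exit_status() {
    grep -q 'Exit_status='
}

# Hypothetical helper: list jobs that showq still reports as Running even
# though tracejob already shows an Exit_status for them.
report_stale_jobs() {
    showq | running_ids | while read -r id; do
        if tracejob "$id" 2>/dev/null | has_exit_status; then
            echo "$id"
        fi
    done
}
```

Calling report_stale_jobs on the server host would print one job ID per line for each job in this state. It only identifies the jobs; it does not try to remove them.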

Prentice






_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
