The node the orphaned jobs were running on, not the controller or db machine.

Sefa ARSLAN
Researcher
Network Technologies Unit
TUBITAK ULAKBIM
YOK Binasi B5 Blok Kat:3 Bilkent
06539 ANKARA
T +90 312 298 9397
F +90 312 298 9397

On 06/05/2013 05:48 PM, Paul Edmon wrote:
Do you mean the node that hosts the slurmdb? Or the node that runs 
slurmctld?  Or are you speaking of the nodes on which that job ran?

-Paul Edmon-

On 06/05/2013 10:45 AM, Sefa Arslan wrote:
If possible, rebooting the worker node is the fastest solution.


On 06/05/2013 05:10 PM, Paul Edmon wrote:
I have a job which shows up in sacct as Running but does not show up on
squeue or any other probe of the cluster jobs.  I know this job is long
dead but sacct is under the impression it is still running. I suspect
that this is due to me having to rebuild my database while in
production.  However, I've done this before and hadn't seen this issue
crop up.  Is there a way to remove this job from sacct? scancel does not
work on it.

-Paul Edmon-
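
A minimal sketch of how the mismatch described above (a job that sacct reports as Running while squeue does not list it) might be detected. It only reports the discrepancy and does not remove anything from the accounting database. The function names `parse_ids` and `stale_running_jobs` are hypothetical, chosen for illustration. The sketch assumes that `sacct` and `squeue` are on the PATH and that comparing plain job IDs is enough for the site in question.

```python
import subprocess


def parse_ids(text):
    """Return the set of job IDs found in command output, one ID per line."""
    ids = set()
    for line in text.splitlines():
        token = line.strip()
        if token:
            # Drop a step suffix such as "123.batch" so only the job ID remains.
            ids.add(token.split(".")[0])
    return ids


def stale_running_jobs(sacct_text, squeue_text):
    """Job IDs that accounting lists as running but the controller does not."""
    return sorted(parse_ids(sacct_text) - parse_ids(squeue_text))


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    # Jobs the accounting database believes are running right now.
    sacct_out = run(["sacct", "--allusers", "--allocations", "--noheader",
                     "--parsable2", "--state=RUNNING", "--format=JobID"])
    # Jobs the controller actually reports as running.
    squeue_out = run(["squeue", "--noheader", "--states=RUNNING",
                      "--format=%i"])
    for job_id in stale_running_jobs(sacct_out, squeue_out):
        print(job_id)
```

Any ID printed is a candidate for the kind of stale record discussed in this thread. A job that starts or finishes between the two commands can also show up, so it is worth rerunning the check before acting on the result.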
