On 9/20/23 01:39, Feng Zhang wrote:
Restarting the slurmd daemon on the compute node should work, if the
node is still online and otherwise healthy.

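Probably not. If the filesystem used by the job is hung, the job's processes are blocked in uninterruptible I/O and cannot be killed, so restarting slurmd will not clear them. In that case the node will probably have to be rebooted and the filesystem checked.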
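
A rough sketch of how one might tell the two cases apart before deciding between a slurmd restart and a reboot. It assumes shell access to the compute node (awn-047 in the original post), procps `ps`, and slurmd running as a systemd unit named `slurmd`; the helper name `d_state_procs` is made up for illustration.

```shell
# Hypothetical helper: keep only processes in uninterruptible sleep
# (STAT starting with D), the usual sign of a task blocked on a hung
# filesystem. Expects "PID STAT USER COMMAND" lines on stdin.
d_state_procs() {
    awk '$2 ~ /^D/ { print }'
}

# On the compute node:
#   ps -eo pid,stat,user,comm --no-headers | d_state_procs
#
# No output: nothing is blocked on I/O, so restarting slurmd may be enough
# (assumes a systemd unit named slurmd):
#   systemctl restart slurmd
#
# Output listing the job's processes: they cannot be killed, and a reboot
# is likely needed. Some sites then mark the node down and resume it:
#   scontrol update nodename=awn-047 state=down reason="hung filesystem"
#   scontrol update nodename=awn-047 state=resume
```

Treat this as a diagnostic aid only. A process can sit in D state briefly during normal I/O, so it is worth running the check a few times before concluding that the filesystem is hung.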


On Tue, Sep 19, 2023 at 8:03 AM Felix <fe...@itim-cj.ro> wrote:


I have a job on my system which has been running longer than its time
limit, for more than 4 days.

1808851     debug  gridjob  atlas01 CG 4-00:00:19      1 awn-047

I'm trying to cancel it:

[@arc7-node ~]# scancel 1808851

I get no error message, as if the job had been canceled, but when I query
the job, it is still there:

[@arc7-node ~]# squeue | grep awn-047
             1808851     debug  gridjob  atlas01 CG 4-00:00:19 1 awn-047

Is there anything else I can do to kill the job?
