Re: [slurm-users] fail job

2020-06-30 Thread Alberto Morillas, Angelines
ob (Gestió Servidors) -- Message: 1 Date: Tue, 30 Jun 2020 08:55:01 + From: Gestió Servidors To: "slurm-users@lists.schedmd.com" Subject: Re: [slurm-users] fail job Message-ID: Con

Re: [slurm-users] fail job

2020-06-30 Thread Gestió Servidors
Can you also post the slurmctld log file from the server (controller)?

Re: [slurm-users] fail job

2020-06-30 Thread Durai Arasan
Hi,
Can you post the output of the following commands on your master node?
sacctmgr show cluster
scontrol show nodes
Best,
Durai Arasan
Zentrum für Datenverarbeitung Tübingen
On Tue, Jun 30, 2020 at 10:33 AM Alberto Morillas, Angelines <angelines.albe...@ciemat.es> wrote:
> Hi,
>
> We

[slurm-users] fail job

2020-06-30 Thread Alberto Morillas, Angelines
Hi,
We have Slurm version 18.08.6. One of my nodes is in drain state:
Reason=Kill task failed [root@2020-06-27T02:25:29]
On the node, slurmd.log shows:
[2020-06-27T01:24:26.242] task_p_slurmd_batch_request: 963771
[2020-06-27T01:24:26.242] task/affinity: job 963771 CPU input mask for
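
For anyone comparing the drain time against the slurmd.log timestamps, here is a minimal Python sketch that splits a drain reason of the form quoted above into its reason text, user, and timestamp. The `DrainReason` type and `parse_drain_reason` function are hypothetical names made up for this illustration, and the pattern assumes the `Reason=<text> [<user>@<timestamp>]` layout seen in this message; other Slurm versions may print it differently.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class DrainReason(NamedTuple):
    """Hypothetical container for the parts of a drain reason string."""
    reason: str
    user: str
    timestamp: datetime


# Assumed layout: "Reason=<text> [<user>@<YYYY-MM-DDTHH:MM:SS>]"
_REASON_RE = re.compile(
    r"Reason=(?P<reason>.*?)\s*\[(?P<user>[^@\]]+)@(?P<ts>[^\]]+)\]"
)


def parse_drain_reason(text: str) -> Optional[DrainReason]:
    """Return the parsed drain reason, or None if the text has no such field."""
    match = _REASON_RE.search(text)
    if match is None:
        return None
    return DrainReason(
        reason=match.group("reason"),
        user=match.group("user"),
        timestamp=datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S"),
    )


# The reason string exactly as quoted in the message above.
sample = "Reason=Kill task failed [root@2020-06-27T02:25:29]"
parsed = parse_drain_reason(sample)
print(parsed)
```

The parsed timestamp can then be compared with the bracketed times in slurmd.log to narrow down which job was running when the node was drained.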