[slurm-users] epilog when job is killed for max time

2018-11-07 Thread Noam Bernstein
Hi slurm users - I’ve been looking through the slurm prolog/epilog manuals, but haven’t been able to figure out if there’s a way to get an epilog script (requested by the user) to run when a job is killed for running out of time, and have the epilog script be able to detect that (through an env

Re: [slurm-users] srun problem -- Can't find an address, check slurm.conf

2018-11-07 Thread Paul Edmon
Yeah, these are frustrating ones to troubleshoot.  When I have seen this in the past it was usually a missing forward or reverse entry in DNS that caused the problem.  You could try dialing up the verbosity all the way and see what you can spot.  Else I might recommend dropping a ticket into the
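One way to test the forward/reverse DNS suggestion above is a small script run on the submit host. This is only a sketch, not something from the thread: `check_dns` is a hypothetical helper name, and the resolver arguments are injectable purely so the function can be exercised without a live network. It uses the system resolver via Python's standard `socket` module, so the result also reflects /etc/hosts where the host is configured to consult it.

```python
import socket
from typing import Callable, Optional, Tuple


def check_dns(
    host: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (ip, reverse_name); an entry is None when that lookup fails.

    check_dns is a hypothetical helper for illustration only.
    """
    try:
        # Forward lookup: node name -> IPv4 address.
        ip = forward(host)
    except OSError:  # socket.gaierror and socket.herror both derive from OSError
        return None, None
    try:
        # Reverse lookup: address -> primary host name.
        name = reverse(ip)[0]
    except OSError:
        return ip, None
    return ip, name
```

Calling `check_dns("n38")` (the node name from the original report) on each affected machine returns `None` in the first position if the forward lookup fails and `None` in the second if only the reverse lookup is missing. Comparing the output for a working node and a failing node may narrow down which entry is absent.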

Re: [slurm-users] srun problem -- Can't find an address, check slurm.conf

2018-11-07 Thread Scott Hazelhurst
Thanks, Paul, yes, it does seem a likely cause, but I can’t see the problem. All machines have the same /etc/hosts file and the worker nodes are just listed one after each other. I’ve checked that the problem nodes are there — no obvious difference. I’ve checked that the IP address is correct.

[slurm-users] srun problem -- Can't find an address, check slurm.conf

2018-11-07 Thread Scott Hazelhurst
Dear list, We have a relatively new installation of SLURM. We have started to have a problem with some of the nodes when using srun: [scott@cream-ce ~]$ srun --pty -w n38 hostname srun: error: fwd_tree_thread: can't find address for host n38, check slurm.conf srun: error: Task launch for

Re: [slurm-users] Accounting - running with 'wrong' account on cluster

2018-11-07 Thread Chris Samuel
On Wednesday, 7 November 2018 3:46:01 PM AEDT Brian Andrus wrote: > Ah. I was getting ahead of myself. I used 'limits' and I have no limits > configured, only associations. Changed it to just associations and all is > good. Excellent! Well spotted.. -- Chris Samuel : http://www.csamuel.org/

Re: [slurm-users] Accounting: set default account with no access

2018-11-07 Thread Antony Cleave
Try adding a default account and then set a limit of 0 jobs on it. From memory I think it is grpjobs. This is the maximum allowed jobs this account can have queued. This requires limits to be enforced in accountingstorageenforce. Or you could simply add the account to the denyaccount list for

Re: [slurm-users] Accounting: set default account with no access

2018-11-07 Thread Marcin Stolarek
I had exactly the same requirement - you can find my notes from it here: https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/ cheers, Marcin On Tue, 6 Nov 2018 at 20:48, Sam Hawarden wrote: > Hi Yair, > > > You can set maxsubmitjob=0 on an account. > > > The error