[slurm-dev] Re: QOS, Limits, CPUs and threads - something is wrong?

2016-10-03 Thread Lachlan Musicman
On 3 October 2016 at 23:26, Douglas Jacobsen wrote:
> Hi Lachlan,
>
> You mentioned your slurm.conf has:
> AccountingStorageEnforce=qos
>
> The "qos" restriction only enforces that a user is authorized to use a
> particular qos (in the qos string of the association in the

[slurm-dev] Re: Send notification email

2016-10-03 Thread Bill Broadley
This might work for you:
slurm.conf:MailProg=/usr/bin/msmtp.wrapper
$ cat /usr/bin/msmtp.wrapper
#!/bin/bash
mytmp=$(mktemp /tmp/slurm.email.XX)
echo "From: slurm@" > $mytmp
echo Subject: $2 >> $mytmp
echo To: $3 >> $mytmp
echo >> $mytmp
echo $2 >> $mytmp
cat $mytmp | /usr/bin/msmtp -a
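The message-building part of a wrapper like this can also be sketched without a temporary file. This is only an illustration, not the poster's script: the function name build_message is hypothetical, the sender is passed as an argument because the address in the post is cut off, and the subject and recipient correspond to the $2 and $3 the wrapper above uses.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original post).
# Usage: build_message SENDER SUBJECT RECIPIENT
# Prints a minimal mail message: three headers, a blank line,
# then the subject repeated as the body, as the wrapper above does.
build_message() {
    printf 'From: %s\n' "$1"
    printf 'Subject: %s\n' "$2"
    printf 'To: %s\n' "$3"
    printf '\n'
    printf '%s\n' "$2"
}
```

Quoting the arguments keeps subjects that contain spaces intact, and the output can be piped straight into whichever mail client the wrapper calls.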

[slurm-dev] NODE_FAIL triggered by `not found BatchStartTime after startup`

2016-10-03 Thread John Lin
Background: I am running into issues where a job is cancelled and re-queued. When I look into the slurmctld.log, I see the following relevant lines:
[2016-09-30T11:55:24.555] _slurm_rpc_submit_batch_job JobId=79707529 usec=560
[2016-09-30T12:47:15.326] Recovered JobID=79707529 State=0x0

[slurm-dev] Re: Send notification email

2016-10-03 Thread Fanny Pagés Díaz
Hello and thanks in advance, I will try to explain my current configuration in detail. I have Slurm running on the same HPC cluster server, but I need to send all notifications through my corporate mail server, which runs on another server in my internal network. I do not need to use the local

[slurm-dev] Re: QOS, Limits, CPUs and threads - something is wrong?

2016-10-03 Thread Douglas Jacobsen
Hi Lachlan,
You mentioned your slurm.conf has:
AccountingStorageEnforce=qos
The "qos" restriction only enforces that a user is authorized to use a particular qos (in the qos string of the association in the slurm database). To enforce limits, you also need to include "limits". If you want to prevent
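As a minimal sketch of the point about limits, and assuming the rest of the accounting setup is already in place, the slurm.conf line would list both values. The fragment below is an illustration rather than the poster's actual configuration:

```
# slurm.conf (fragment, illustrative)
# "qos" checks that the user may use the requested QOS;
# "limits" makes the configured limits actually enforced.
AccountingStorageEnforce=limits,qos
```

A changed slurm.conf generally has to be picked up by slurmctld before it takes effect.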