If I understand your question, you can set it in the slurm.conf file; the
default is:
MailProg = /usr/bin/mail
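If you want something other than the default, a one-line override in slurm.conf is enough. The mailx path and the sbatch flags below are illustrative, not from the original message:

```
# slurm.conf (hypothetical override): use mailx instead of mail
MailProg=/usr/bin/mailx
```

Users still request notifications per job, e.g. `sbatch --mail-type=END,FAIL --mail-user=you@example.com job.sh`.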
From: Fanny Pagés Díaz
Reply-To: slurm-dev
Date: Wednesday, September 28, 2016 at 11:45
Make sure the slurm.conf file is identical on all nodes.
If the slurmctld is running and all of the slurmd's are running, take a look at
the slurmctld.log; it should provide some clues. If not, you might want to post
the contents of your slurm.conf file.
Phil Eckert
LLNL
From: Fany Pagés Díaz
I believe all that is happening with regard to mail is that the
slurmctld is executing the mail utility with the standard arguments. Is
mail set up on the node the slurmctld is running on? A quick test would
be to log in there and manually send yourself email.
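Something like this, run on the slurmctld host (the address is a placeholder), would confirm that mail delivery works at all:

```shell
# run on the node where slurmctld executes
echo "slurm mail test" | mail -s "test from slurmctld host" you@example.com
```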
Phil Eckert
LLNL
On 12/18/15,
A possibility might be to do this using reservations.
You could create a 5-node reservation with all concerned users having
access, then have a script run by cron that periodically checks the state
of the nodes in the reservation; if any go down, update the reservation,
replacing the down nodes.
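A rough sketch of that cron check. The node names are made up, and the filter assumes "node state" pairs such as those produced by `sinfo -h -N -o "%N %t"`:

```shell
# pick_down_nodes: print the node names whose state starts with down/drain
pick_down_nodes() {
  awk '$2 ~ /^(down|drain)/ {print $1}'
}

# In the real cron job you would feed it live data and then swap the
# affected nodes out of the reservation, e.g.:
#   sinfo -h -N -o "%N %t" | pick_down_nodes
#   scontrol update reservation=myres nodes=<replacement list>
printf 'node01 idle\nnode02 down\nnode03 drain\n' | pick_down_nodes
# -> node02 and node03, one per line
```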
The reason this has a higher permission level is that a user could game the
system by submitting a job with a 1-minute time limit, which will generally get
it started very quickly because of backfill; then they could increase it to
whatever they wanted. I believe almost all batch system
Mike,
In your slurm.conf you have Procs=1 (which is the same as CPUs=1), and Sockets
(if omitted, will be inferred from CPUs; default is 1) and CoresPerSocket
(default is 1).
So at this point the slurm.conf has a default configuration of 1 core per node.
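For comparison, a node definition with the topology spelled out explicitly might look like this (node names and counts are made up):

```
# slurm.conf: explicit topology, 2 sockets x 8 cores = 16 CPUs per node
NodeName=mike[1-4] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 CPUs=16 State=UNKNOWN
```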
Phil Eckert
LLNL
From: Michal Zielinski
If you don’t wish to do the submission from the “somepath” directory you can
use the following sbatch option to achieve what you are looking for.
-D, --workdir=directory
Set the working directory of the batch script to directory before
it is executed.
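A hypothetical invocation, plus a plain-sh sketch of what the option effectively does:

```shell
# Hypothetical sbatch usage: run job.sh with /scratch/myrun as its working directory
#   sbatch -D /scratch/myrun job.sh

# The effect is "cd before executing the script", i.e.:
run_in_dir() { ( cd "$1" && shift && "$@" ); }
run_in_dir /tmp pwd    # runs pwd with /tmp as the working directory
```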
Phil Eckert
LLNL
Hartley,
Sounds like you might be wanting srun.
If I ask for 5 nodes on our rzmerl system:
salloc -p pdebug -N 5
salloc: Granted job allocation 1966117
srun hostname
rzmerl1
rzmerl2
rzmerl4
rzmerl3
rzmerl5
Phil Eckert
LLNL
From: Hartley Greenwald
Marti,
If the job is submitted using msub, the release of the dependency would need
to be:
mjobctl -m depend=none jobid
If you use:
mjobctl -m depend= jobid
it only removes the dependency in Moab, not Slurm. This works fine if you are
using just-in-time scheduling, since the jobs only
Bill,
In addition to what Alejandro said, there is another consideration.
You indicated the top two high priority jobs and the 30 core job, I'm
assuming that the ... indicated a number of other queued jobs ahead of
the 30 core job. Also, you didn't state it, but I'm also assuming there
were
A lot of suggestions of what to check for here:
https://groups.google.com/forum/#!topic/slurm-devel/qduhQ5EbjaQ
Phil Eckert
LLNL
On 11/21/13 5:00 PM, Arun Durvasula arun.durvas...@gmail.com wrote:
Zero Bytes were transmitted or received
I see the nodes busy message only if I am trying to create a reservation on top
of another reservation that includes the same nodes. You might try adding the
overlap flag if this is the case.
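For example (the reservation name, nodes, and times here are illustrative):

```shell
# The overlap flag lets this reservation share nodes with an existing one
scontrol create reservation reservationname=maint flags=overlap \
    nodes=node[01-05] starttime=now duration=60 users=root
```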
Phil Eckert
LLNL
From: Jacqueline Scoggins jscogg...@lbl.gov
flags=ignore_jobs nodes=tnodes[32-591] starttime=now endtime=tomorrow partition=pbatch user=eckert
Phil Eckert
LLNL
From: Jacqueline Scoggins jscogg...@lbl.gov
Reply-To: slurm-dev slurm-dev@schedmd.com
Date: Tuesday, November 12, 2013 10:58 AM
of the names were not
valid. So I tried only the partition and it still did not work.
Thanks
Jackie
On Tue, Nov 12, 2013 at 11:49 AM, Eckert, Phil ecke...@llnl.gov wrote:
Jackie,
I was trying this with an earlier version of SLURM; I just built a 2.5.7 test
system
I believe you have exceeded the MaxJobCount specified in your slurm.conf,
or have reached the default of 10,000 jobs.
MaxJobCount
The maximum number of jobs SLURM can have in its active
database at one time. Set the values of MaxJobCount and MinJobAge
to ensure
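If that limit is what you are hitting, raising it is a one-line slurm.conf change (the values below are only examples):

```
# slurm.conf: allow more jobs in the active database
MaxJobCount=50000
# MinJobAge (seconds): how long completed jobs linger before being purged
MinJobAge=300
```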
Another route that could be taken is to set the DefaultTime for a
partition to 0; the small patch attached to this email will reject a job when
it has no time limit specified and the default_time limit is 0. I also
modified the ESLURM_INVALID_TIME_LIMIT to include information that the error
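The partition half of that approach is just an ordinary partition definition (names and limits illustrative); with the patch applied, a submission with no time limit would then be rejected:

```
# slurm.conf: DefaultTime=0 so jobs must supply their own time limit
PartitionName=pbatch Nodes=node[01-10] DefaultTime=0 MaxTime=24:00:00 State=UP
```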
Have you looked at sshare?
Phil Eckert
LLNL
From: Mario Kadastik mario.kadas...@cern.ch
Reply-To: slurm-dev slurm-dev@schedmd.com
Date: Tuesday, January 22, 2013 11:17 AM
To: slurm-dev slurm-dev@schedmd.com
I have scp'd it as moab.log.invalid.gz
On 12/11/12 1:00 PM, Moe Jette je...@schedmd.com wrote:
I would guess that your machine can communicate with the cluster's
head node (where the slurmctld daemon executes and creates the job
allocation), but not the compute nodes (where the slurmd daemons
In the sbatch code, it checks to see if a job name is provided; if so, it
will set the SLURM_JOB_NAME environment variable. But since the overwrite
argument of the call is 0, it doesn't do so if the variable is already
set, which is the case you are running into once the first job is
submitted.
According to Adaptive, this change was introduced in:
5_4 branch as of the .0 version changeset 7922ced7105a79a3
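That setenv(..., overwrite=0) behavior matches the shell's set-only-if-unset expansion, sketched here (the fallback value is made up):

```shell
# Mirrors setenv("SLURM_JOB_NAME", name, 0): an existing value wins,
# the fallback is used only when the variable is unset or empty
SLURM_JOB_NAME="${SLURM_JOB_NAME:-from_sbatch_arg}"
export SLURM_JOB_NAME
echo "$SLURM_JOB_NAME"
```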
Phil Eckert
LLNL
On 6/6/12 1:29 PM, Eckert, Phil ecke...@llnl.gov wrote:
In Moab 6.1 and later the Moab wiki does filter out the quotes in the data
it gets from SLURM. We
Michael,
I was curious, so I tried the:
RESOURCELIMITPOLICY:ALWAYS,EXTENDEDVIOLATION:NOTIFY,CANCEL:12:00:00
parameter on my test cluster so that I could observe the behavior, and I
also used the OverTimeLimit parameter in my SLURM test system. When the
initial time limit is reached, I see that