for sensitive data, in which there should be
no (or as few as possible) possibilities for information to leak between
jobs.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
a given
node has a reboot pending?
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
We are in the process of setting up a new cluster. It is supposed to go
in production by the end of June, and we would prefer to have slurm 2.4
on it.
Do you have any plans/ideas for when 2.4 (or at least release candidates
for 2.4) will be out?
--
Regards,
Bjørn-Helge Mevik, dr. scient
Moe Jette je...@schedmd.com writes:
We hope to have a v2.4.0-rc1 in a couple of weeks and release 2.4 a
few weeks later.
Very nice!
--
Cheers,
Bjørn-Helge Mevik
of the generic resources
that have been allocated to the job.
Which one is correct? (I'm voting on man srun. :)
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
Weight=1027
BootTime=2012-05-08T15:07:08 SlurmdStartTime=2012-05-25T10:30:10
(This is with 2.4.0-0.pre4.)
(We are planning to use cx-y instead of compute-x-y (the Rocks default)
on our next cluster, to save some typing.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services
ThreadsPerCore=1 TmpDisk=0 Weight=666
BootTime=2012-06-13T16:20:49 SlurmdStartTime=2012-07-02T16:32:31
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
] done with job
[2012-07-03T17:11:02] error: stat_jobacct for invalid job_id: 195
[2012-07-03T17:11:02] debug: _rpc_terminate_job, uid = 501
[2012-07-03T17:11:02] debug: task_slurmd_release_resources: 195
Is there something wrong here, or are we doing something wrong?
--
Regards,
Bjørn-Helge
% of the time. I can send it to you if
you like.
Yes, I'd very much like that! Jobs killed by the memory limit are quite
common on our cluster, and users get confused if there is no message
telling them why the job died.
Thanks for a very informative answer!
--
Regards,
Bjørn-Helge Mevik, dr. scient
project on Google Code called oom-detect.lua. You can browse the
code here:
Thanks!
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
8
c11-13
(There is a node failure in there, and the job failed when it finally
got to run long enough.) Apart from a short period around 21:00 on the 10th,
fewer than 7,000 of the ~10,000 cores were used.
--
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
Christopher Samuel sam...@unimelb.edu.au writes:
On 18/01/13 19:53, Bjørn-Helge Mevik wrote:
I don't know if this is the reason in your case, but note that cgroup
in slurm constrains _resident_ RAM, not _allocated_ (virtual) RAM.
Hmm, as a sysadmin that doesn't seem very useful,
Hmm
, and
then polls the queue system until the job has finished.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
, then
launch the main program, and then perhaps do some cleanup afterwards.
Thus one wouldn't want the job script itself to be run in parallel.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
, CUDA_VISIBLE_DEVICES gets the
value 0,1633906540
Is this correct? Are we doing something wrong?
(This is slurm 2.4.3, running on Rocks 6.0 based on CentOS 6.2.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Gary Brown gbr...@adaptivecomputing.com writes:
FYI, the value 1633906540 in hex is 61636f6c, which is ASCII "acol", and
usually points to some kind of buffer overrun bug.
Thanks for the tip!
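For anyone who wants to check such values themselves, the decoding is a couple of lines in Python:

```python
# Decode the mystery CUDA_VISIBLE_DEVICES value as ASCII bytes.
value = 1633906540
text = bytes.fromhex('%x' % value).decode('ascii')
print(hex(value), text)  # -> 0x61636f6c acol
```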
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
lines solved the problem. Thanks!
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Kevin Abbey kevin.ab...@rutgers.edu writes:
This sounds great. If you can share after testing it would be very
much appreciated.
Will do. (There will be some parts of it tailored to our site, but that
shouldn't be hard to remove/change.)
--
Regards,
Bjørn-Helge Mevik, dr. scient
to
time out after MessageTimeout/2 seconds, but looking at the code for
2.5.6 this seems to have changed.
Keep us posted about what you find. I'm planning to switch to 2.5.6
tomorrow, and have from time to time had problems getting the
backfilling to be fast enough.
--
Regards,
Bjørn-Helge Mevik
. So in order to allow users to specify a delay of, say,
12 hours, one must set max_switch_wait in slurm.conf to something as
large as 12 hours.
Is this the right interpretation?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Moe Jette je...@schedmd.com writes:
Yes, that is correct.
Thanks!
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ESLURM_INVALID_TIME_LIMIT
end
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Moe Jette je...@schedmd.com writes:
Quoting Bjørn-Helge Mevik b.h.me...@usit.uio.no:
Moe Jette je...@schedmd.com writes:
* Changes in Slurm 13.12.0pre1
==
Just curious: Why the sudden jump in version numbering? year.month?
Correct. We're Ubuntu fans
-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
jobs that were
_running_ (or _started_) in an interval, but I don't think it's there.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Danny Auble d...@schedmd.com writes:
It would have been nice to have the possibility to select jobs that
were
_running_ (or _started_) in an interval, but I don't think it's there.
Just ask for the state to be 'running'.
*slaps palm on head* :)
--
Regards,
Bjørn-Helge Mevik, dr. scient
:00). Running
is also considered eligible.
I agree with your comment that sacct lacks a way to filter jobs
that are actually within the time interval.
As Danny said: add --state=RUNNING. :)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University
[2013-09-20T04:50:16+02:00] debug: power_save module disabled, SuspendTime 0
[2013-09-20T04:50:16+02:00] error: Error binding slurm stream socket: Address
already in use
[2013-09-20T04:50:16+02:00] fatal: slurm_init_msg_engine_addrname_port error
Address already in use
--
Regards,
Bjørn-Helge
production cluster until we know it's safe.
If they are not needed, perhaps it would be a good idea for slurmctld to
close them when starting the prologs/epilogs?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
] attempting to run epilog
[/hpc/sbin/epilog_slurmd]
[2013-09-30T04:11:10+02:00] debug: completed epilog for jobid 3371606
[2013-09-30T04:11:10+02:00] debug: Job 3371606: sent epilog complete msg: rc = 0
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University
That would be a very interesting feature.
Similarly to what Christopher Samuel wrote, we have «hacked around» the
issue for project limits (not fairshare) by converting memory usage to
processor-equivalents and using Gold for the accounting.
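The conversion is roughly of this shape (a hypothetical sketch for illustration; the max() formula and the 4 GiB/core figure are assumptions, not our actual Gold setup):

```python
# Rough sketch of a processor-equivalent (PE) calculation for accounting.
# Assumption: a job is charged for whichever is larger, the cores it
# holds or the cores "blocked" by its memory request. The 4 GiB/core
# figure is a made-up example, not our actual configuration.

def processor_equivalents(cores: int, mem_gib: float,
                          mem_per_core_gib: float = 4.0) -> float:
    """Return the PE count for a job allocation."""
    return max(cores, mem_gib / mem_per_core_gib)

# A 4-core job asking for 32 GiB blocks 8 cores' worth of memory:
print(processor_equivalents(4, 32.0))   # -> 8.0
# A 16-core job with modest memory is charged for its cores:
print(processor_equivalents(16, 8.0))   # -> 16
```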
--
Regards,
Bjørn-Helge Mevik, dr. scient
in the right partition both jobs start as they should.
Sorry for not checking well enough what I did!
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Just a short note about terminology. I believe processor equivalents
(PE) is a widely used term for this. It is at least what Maui and Moab
use, if I recall correctly. The resource*time would then be PE seconds
(or hours, or whatever).
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department
Thanks!
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
.noarch
# rpm -q check
check-0.9.8-1.1.el6.x86_64
Is there something else we are missing?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
be nice if the node-global destinations could be
configurable, instead of being hard-coded in the script (or at least be
set at the top of the script). For instance, on our system, the
node-global file systems are /work and /cluster, not /scratch and /home.
--
Regards,
Bjørn-Helge Mevik, dr. scient
00:01:07 01:03.980 00:02.207 01:06.187
43  COMPLETED 00:01:08 01:05.230 00:02.173 01:07.403
43.batch COMPLETED 00:01:08 01:05.230 00:02.173 01:07.403
i.e., time spent in subprocesses is reported.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research
this?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Thanks for the tip!
We actually already have a setup where srun
--ntasks=$SLURM_JOB_NUM_NODES /bin/true is run at the start of every
job, so we're definitely going to look into this.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
.
In other core dumps of 14.03.9, the g_qos_count was 241 or 233, while
_bitstr_bits(tmp_qos_bitstr) was still 26.
Any help in figuring out what goes wrong (or how to fix it :) is
appreciated!
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
the .../uid_NNN directories are
removed.
Does anyone know what these messages mean? Should we just ignore them?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
running the test suite
for versions 14.03.8--14.03.10, we didn't upgrade to .10.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
I second this wish. :)
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Thanks! I'll try to apply that patch.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
the memory jobs can use. Our cgroup.conf
contains:
CgroupMountpoint=/dev/cgroup
CgroupAutomount=yes
ConstrainSwapSpace=yes
and our slurm.conf contains:
TaskPlugin=task/cgroup
ProctrackType=proctrack/cgroup
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
--
Regards,
Bjørn-Helge
time of child processes.”
In my experience, that description might not be accurate. It seems
child processes are also included, as long as the job doesn't time out. Here
is an email I wrote about it last year:
From: Bjørn-Helge Mevik b.h.me...@usit.uio.no
Subject: [slurm-dev] UserCPU etc
.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Ok, thanks.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
openmpi itself?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Thanks!
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
for
accounting.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
. Perhaps this has changed in later versions?
Also, nothing is ever easy: we want to account not CPU hours, but PE
(processor equivalents) hours.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Christopher Samuel sam...@unimelb.edu.au writes:
http://karaage.readthedocs.org/en/latest/introduction.html
Karaage looks interesting for managing projects and users. Can it
manage usage limits?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
makes Gold quite slow, so we have had to add quite a lot of
error checking and handling in the prolog and epilog scripts.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
LIMIT ***
That usually means the job tried to run longer than its --time
specification.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
files? They should only be needed with static libraries, which
slurm does _not_ install.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Thanks. Nice to know!
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
we activated
checkpointing. When slurmctld started, the checkpointing plugin
expected some extra data in the job states, which obviously wasn't
there, and slurmctld decided the data was invalid and killed all jobs.
(I don't know if this is still a problem.)
--
Regards,
Bjørn-Helge Mevik, dr
d get
either all nodes with IB1 or all with IB2. Search for "Matching OR" in
the sbatch man page for details. (We used this on our previous cluster,
which had two different IB networks.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
e slurm-devel rpm not to include these files.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
r job shares
the same data between several processes, the shared space will be
counted once for each process(!). Cgroups seems to count the shared
data only once. So if a process is killed by oom instead of by slurm,
it is probably not due to shared data.
--
Regards,
Bjørn-Helge Mevik
the qos is
enough. We have the partition because our lowpri jobs are allowed to
run on special nodes (like hugemem or accelerator nodes) that normal
jobs are not allowed to use.)
I hope this made sense to you. :)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
oups.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ush the cache when a
process needs more memory instead of killing the process. If I'm
correct, oom will _not_ kill a job due to cached data.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
"Wiegand, Paul" <wieg...@ist.ucf.edu> writes:
> This worked. Thank you Bjørn-Helge.
You're welcome! :)
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
automatically requeue jobs just
before they time out.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Test 14.10 in the test suite (of slurm 15.08.8, at least) uses
$sinfo -tidle -h -o%n
to find idle nodes. This only works if NodeHostname == NodeName on the
nodes. The following should work regardless of this:
$scontrol show hostnames \$($sinfo -tidle -h -o%N)
--
Regards,
Bjørn-Helge
ool?
- A locally developed solution?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
sbatch: option requires an argument -- 'J'
Submitted batch job 14221261
$
A more consistent behaviour would have been nice. My suggestion is:
report error and fail to submit the job.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
<hm...@t-hamel.fr> writes:
> We are looking for comments and feedback on this proposed behavior
[...]
> +#define HEALTH_RETRY_DELAY 10
Have you thought about using the health_check_interval instead? Or make
it a separate configurable option?
--
Regards,
Bjørn-Helge Mevik
+1
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
be
Just a note: I tried this (for a different reason), but found out it
didn't have any effect (gather the output to a log file and look at the
gcc lines). However, if I did -D '%with_cflags CFLAGS="-O0 -g3"' (i.e.,
removed the initial "_"), it had the desired effect.
--
Regar
I just noticed that as of 14.11.6, optimization is turned off (-O0) by
default when building slurm.
Is there any reason not to use --disable-debug when building slurm for a
production cluster?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ing or a default "@localhost".
A MailDomain config parameter was added in Slurm 17.02.
A different option would be to configure your sendmail to accept
domain-less mails (and perhaps add a default domain itself).
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, Universi
ngs, this sets up a signal
handler for the EXIT "signal", which prints out resource usage. As long
as users remember to source the setup file, they get the usage
statistics at the bottom of their stdout file. Not very elegant, but it
works.
--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
To me, this sounds like a job for a job submit plugin, for instance
job_submit.lua. That way you could reject the job before it gets
submitted into the queue.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Jordan Willis <jwillis0...@gmail.com> writes:
>Thank you,
>Can you confirm that this will take an update from SLURM 14.11.15 to
>current?
I never ran 14.11, but in 14.03, you can use GrpCPUs=1000 instead of
GrpTRES=cpu=1000.
--
Regards,
Bjørn-Helge Mevik, dr. sci
s per account; if
I have access to more than one account, I can use 1000 CPUs in each
account.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
I'd guess it
should have been possible to use scancel in PrologSlurmctld also in
15.08.12.
Does anyone know if this is an intentional change (and SchedMD just
forgot to update the docs) or a bug?
(I haven't found anything relevant in the NEWS file or on
bugs.schedmd.com.)
--
Regards,
Bjørn-Helge
There is a plugin under development, that will/might provide those
features. It was presented at SLUG 16:
http://slurm.schedmd.com/SLUG16/MCS.pdf
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ry usage, etc) commands to generate
> my report. Does this approach make sense or are there better
> alternatives.
sacct can also give you the submit time, start time, end time and
elapsed time.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Check out the thread on this list about a week ago, titled "Unrestricted
use of a node". (In short, --exclusive with --mem=0 or --mem-per-cpu=0
might be more or less what you want.)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
use the "Proportional set size" [1]
(JobAcctGatherParams=UsePss), which is what cgroups use (I believe), and
sounds like the best estimate to me.
[1] https://en.wikipedia.org/wiki/Proportional_set_size
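For the curious, the figure UsePss reports can be approximated by summing the Pss fields of /proc/&lt;pid&gt;/smaps (a Linux-only sketch for the current process; I believe Slurm's gatherer reads the same files per task):

```python
# Sketch: compute a process's PSS by summing the Pss: fields of
# /proc/<pid>/smaps -- the per-page proportional accounting that
# JobAcctGatherParams=UsePss relies on. Linux-only.

def pss_kib(pid='self'):
    """Sum the Pss: lines (in KiB) of /proc/<pid>/smaps."""
    total = 0
    with open('/proc/%s/smaps' % pid) as f:
        for line in f:
            # "Pss:   12 kB"; skip Pss_Dirty:/Pss_Anon: etc.
            if line.startswith('Pss:'):
                total += int(line.split()[1])
    return total

print(pss_kib(), 'KiB')
```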
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
mber of
cores, I don't know.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
which cpus it is allowed to use.)
This is on 15.08.12. YMMV.
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
qw(:all SLURMDB_ADD_USER);
$what = SLURMDB_ADD_USER();
just gives the error "SLURMDB_ADD_USER is not a valid Slurmdb macro"
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
file names to a dot file in $SCRATCH)
- The Epilog copies any registered files back to the job submit dir (it
uses "su - $USER" when doing this).
- The epilog deletes the directory
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ir job,
remove the "fixme" feature from the node, and then request themself to
be requeued.
Prior to submitting the jobs, we add the "fixme" feature to all nodes
needing maintenance.
(In reality, our setup is a little more complex, since it includes
reinstalling the OS on the nodes,
e to "RESUME"), so we will be looking at
this feature again. Thanks for the tip! :)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
a
> manner that's predictable, both for the programmer and for the user.
It is by design, because people often need to give arguments or options
to their jobscript, e.g.,
sbatch --time=1-0:0:0 myjob.sh inputfile
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
; creation (return from sbatch), but looks like this assumption is wrong?
That is right, that is wrong. :)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
ool
> https://www.nsc.liu.se/~kent/python-hostlist/ by Kent Engström at NSC. It's
> simple to install this as an RPM package, see
> https://wiki.fysik.dtu.dk/niflheim/SLURM#expanding-host-lists
For the simple case you show, you could just use
$ scontrol show hostnames a[095,097-098]
a09
show the nodes and the pids.
Not that I know of, but it should be possible to script.
> And how to parse the nodelist like "cn[11033,11069],gn[1103-1120]" ?
scontrol show hostnames cn[11033,11069],gn[1103-1120]
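If scontrol isn't handy, the bracket syntax can also be expanded in a few lines of Python (a quick sketch, not a full reimplementation of Slurm's hostlist code; it assumes the simple prefix[ranges] form shown above):

```python
import re

def expand_nodelist(nodelist):
    """Expand a Slurm-style nodelist like 'cn[11033,11069],gn[1103-1120]'."""
    hosts = []
    # Match either 'prefix[spec]' or a plain hostname, comma-separated.
    for m in re.finditer(r'([^,\[]+)(?:\[([^\]]+)\])?(?:,|$)', nodelist):
        prefix, spec = m.group(1), m.group(2)
        if spec is None:
            hosts.append(prefix)
            continue
        for part in spec.split(','):
            if '-' in part:
                lo, hi = part.split('-')
                width = len(lo)          # preserve zero-padding
                for i in range(int(lo), int(hi) + 1):
                    hosts.append('%s%0*d' % (prefix, width, i))
            else:
                hosts.append(prefix + part)
    return hosts

print(expand_nodelist('cn[11033,11069],gn[1103-1105]'))
# -> ['cn11033', 'cn11069', 'gn1103', 'gn1104', 'gn1105']
```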
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Com
efore modifying anything.
In our setup, the first option is preferable: just putting it on the
queue and letting it wait until its turn. But of course, there are other
setups where the second option would be best. Could you perhaps make it
configurable, so a site can choose?
--
Regards,
Bjørn-Helge Mevik,
Life=0
to turn off the decay of historic usage. Then you can set
FairshareWeight to 0 and use the Grp*Mins parameters to set hard limits.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
something wrong with how my partitions are defined?
That sounds unlikely, IMO.
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
s the signal
arrives. I got bitten by this behaviour trying to do exactly the same
thing you did. :)
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo