Hi Maik,
On Fri, Nov 3, 2017 at 2:14 AM, Maik Schmidt wrote:
> It is my understanding that when ConstrainDevices is not set to "yes", SLURM
> uses the so called "Minor Number" (nvidia-smi -q | grep Minor) that is the
> number in the device name (/dev/nvidia0 -> ID 0
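That minor-number mapping can be checked straight from the device node. A minimal sketch; /dev/null (char device 1:3 on Linux) stands in for /dev/nvidia0 only so it runs on a machine without GPUs:

```python
import os

def device_minor(path):
    """Return the minor device number of a device node (e.g. /dev/nvidia0 -> 0)."""
    return os.minor(os.stat(path).st_rdev)

# On a GPU node, device_minor("/dev/nvidia0") should match the "Minor Number"
# reported by nvidia-smi -q. /dev/null (major 1, minor 3) illustrates the call:
print(device_minor("/dev/null"))
```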
Hi Dave,
On Fri, Oct 27, 2017 at 3:57 PM, Dave Sizer wrote:
> Kilian, when you specify your CPU bindings in gres.conf, are you using the
> same IDs that show up in nvidia-smi?
Yes:
$ srun -p gpu -c 4 --gres gpu:1 --pty bash
sh-114-01 $ cat /etc/slurm/gres.conf
name=gpu
On Fri, Oct 27, 2017 at 12:45 PM, Dave Sizer wrote:
> Also, supposedly adding the "--accel-bind=g" option to srun will do this,
> though we are observing that this is broken and causes jobs to hang.
>
> Can anyone confirm this?
Not really, it doesn't seem to be hanging for
Hi Michael,
On Fri, Oct 27, 2017 at 4:44 AM, Michael Di Domenico
wrote:
> as an aside, is there some tool which provides the optimal mapping of
> CPU id's to GPU cards?
We use nvidia-smi:
-- 8<
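On a node with the NVIDIA driver installed, the topology subcommand prints a matrix with a per-GPU CPU-affinity column, which is the information needed for the CPUs= entries in gres.conf (the exact invocation and column name here are from memory, not from the snipped output above):

```
$ nvidia-smi topo -m
# the "CPU Affinity" column gives the CPU ranges to use for CPUs= in gres.conf
```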
Hi Dave,
On Wed, Oct 25, 2017 at 9:23 PM, Dave Sizer wrote:
> For some reason, we are observing that the preferred CPUs defined in
> gres.conf for GPU devices are being ignored when running jobs. That is, in
> our gres.conf we have gpu resource lines, such as:
>
> Name=gpu
On Thu, Aug 10, 2017 at 10:31 AM, Kilian Cavalotti
<kilian.cavalotti.w...@gmail.com> wrote:
> Do you use cgroups in your Slurm setup with pam_systemd on nodes? And
> if so, did you notice any issue with cgroups?
For what it's worth, I just checked again with Slurm 17.02 and CentOS
Hi Bill,
On Thu, Aug 10, 2017 at 5:33 AM, Bill Barth wrote:
> If you add the same line from /etc/pam.d/system-auth (or your OS’s
> equivalent) to /etc/pam.d/slurm, then srun- and sbatch-initiated shells and
> processes will also have the directory properly set up.
Hi Ole,
On Fri, Jun 23, 2017 at 1:26 AM, Ole Holm Nielsen
wrote:
> Yes, ClusterShell has indeed lots of features and compares favorably to
> PDSH. I've added a brief description in my Slurm Wiki
> https://wiki.fysik.dtu.dk/niflheim/SLURM#clustershell, please comment
On Thu, Jun 22, 2017 at 2:11 AM, Kent Engström wrote:
> slightly off topic, but if you are willing to install and use an external
> program that is not part of SLURM itself, I might perhaps be allowed to
> advertise the python-hostlist package?
And I'd like to also advertise the
Hi Barry,
On Thu, Jun 15, 2017 at 9:16 AM, Barry Moore wrote:
> Does anyone have a script or knowledge of how to query wait times for Slurm
> jobs in the last year or so?
With the help of histogram.py from
https://github.com/bitly/data_hacks, you can have a one-liner:
$
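The one-liner got cut above; as a rough sketch of the same idea in Python — compute wait times from Submit/Start pairs as produced by something like `sacct -X -n -P -o Submit,Start` (the sacct invocation is an assumption, and the sample lines stand in for real output):

```python
from datetime import datetime

# Sample lines in the pipe-separated format of: sacct -X -n -P -o Submit,Start
sample = """2017-06-01T10:00:00|2017-06-01T10:05:00
2017-06-01T11:00:00|2017-06-01T12:30:00"""

def wait_seconds(line):
    """Wait time of one job: seconds between submission and start."""
    submit, start = (datetime.strptime(t, "%Y-%m-%dT%H:%M:%S")
                     for t in line.split("|"))
    return (start - submit).total_seconds()

waits = [wait_seconds(l) for l in sample.splitlines()]
print(waits)  # [300.0, 5400.0]
```

The resulting list of wait times is what would then be piped into histogram.py.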
On Wed, Apr 19, 2017 at 12:52 AM, gaoxinglong9...@163.com
wrote:
> Hi:
> I have 5 nodes and there are 4 NVIDIA K80 GPUs on each of them. Now I want
> to use 6 GPUs to execute some tasks: 4 on the first node and 2 on the
> second node. Can I do this by using
Hi Janne,
On Thu, Apr 13, 2017 at 1:32 AM, Janne Blomqvist
wrote:
> Should work as of 16.05 unless you have some very peculiar setup. IIRC I
> submitted some patch to get rid of the enumeration entirely, but
> apparently SchedMD has customers who have multiple groups
Hi Wensheng,
On Thu, Apr 13, 2017 at 6:23 AM, Wensheng Deng wrote:
> Hi, several months ago when I started learning Slurm and reading through the
> web pages, I made this picture to help myself understanding the *prolog and
> *epilog interactions with job steps. Please see the
On Thu, Dec 15, 2016 at 11:47 PM, Douglas Jacobsen wrote:
>
> There are other good reasons to use jobacct_gather/cgroup, in particular if
> memory enforcement is used, jobacct_gather/linux will cause a job to be
> terminated if the summed memory exceeds the limit, which is OK
On Tue, Nov 1, 2016 at 7:34 AM, Taras Shapovalov
wrote:
> Yeah, PrivateData does not really help here. It would be useful to see
> PrivateData options like 'group' or 'account' in the future.
I'd suggest submitting this as an enhancement request at
I guess the best way is to have the same file on all the nodes. Even
if each node has a different Gres configuration, by using the
NodeName= parameter, each node will find its own relevant
configuration in the file.
Much easier to manage this way.
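A sketch of such a shared file (node names, GPU types, and device paths are made up for illustration):

```
# Same gres.conf pushed to every node; NodeName= scopes each line.
NodeName=gpunode[01-04] Name=gpu Type=k80 File=/dev/nvidia[0-3]
NodeName=viznode[01-02] Name=gpu Type=gtx980 File=/dev/nvidia0
```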
Cheers,
--
Kilian
Hi Giuseppe,
It needs to be on every single node in your Slurm cluster:
http://slurm.schedmd.com/gres.conf.html
"""
DESCRIPTION
gres.conf is an ASCII file which describes the configuration of
generic resources on each compute node. Each node must contain a
gres.conf file if generic resources
On Fri, Aug 12, 2016 at 2:26 PM, Ryan Novosielski wrote:
> Curious, to anyone reading this — anyone think that this could be the same
> bug: https://bugs.schedmd.com/show_bug.cgi?id=2493
Do you see any sign of a segfault in the node's dmesg?
Cheers,
--
Kilian
On Thu, Aug 11, 2016 at 9:44 PM, Ryan Novosielski wrote:
> [pid 11767]
> open("/sys/fs/cgroup/devices/slurm/uid_109366/job_5377709/devices.allow",
> O_WRONLY) = 10
> [pid 11767] write(10, "501835 rwm", 10) = -1 EINVAL (Invalid argument)
> [pid 11767] close(10)
On Thu, Aug 11, 2016 at 12:46 PM, Ryan Novosielski wrote:
> I’ll try adding the Gres debugging, but is there some way to figure out what
> this alleged device “819275” is (this number will change with each job).
Weird, indeed. /dev/nv* devices should be 195:x, and slurmd
Hi Ryan,
You probably shouldn't have /dev/nvidia* devices listed in
cgroup_allowed_devices_file.conf, Slurm will automatically manage them
and add them to the list of authorized devices in a cgroup when a job
starts. For reference, our cgroup_allowed_devices_file.conf contains
this:
/dev/null
Hi Tom,
On Tue, Jun 21, 2016 at 5:44 AM, Tom Deakin wrote:
> I’m having trouble getting SLURM to choose the 2nd GPU on this node.
> If I then run srun --gres=gpu:gtx580 I get CUDA_VISIBLE_DEVICES=0
> If I also run srun --gres=gpu:gtx680 I get CUDA_VISIBLE_DEVICES=0
On Tue, May 10, 2016 at 9:54 AM, Ryan Novosielski wrote:
> Problem is that the nodes come back to service without running NHC (and
> are idle for the number of seconds required to be assigned work,
> whatever tiny amount that is). There's a SLURM 16.05 fix to make sure
>
Hi there,
Did you guys try LBNL's NHC?
https://github.com/mej/nhc
You can set it up to check filesystem mounts, and Slurm has hooks to
run it to verify that nodes are ready for production.
Cheers,
--
Kilian
Hi Nicholas,
On Mon, May 9, 2016 at 6:33 AM, Eggleston, Nicholas J.
wrote:
> Given that I'm writing in Python right now, Michael's solution looks freaking
> brilliant. Python really is the best all around language, my apologies to C.
I should have mentioned it, but
And there's the awesome clustershell [1], which outperforms (and has
more features than) anything I know of. It can do this, and much more:
$ nodeset -e edrcompute-42-[12-14,16]
edrcompute-42-12 edrcompute-42-13 edrcompute-42-14 edrcompute-42-16
$ nodeset -e -S '\n' edrcompute-42-[12-14,16]
Hi Daniel,
On Wed, Feb 3, 2016 at 3:33 AM, Daniel Letai wrote:
> The question is - does slurm also use the dev files to track the
> availability of the cards?
>
> I do not wish to drain any nodes with failing cards - just let slurm know
> about this dynamically so jobs
On Wed, Mar 2, 2016 at 10:12 AM, wrote:
> We want to introduce a new behavior in the way slurmd uses the
> HealthCheckProgram. The idea is to avoid a race condition between the first
> HealthCheckProgram run and the node accepting jobs. The slurmd daemon will
> initialize and
Hi Michael,
This is not currently possible, but there is a feature request open for
it. See http://bugs.schedmd.com/show_bug.cgi?id=1725 for
details.
Cheers,
--
Kilian
Hi all,
I would like to revive this old thread, as we've been bitten by this
also when moving from 14.11 to 15.08.
On Mon, Oct 5, 2015 at 4:38 AM, Bjørn-Helge Mevik wrote:
> We have verified that we can compile openmpi (1.8.6) against slurm
> 14.03.7 (with the .la files
On Thu, Feb 4, 2016 at 2:56 PM, wrote:
> They have already been removed for the next major release (version 16.05).
> See:
> https://github.com/SchedMD/slurm/commit/a49ce346ff1deda34865da45f9958df23158dff7
Very good, thanks Moe!
Cheers,
--
Kilian
Hi Ewan,
My bet is that one of the job resources is entirely consumed by the
first step, so the 2nd one waits in queue.
It's likely memory; maybe you have a DefMemPerCpu setting in your
slurm.conf? You can try to request, say, 4G for your whole job and then
2G for each srun step; they should both
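A hypothetical batch script along those lines (the step names are placeholders, and the 4G/2G values are just the figures from above):

```
#!/bin/bash
#SBATCH --mem=4G              # memory for the whole job
srun --mem=2G ./step_one &    # each step requests half the allocation
srun --mem=2G ./step_two &
wait                          # both steps should run concurrently
```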
Hi Chris,
On Sun, Nov 29, 2015 at 9:43 PM, Christopher Samuel
wrote:
> We're looking at seeing if we can combine fair share with our existing
> quota system that uses GrpCPUMins.
>
> However, for fair share a decay factor is strongly suggested and I worry
> that there is
Hi Vsevolod,
On Tue, Nov 17, 2015 at 11:43 PM, Vsevolod Nikonorov
wrote:
> Is it possible to limit a number of nodes allocated simultaneously to a given
> user?
Yes, that would be the MaxNodesPerUser QOS limit.
Please note that there's some limitations here: if 2 jobs
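For reference, a command sketch for setting that limit (the 'normal' QOS name and the value of 4 are hypothetical):

```
$ sacctmgr modify qos where name=normal set MaxNodesPerUser=4
$ sacctmgr show qos format=Name,MaxNodesPerUser
```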
On Tue, Sep 8, 2015 at 5:01 AM, Marcin Stolarek
wrote:
> using specified mountpoint, but... that's not a real IOPS threshold. Currently
> I don't know of any Linux mechanism that allows limiting a process to a
> specified number of I/O operations per second. On our side we've
Oh I don't think it's specific to Lustre. You can limit I/O on any
kind of filesystem using the blkio controller in a cgroup. See
https://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/
for example.
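The blkio throttle files take lines of the form "major:minor bytes_per_second". A minimal sketch of building such a line; /dev/null stands in for a real block device so the snippet runs anywhere, and the cgroup path in the comment is illustrative:

```python
import os

def blkio_throttle_line(device, bytes_per_sec):
    """Format a 'major:minor bps' line for blkio.throttle.read_bps_device."""
    st = os.stat(device)
    return f"{os.major(st.st_rdev)}:{os.minor(st.st_rdev)} {bytes_per_sec}"

# On a real node this line would be written into the job's cgroup, e.g.:
#   /sys/fs/cgroup/blkio/slurm/uid_N/job_M/blkio.throttle.read_bps_device
print(blkio_throttle_line("/dev/null", 1048576))  # 1:3 1048576
```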
Cheers,
--
Kilian
Hi Eric,
If you use slurmdbd, that usually means you have runaway jobs in the
Slurm DB, i.e. jobs that are not running anymore (they don't show up in
squeue), but don't have an end date and/or are still considered
running in sacct.
Phil Eckert posted a perl script to detect such jobs some time ago:
On Tue, Jun 9, 2015 at 6:19 AM, Roche Ewan ewan.ro...@epfl.ch wrote:
> It’ll be interesting to see how many codes break if we get the chance to
> change the 0-based numbering in a future CUDA release.
All of them? ;)
AFAIK, the current idea is to provide a switch that would allow
choosing
Hi Ewan,
On Mon, Jun 8, 2015 at 2:39 AM, Roche Ewan ewan.ro...@epfl.ch wrote:
> The underlying problem seems to be that SLURM isn’t correctly setting
> CUDA_VISIBLE_DEVICES to match the device allowed by the cgroup.
Slurm actually does the right thing. The real culprit here is the NVML.
So for
Hi Michael,
On Wed, May 20, 2015 at 7:37 AM, Michael Jennings m...@lbl.gov wrote:
> Unfortunately the demand for Docker is growing rapidly, largely due to
> papers such as this one: http://arxiv.org/pdf/1410.0846.pdf which
> tout Docker images as a prudent deliverable for research scientists
On Tue, May 19, 2015 at 9:28 AM, Michael Jennings m...@lbl.gov wrote:
> What Chris is asking for, I *think*, is what we're looking for as well --
> anyone who has figured out a way to allow users to execute jobs inside
> user-supplied (or at least user-specified) Docker containers. It would be
Hi David,
On Tue, May 19, 2015 at 1:40 PM, David Bigagli da...@schedmd.com wrote:
> You can create a user inside a docker machine just like any other and then
> just ssh to it.
You can, but nothing forces you to. :)
I guess it's a matter of how much you trust your users, then.
Cheers,
--
Kilian
Hi Bjørn-Helge,
I think this was fixed in commit 837c360 [1], which is in 14.11.x versions.
[1]
https://github.com/SchedMD/slurm/commit/837c360f671142f36a434235a7c8488631e481de
Cheers,
--
Kilian
On Tue, Mar 17, 2015 at 6:50 AM, Bjørn-Helge Mevik
b.h.me...@usit.uio.no wrote:
> While testing
On Sat, Mar 14, 2015 at 6:26 AM, Jason Bacon jwba...@tds.net wrote:
> How about SLURM_ARRAY_MIN_TASK_ID, SLURM_ARRAY_MAX_TASK_ID and
> SLURM_ARRAY_NUM_TASKS?
That looks even better. I kind of overlooked what Moe pointed out,
that task IDs don't have to be contiguous. So max_id and num_tasks
On Fri, Mar 13, 2015 at 6:34 PM, Jason Bacon jwba...@tds.net wrote:
> by creating a new variable such as SLURM_ARRAY_NUM_TASKS?
On Fri, Mar 13, 2015 at 7:17 PM, Moe Jette je...@schedmd.com wrote:
> How about this for a name?
> SLURM_ARRAY_MAX_TASK_ID
I like SLURM_ARRAY_NUM_TASKS better, it's more
Hi Jared,
On Wed, Jan 14, 2015 at 2:14 PM, Jared David Baker jared.ba...@uwyo.edu wrote:
> NodeName=loren[01-60] Name=gpu Type=k20x File=/dev/nvidia[0-3]
I don't think you can aggregate multiple GPUs on a single line (at
least that was the case in 14.03). So you would have to split it up
over 4
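The split-up version would look something like this (one line per device, reusing the names from the quoted line):

```
NodeName=loren[01-60] Name=gpu Type=k20x File=/dev/nvidia0
NodeName=loren[01-60] Name=gpu Type=k20x File=/dev/nvidia1
NodeName=loren[01-60] Name=gpu Type=k20x File=/dev/nvidia2
NodeName=loren[01-60] Name=gpu Type=k20x File=/dev/nvidia3
```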
Hi Sefa,
This is not currently implemented, but it's being discussed here:
http://bugs.schedmd.com/show_bug.cgi?id=858
Cheers,
--
Kilian
Hi Sefa,
On Thu, Nov 6, 2014 at 1:04 AM, Sefa Arslan sefa.ars...@tubitak.gov.tr wrote:
> In order to update the node list of a partition, I use a command like
> scontrol update partition=part1 nodes=node[A-B,F,K-H,...]
> Is there a way to add/remove a single node from a partition without
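As far as I know the whole node list has to be re-specified. A command sketch (partition and node names as in the quoted message; 'newnode' is hypothetical):

```
$ scontrol show partition part1 | grep Nodes=
$ scontrol update partition=part1 nodes=node[A-B,F,K-H],newnode
```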
Hi Ian,
That doesn't answer your question about prolog scripts, but for that
sort of check, we use NHC
(http://warewulf.lbl.gov/trac/wiki/Node%20Health%20Check). It
integrates very well with Slurm and provides all sorts of ready-to-use
checks.
Cheers,
--
Kilian
Hi,
On Tue, Sep 23, 2014 at 7:18 AM, Yann Sagon ysa...@gmail.com wrote:
> To lower the problem of having to deal with two queues, you can specify the
> two queues like that when you submit a job : --partition=queue1,queue2 and
> the first one that is free is selected.
You can even define an env
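A sketch of that form of submission (partition names from the quoted message; job.sh is hypothetical):

```
$ sbatch --partition=queue1,queue2 job.sh
$ srun --partition=queue1,queue2 --pty bash
```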
Hi Brian,
On Fri, Sep 12, 2014 at 9:58 AM, Brian B for...@gmail.com wrote:
> I am trying to set up my Slurm setup to use MySQL. I installed via
> pre-compiled RPMs but I am having trouble actually loading the plugin as it
> isn’t being installed from the RPMs I currently have. I see documentation
Hi Jesse,
Just a shot in the dark, but do you use task affinity or CPU binding?
Cheers,
--
Kilian
On Tue, Aug 12, 2014 at 6:56 PM, Trey Dockendorf treyd...@tamu.edu wrote:
> This is slurm-14.03.6 running CentOS 6.5 kernel 2.6.32-431.23.3.el6.x86_64
Exact same behavior here, same Slurm version and same kernel.
Cheers,
--
Kilian
On Wed, Aug 13, 2014 at 10:00 AM, David Bigagli da...@schedmd.com wrote:
> For some reason at the first attempt rmdir(2) returns EBUSY.
Would writing to memory.force_empty before calling rmdir() help?
See
http://lxr.free-electrons.com/source/Documentation/cgroups/memory.txt?v=2.6.32#L269
On Tue, Jul 8, 2014 at 1:54 PM, je...@schedmd.com wrote:
> It looks like two places needed trivial changes (changed from 8 to 16 bit
> fields). See:
> https://github.com/SchedMD/slurm/commit/9bd58eec0b511fb7e054ca87dcb0a65938253f5f
Thanks!
I guess this will be in 14.03.5?
Cheers,
--
Kilian
Hi all,
I'm currently seeing a behavior I don't understand using MaxNodesPerUser in
a QoS setting.
The sacctmgr(1) documentation states:
SPECIFICATIONS FOR QOS
[...]
MaxNodesPerUser
Maximum number of nodes each user is able to use.
I'm using the following QOS:
# sacctmgr