[slurm-users] missing sysconfdir in the prefix when built manually
Hi, I have tried manually building both 19.05 and 18.08 and found there is no sysconfdir (prefix/etc) in the installation path, so I have to copy it over from the build folder. Is this normal? Thanks. Fred
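For context, a typical manual build along these lines would be something like the following sketch (version number and install paths are only examples, not the exact commands used above):

  cd slurm-19.05.x          # placeholder for the unpacked source directory
  ./configure --prefix=/opt/slurm --sysconfdir=/opt/slurm/etc
  make && make install
  # As described above, 'make install' does not appear to create or populate
  # the sysconfdir itself; the admin creates it and copies the config in:
  mkdir -p /opt/slurm/etc
  cp etc/slurm.conf.example /opt/slurm/etc/slurm.conf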
Re: [slurm-users] Using the manager as compute Node
On 8/5/19 8:00 AM, wodel youchi wrote:
> Do I have to declare it, for example with 10 CPUs and 32 GB of RAM to save the rest for the management, or will slurmctld take that in hand?

You will need to both declare it and use cgroups to enforce it, so that processes can't overrun that limit.

All the best, Chris

-- Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
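As a rough sketch of what that could look like, using the figures from the question (node name and values are examples to adapt, not a drop-in config):

  # slurm.conf: expose only part of the machine to jobs
  NodeName=headnode CPUs=10 RealMemory=32768 State=UNKNOWN
  # enforce the declared limits with cgroups
  ProctrackType=proctrack/cgroup
  TaskPlugin=task/cgroup

  # cgroup.conf
  ConstrainCores=yes
  ConstrainRAMSpace=yes

With something like this, job processes are confined by cgroups to the CPUs and memory Slurm has allocated, leaving the undeclared 10 CPUs / 32 GB for slurmctld and the rest of the OS.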
[slurm-users] Using the manager as compute Node
Hi, When using the slurm manager as a compute node, what is the best way to define this node so that its resources will not be exhausted? Suppose I have a slurm manager with 20 CPUs and 64 GB of RAM and I want to use it as a compute node without consuming all of its resources. Do I have to declare it, for example with 10 CPUs and 32 GB of RAM to save the rest for the management, or will slurmctld take that in hand? Regards.
[slurm-users] sacctmgr dump question - how can I dump entities other than cluster?
The documentation clearly states: "dump - Dump cluster data to the specified file. If the filename is not specified it uses clustername.cfg filename by default." However, the only entity sacctmgr dump seems to accept is a cluster. Glancing over the code at https://github.com/SchedMD/slurm/blob/master/src/sacctmgr/cluster_functions.c#L1006 it doesn't seem like sacctmgr will accept anything other than a cluster name either. How can I easily dump QOS rules to a file, in a way that would allow me to modify them and upload new QOS as required? BTW, I just noticed that "archive" is not in the 'commands' section of the sacctmgr man page, but is treated as a command in later sections of the man page.
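One possible workaround, until/unless dump supports other entities: export the QOS table in parsable form and re-apply any changes with explicit sacctmgr commands. A sketch, with example field names and values:

  # export current QOS definitions in machine-readable form
  sacctmgr --parsable2 show qos format=Name,Priority,MaxWall > qos.txt
  # after reviewing/editing, apply changes explicitly, e.g.
  sacctmgr add qos newqos
  sacctmgr modify qos where name=newqos set Priority=10 MaxWall=24:00:00

This is not a true dump/load round-trip like the cluster dump, just a way to keep QOS definitions in a reviewable text file.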
Re: [slurm-users] Slurm configuration
Hi. On 8/3/19 12:37 AM, Sistemas NLHPC wrote:
> Hi all, Currently we have two types of nodes, one with 192GB and another with 768GB of RAM. It is required that nodes with 768 GB are not allowed to execute tasks needing less than 192GB, to avoid underutilization of resources. This, because we have nodes that can fulfill the condition of executing tasks with 192GB or less. Is it possible to use some slurm configuration to solve this problem?

Easiest would be to use features/constraints. In slurm.conf add:

NodeName=DEFAULT RealMemory=196608 Features=192GB Weight=1
NodeName=... (list all nodes with 192GB)
NodeName=DEFAULT RealMemory=786432 Features=768GB Weight=2
NodeName=... (list all nodes with 768GB)

To run jobs only on nodes with 192GB, add the constraint in sbatch:

sbatch -C 192GB ...

To run jobs on all nodes, simply don't add the constraint to the sbatch line; due to the lower weight, jobs should prefer to start on the 192GB nodes.

> PD: All users can submit jobs on all nodes
> Thanks in advance
> Regards.
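If you go this route, you can sanity-check that the features and weights were picked up with something like:

  sinfo -o "%N %f %w %m"
  # %N = node list, %f = available features, %w = scheduling weight, %m = memory in MB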
Re: [slurm-users] Slurm configuration
Hi NLHPC employee,

Sistemas NLHPC writes:
> Hi all,
>
> Currently we have two types of nodes, one with 192GB and another with
> 768GB of RAM, it is required that in nodes of 768 GB it is not allowed
> to execute tasks with less than 192GB, to avoid underutilization of
> resources.
>
> This, because we have nodes that can fulfill the condition of
> executing tasks with 192GB or less.
> Is it possible to use some slurm configuration to solve this problem?
>
> PD: All users can submit jobs on all nodes

Bear in mind that you could have a situation in which the nodes for 192 GB or less are full and the 768 GB nodes are empty. If jobs which require less than 192 GB are then submitted, they will have to wait and you will have underutilisation of your resources. Rather than excluding the low-memory jobs completely from the high-memory nodes, you could weight the low-memory nodes such that jobs preferentially start there. You could also have a shorter timelimit for low-memory jobs on high-memory nodes.

In my experience it is best to encourage and assist users to estimate their memory requirements as accurately as possible. If users are requesting 192 GB and landing on the low-memory nodes, but only using 96 GB, then you also have resource underutilisation, and since you probably have more low-memory nodes than high-memory nodes, that might be a bigger problem.

Regards
Loris

--
Dr. Loris Bennett (Mr.) ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
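On the "help users estimate their memory" point, one simple habit is to compare what a job requested with what it actually used once it has finished, for example (job ID is a placeholder):

  sacct -j 123456 --format=JobID,ReqMem,MaxRSS,Elapsed,State
  # if MaxRSS is consistently far below ReqMem, the request can be lowered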