[slurm-users] missing sysconfigdir in the prefix when built manually

2019-08-05 Thread Fred Liu
Hi,

I have tried manually building both 19.05 and 18.08, and found that there is no
sysconfdir (prefix/etc) in the installation path, so I have to copy it over from
the build folder.
Is this normal?
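
For reference, a minimal sketch of the kind of build I mean (the paths below are
only placeholders). As far as I can tell, make install never creates or
populates the sysconfdir itself; the example config files stay under etc/ in the
source/build tree and have to be copied by hand:

  ./configure --prefix=/opt/slurm/19.05 --sysconfdir=/etc/slurm
  make && make install

  # nothing gets installed into sysconfdir, so seed it yourself
  mkdir -p /etc/slurm
  cp etc/slurm.conf.example /etc/slurm/slurm.conf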

Thanks.

Fred


Re: [slurm-users] Using the manager as compute Node

2019-08-05 Thread Christopher Samuel

On 8/5/19 8:00 AM, wodel youchi wrote:

Do I have to declare it, for example with 10 CPUs and 32 GB of RAM, to save
the rest for management, or will slurmctld handle that itself?


You will need both to declare it that way and to use cgroups to enforce the
limits, so that job processes can't overrun them.
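
A rough sketch of what that could look like (the node name and the exact numbers
below are only placeholders, not a tested configuration):

  # slurm.conf: advertise only part of the machine to the scheduler
  NodeName=head CPUs=10 RealMemory=32768 State=UNKNOWN
  ProctrackType=proctrack/cgroup
  TaskPlugin=task/cgroup

  # cgroup.conf: have slurmd confine jobs to what they were allocated
  ConstrainCores=yes
  ConstrainRAMSpace=yes

The CoreSpecCount and MemSpecLimit node parameters may also be worth a look as a
way of setting cores and memory aside for the OS and the Slurm daemons.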


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



[slurm-users] Using the manager as compute Node

2019-08-05 Thread wodel youchi
Hi,

When using the slurm manager as a compute node, what is the best way to
define this node so that its resources will not be exhausted?

Suppose I have a slurm manager with 20 CPUs and 64 GB of RAM and I want to
use it as a compute node without consuming all its resources.

Do I have to declare it, for example with 10 CPUs and 32 GB of RAM, to save
the rest for management, or will slurmctld handle that itself?


Regards.


[slurm-users] sacctmgr dump question - how can I dump entities other than cluster?

2019-08-05 Thread Daniel Letai

The documentation clearly states:

    dump
        Dump cluster data to the specified file. If the filename is not
        specified it uses clustername.cfg filename by default.

However, the only entity sacctmgr dump seems to accept is a cluster name.
Glancing over the code at
https://github.com/SchedMD/slurm/blob/master/src/sacctmgr/cluster_functions.c#L1006
it doesn't seem like sacctmgr will accept anything else either.


How can I easily dump QOS rules to a file, in a way that would allow me to
modify them and upload new QOS as required?
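
A possible workaround (not a real dump/load round trip, just a sketch; the QOS
names and limits below are made up) would be to export the QOS table in parsable
form and re-apply any changes with modify/add:

  sacctmgr -P show qos > qos.txt

  # after editing, apply the changes explicitly, e.g.
  sacctmgr modify qos normal set MaxWall=24:00:00
  sacctmgr add qos longrun MaxWall=7-00:00:00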


BTW, just noticed that "archive" is not in the 'commands' section of the
sacctmgr man page, but is treated as a command in later sections of the man
page.




Re: [slurm-users] Slurm configuration

2019-08-05 Thread Daniel Letai

Hi.


On 8/3/19 12:37 AM, Sistemas NLHPC wrote:

> Hi all,
>
> Currently we have two types of nodes, one with 192GB and another
> with 768GB of RAM, it is required that in nodes of 768 GB it is
> not allowed to execute tasks with less than 192GB, to avoid
> underutilization of resources.
>
> This, because we have nodes that can fulfill the condition of
> executing tasks with 192GB or less.
>
> Is it possible to use some slurm configuration to solve this
> problem?

Easiest would be to use features/constraints. In slurm.conf add:

NodeName=DEFAULT RealMemory=196608 Features=192GB Weight=1
NodeName=... (list all nodes with 192GB)
NodeName=DEFAULT RealMemory=786432 Features=768GB Weight=2
NodeName=... (list all nodes with 768GB)

To run jobs only on nodes with 192GB, add the constraint in sbatch:

sbatch -C 192GB ...

To run jobs on all nodes, simply don't add the constraint to the sbatch line;
due to the lower weight, jobs should prefer to start on the 192GB nodes.
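
To double-check that the features and weights were picked up, something like
this should do (just a sketch; the output columns are nodelist, available
features and weight):

  sinfo -o "%N %f %w"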


  
> PD: All users can submit jobs on all nodes
>
> Thanks in advance
>
> Regards.




Re: [slurm-users] Slurm configuration

2019-08-05 Thread Loris Bennett
Hi NLHPC employee,

Sistemas NLHPC  writes:

> Hi all,
>
> Currently we have two types of nodes, one with 192GB and another with
> 768GB of RAM, it is required that in nodes of 768 GB it is not allowed
> to execute tasks with less than 192GB, to avoid underutilization of
> resources.
>
> This, because we have nodes that can fulfill the condition of
> executing tasks with 192GB or less.
> Is it possible to use some slurm configuration to solve this problem?
>
> PD: All users can submit jobs on all nodes

Bear in mind that you could have a situation in which the nodes for jobs of
192 GB or less are full while the 768 GB nodes are empty.  If jobs which
require less than 192 GB are then submitted, they will have to wait and you
will have underutilisation of your resources.

Rather than excluding the low-memory jobs completely from the
high-memory nodes, you could weight the low-memory nodes such that jobs
preferentially start there.  You could also set a shorter time limit for
low-memory jobs on the high-memory nodes.

In my experience it is best to encourage and assist users to estimate
their memory requirements as accurately as possible.  If users are requesting
192 GB and landing on the low-memory nodes, but only using 96 GB, then you
also have resource underutilisation, and you probably have more low-memory
nodes than high-memory nodes, so that might be a bigger problem.
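
For example (just a sketch; the job ID is a placeholder and the field list is
trimmed to the relevant columns), you can compare what a job requested with
what it actually used via sacct:

  sacct -j <jobid> --format=JobID,ReqMem,MaxRSS,Elapsed,State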

Regards

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de