slurm.conf:
TaskPlugin=task/cgroup
SelectTypeParameters=CR_Core_Memory
JobAcctGatherFrequency=15
JobAcctGatherType=jobacct_gather/linux
cgroup.conf:
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
ConstrainRAMSpace=yes
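A quick way to double-check that the RAM constraint is actually applied (just a sketch, assuming cgroup v1 and the usual /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid> hierarchy of task/cgroup; the exact path is an assumption and may differ on your system):

# Ask for 1000 MB and read back the limit slurmd set on the job's memory cgroup (in bytes)
srun --mem=1000 bash -c \
  'cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/memory.limit_in_bytes'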
Thank you!
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
process 18956 (R) total-vm:53184568kB, anon-rss:45767588kB, file-rss:5447628kB
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
2015-12-15 10:50 GMT+01:00 Bjørn-Helge Mevik :
>
> Felip Moll writes:
>
> > On one
1154 372 80 0 gzip
nov 25 15:47:05 cn23 kernel: Memory cgroup out of memory: Kill process 15011 (ocean_pp.bash) score 0 or sacrifice child
nov 25 15:47:05 cn23 kernel: Killed process 32554 (gzip) total-vm:4616kB, anon-rss:216kB, file-rss:1272kB
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
2015-12-18 15:09 GMT+01:00 Bjørn-Helge Mevik :
>
> Carlos Fenoy writes:
>
> > Barbara, I don't think that is the issue here. The killer is the OOM not
…this plugin and have the same problem.
Regards,
Felip M
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
…slurmctld daemons.
Regards,
Felip M
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
2016-02-03 20:25 GMT+01:00 Cooper, Trevor :
>
> Jeff,
>
> You might want to start with the Slurm overview page[1] and quick start
> admin guide …
…do you suggest I do to set up the environment with this info?
He needs this information to configure the heap, etc., at run time.
Regards,
Felip M
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
1. To do this, the values of the tres_alloc column currently have to be parsed externally to MySQL.
Regards,
Felip M
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
A very ugly workaround for having a cpus_alloc column would be:
create view custom_job_table as
  select *, substring_index(substring_index(tres_alloc, ',', 1), '1=', -1) as cpus_alloc
  from _job_table;
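tres_alloc holds comma-separated id=count pairs and TRES id 1 is the CPU count, which is what the nested substring_index() calls extract. A usage sketch (the database name slurm_acct_db and the id_job column are the usual slurmdbd defaults; check your own schema):

# List a few jobs with the raw tres_alloc string and the derived cpus_alloc value
mysql slurm_acct_db -e "select id_job, tres_alloc, cpus_alloc from custom_job_table limit 5;"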
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
…to lowprio, the wallclock is switched to 7 days.
In my opinion, the timelimit should not be changed when updating a job's QOS unless explicitly told to.
Is this the correct behaviour in Slurm 15.08.10?
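One thing worth checking (a guess on my part, not something confirmed in this thread) is whether the lowprio QOS itself carries a MaxWall of 7 days, which would at least explain where that value comes from:

# Show the wallclock limit attached to the QOS (the name "lowprio" is taken from the message above)
sacctmgr show qos lowprio format=Name,MaxWall,Flags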
--
Felip Moll Marquès
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
SwitchName=cmc1 Nodes=nva[1-9]
SwitchName=cmc2 Nodes=nva[10-18]
SwitchName=cmc3 Nodes=nva[19-27]
SwitchName=cmc4 Nodes=nva[28-36]
SwitchName=cmc5 Nodes=nva[37-45]
SwitchName=cmc6 Nodes=nva[46-54]
SwitchName=cmc7 Nodes=nva[55-61]
SwitchName=ibswfdr Nodes=nvb[1-39]
SwitchName=troncal Switches=cmc[1-7],ibsw
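A listing like this is normally topology.conf, and it only has an effect when the tree topology plugin is enabled in slurm.conf. A quick check (the config path is the usual default and may differ on your installation):

grep -i '^TopologyPlugin' /etc/slurm/slurm.conf   # expected: TopologyPlugin=topology/tree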
Do you have any kind of firewall in your network?
I would suspect a problem with dates, but since you tested munge -n we can probably discard that.
Can you anyway run pdsh -w compute-* date | dshbak -c ?
Can you show the slurmd log output from the nodes?
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
It is not possible, at least in a supported way.
The first requirement in the admin guide says:
1. Make sure the clocks, users and groups (UIDs and GIDs) are synchronized across the cluster.
From:
https://slurm.schedmd.com/quickstart_admin.html
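A quick way to verify that requirement across the nodes (just a sketch; the node range and the user name are placeholders):

# All nodes should report (nearly) the same epoch second and the same UID
pdsh -w node[01-10] 'date +%s; id -u someuser' | dshbak -c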
*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - lip...@gmail.com
WebPage - http://lipix.ciutadella.es
I do it in the epilog.
When entering the epilog for the last job on the node, I drop caches, clean /dev/shm, etc.
Since the epilog runs as root there is no need for sudo. You could do the same in the prolog; just check whether there are other jobs running on the node.
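For what it's worth, a minimal sketch of such an epilog (not my actual script; the "last job on the node" test and the cleanup paths are assumptions):

#!/bin/bash
# Epilog sketch: only clean up when no other jobs remain on this node.
# The finishing job may still be listed by squeue, hence the "<= 1" test.
if [ "$(squeue -h -w "$(hostname -s)" -t RUNNING,COMPLETING | wc -l)" -le 1 ]; then
    sync
    echo 3 > /proc/sys/vm/drop_caches   # drop page cache, dentries and inodes
    find /dev/shm -mindepth 1 -delete   # remove leftover shared-memory files
fi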
br
Felip M
On 5 Apr 2017
6) Current job memory allocation for nodes
I am currently looking for options in sstat, sinfo, scontrol... but I can't find how to see the total reserved memory for one particular node.
In sview, in the "nodes" tab, you can see how many CPUs are used/free for each node, but not how much memory.
Thanks!
This is my scontrol show node:
NodeName=pez015 Arch=x86_64 CoresPerSocket=6
CPUAlloc=12 CPUErr=0 CPUTot=12 Features=(null)
Gres=(null)
NodeAddr=pez015 NodeHostName=pez015
OS=Linux RealMemory=48128 Sockets=2
State=ALLOCATED ThreadsPerCore=1 TmpDisk=61440 Weight=1
BootTime=2013-03-
This will give you the total compute cores allocated (#3):
scontrol show node | grep CPUAlloc | cut -d" " -f 4 | sed 's/CPUAlloc=//g' | awk '{total = total + $1} END {print total}'
But of course that's not very useful... I would also love a good summary tool.
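An analogous one-liner for memory, assuming a Slurm version whose scontrol show node output includes AllocMem= (it is not in the output pasted above), summing allocated memory in MB across all nodes:

scontrol show node | grep -o 'AllocMem=[0-9]*' | cut -d= -f2 | awk '{total += $1} END {print total " MB allocated"}'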
2013/4/26 Mario Kadastik
>
> The thread has so