The only thing that comes to mind is that your accounting database is down and slurmctld is storing all the accounting data in memory.

Quoting Mario Kadastik <[email protected]>:

>> Valgrind will slow slurm down by a lot. You'll probably want to run
>> this on some test system or during a test time.
>
> Can't ... this effect doesn't pop up fast. And a test system
> couldn't be run on such scale.
>
>> There are too many factors to say how large slurmctld should be, but
>> 1+ GB is probably too large.
>
> Well then I guess this is bad:
>
> [root@slurm-1 ~]# ps -eao pid,user,rss,cmd|grep slurm
> 21613 slurm 6735956 /usr/sbin/slurmctld
>
> it's already using 6.4GB of RSS...
>
> Mario Kadastik, PhD
> Researcher
>
> ---
> "Physics is like sex, sure it may have practical reasons, but
> that's not why we do it"
> -- Richard P. Feynman
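
For reference, the 6.4GB figure can be checked from the ps output quoted above. This is only a rough sketch: it assumes ps reports rss in KiB (as procps does on Linux) and that the line has the same four columns as the `-o pid,user,rss,cmd` format used there. The helper names parse_ps_rss and kib_to_gib are made up for illustration and are not part of Slurm.

```python
def parse_ps_rss(line: str) -> int:
    """Return the RSS column (assumed KiB) from a 'pid user rss cmd' line."""
    # Split into at most four fields so a command with spaces stays intact.
    pid, user, rss, cmd = line.split(None, 3)
    return int(rss)


def kib_to_gib(kib: int) -> float:
    """Convert KiB to GiB (1 GiB = 1024 * 1024 KiB)."""
    return kib / (1024 ** 2)


# Sample line copied from the quoted ps output.
sample = "21613 slurm 6735956 /usr/sbin/slurmctld"
rss_gib = kib_to_gib(parse_ps_rss(sample))
print(f"slurmctld RSS: {rss_gib:.2f} GiB")
```

Logging this value periodically would show whether the growth is steady or tied to particular events, without the slowdown that valgrind brings.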
