Hi,
We use cgroups to limit usage to 3 cores and 4G of memory on the head nodes. I
didn't set it up myself, but I have pasted our documentation below.
Those limits, 3 cores and 4G, are global to all non-root users, I think, as they
apply to a group. We obviously don't do this on the nodes.
We also monitor system utilisation with Nagios and will intervene if needed.
Before we had cgroups in place I very occasionally had to do a pkill -u baduser
and lock the user out temporarily until the situation was explained to them.
If you have any questions, please let me know.
Sean
===== How to configure Cgroups locally on a system =====
This is a step-by-step guide to configuring Cgroups locally on a system.
==== 1. Install the libraries to control Cgroups and to enforce it via PAM ====
<code bash>$ yum install libcgroup libcgroup-pam</code>
==== 2. Load the Cgroups module on PAM ====
<code bash>
$ echo "session required pam_cgroup.so" >> /etc/pam.d/login
$ echo "session required pam_cgroup.so" >> /etc/pam.d/password-auth-ac
$ echo "session required pam_cgroup.so" >> /etc/pam.d/system-auth-ac
</code>
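Running the commands above a second time would append a duplicate line to each PAM file. A minimal sketch of an idempotent variant is shown below. The helper name ''append_once'' is hypothetical (it is not part of libcgroup or PAM), and the example writes to a scratch file so it can be tried without touching the live configuration.

```shell
#!/bin/sh
# append_once LINE FILE: append LINE to FILE unless an identical line exists.
# Hypothetical helper for illustration; not part of libcgroup or PAM.
append_once() {
    line=$1
    file=$2
    # -q quiet, -x match the whole line, -F treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Try it on a scratch file rather than a live PAM file.
scratch=$(mktemp)
append_once "session required pam_cgroup.so" "$scratch"
append_once "session required pam_cgroup.so" "$scratch"   # no-op the second time
cat "$scratch"
```

To apply it for real, the scratch path would be replaced by each of the three PAM files, run as root.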
==== 3. Set the Cgroup limits and associate them to a user group ====
Add to ''/etc/cgconfig.conf'':
<code bash>
# cpuset.mems may be different in different architectures, e.g. in Parsons there
# is only "0".
group users {
    memory {
        memory.limit_in_bytes="4G";
        memory.memsw.limit_in_bytes="6G";
    }
    cpuset {
        cpuset.mems="0-1";
        cpuset.cpus="0-2";
    }
}
</code>
Note that the ''memory.memsw.limit_in_bytes'' limit is //inclusive// of the
''memory.limit_in_bytes'' limit. So in the above example, the limit is 4GB of
RAM followed by a further 2GB of swap. See:
[[https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html#proc-cpu_and_mem]]
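The arithmetic behind the two memory values can be sketched as follows: the swap available to the cgroup is the difference between the combined limit and the RAM limit. The variable names are illustrative only.

```shell
#!/bin/sh
# Values from the cgconfig.conf example above, in the same "G" units.
mem_limit_g=4      # memory.limit_in_bytes
memsw_limit_g=6    # memory.memsw.limit_in_bytes (RAM + swap combined)

# Swap headroom is whatever the combined limit leaves after RAM.
swap_g=$((memsw_limit_g - mem_limit_g))
echo "RAM: ${mem_limit_g}G, swap: ${swap_g}G, combined: ${memsw_limit_g}G"
```

Because the combined limit includes the RAM limit, ''memory.memsw.limit_in_bytes'' should not be set lower than ''memory.limit_in_bytes''.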
Exempt root from the limits and place every other user in the ''users'' cgroup:
<code bash>
$ echo "root * /">>/etc/cgrules.conf
$ echo "* cpuset,memory users">>/etc/cgrules.conf
</code>
Note also that the ''users'' cgroup defined above is inclusive of **all** users
(the * wildcard). So it is not a 4GB RAM limit for one user, it is a 4GB RAM
limit in total for every non-root user.
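One way to check that the rules took effect is to look at which cgroup a login shell has landed in. The sketch below assumes the cgroup v1 layout of ''/proc/self/cgroup'' (lines of the form ''ID:controllers:path''). The function name ''cgroup_of'' is hypothetical.

```shell
#!/bin/sh
# cgroup_of CONTROLLER [FILE]: print the cgroup path for CONTROLLER.
# FILE defaults to /proc/self/cgroup. Assumes cgroup v1 lines of the form
# "ID:controller[,controller...]:path". Hypothetical helper for illustration.
cgroup_of() {
    awk -F: -v c="$1" '
        {
            n = split($2, ctl, ",")
            for (i = 1; i <= n; i++)
                if (ctl[i] == c) { print $3; exit }
        }
    ' "${2:-/proc/self/cgroup}"
}

# Report the current shell's cgroups, if the file is available.
if [ -r /proc/self/cgroup ]; then
    cgroup_of memory
    cgroup_of cpuset
fi
```

For a non-root user who logged in after the configuration was applied, both calls would be expected to print ''/users''. For root they would be expected to print ''/''.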
==== 4. Start the daemon that manages Cgroups configuration and set it to start on boot ====
<code bash>
$ /etc/init.d/cgconfig start
$ chkconfig cgconfig on
</code>
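Once the daemon is running, the limits can be read back from the cgroup filesystem. The sketch below assumes the controllers are mounted under ''/cgroup/<controller>''. That path is an assumption about the ''mount'' section of ''cgconfig.conf'' on this system, so the root directory is passed as a parameter. The function name ''show_limits'' is hypothetical.

```shell
#!/bin/sh
# show_limits [ROOT]: print the limits of the "users" cgroup.
# ROOT defaults to /cgroup, which is an assumption about where the
# controllers are mounted; adjust it to match the local mount section.
show_limits() {
    root=${1:-/cgroup}
    for f in memory/users/memory.limit_in_bytes \
             memory/users/memory.memsw.limit_in_bytes \
             cpuset/users/cpuset.cpus \
             cpuset/users/cpuset.mems; do
        if [ -r "$root/$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$root/$f")"
        else
            printf '%s: not found under %s\n' "$f" "$root"
        fi
    done
}

show_limits
```

If every entry is reported as not found, the controllers are probably mounted elsewhere or ''cgconfig'' did not start cleanly.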
On Thu, Feb 09, 2017 at 05:12:12AM -0800, John Hearns wrote:
> Does anyone have a good suggestion for this problem?
>
> On a cluster I am implementing I noticed a user is running a code on 16
> cores, on one of the login nodes, outside the batch system.
> What are the accepted techniques to combat this? Other than applying a LART,
> if you all know what this means.
>
> On one system I set up a year or so ago I was asked to implement a shell
> timeout, so if the user was idle for 30 minutes they would be logged out.
> This actually is quite easy to set up as I recall.
> I guess in this case as the user is connected to a running process then they
> are not 'idle'.
--
Sean McGrath M.Sc
Systems Administrator
Trinity Centre for High Performance and Research Computing
Trinity College Dublin
[email protected]
https://www.tcd.ie/
https://www.tchpc.tcd.ie/
+353 (0) 1 896 3725