Hi,

We use cgroups to limit usage to 3 cores and 4G of memory on the head nodes. I
didn't set it up myself, but I've copied and pasted our documentation below.

Those limits, 3 cores and 4G, apply to a group, so I believe they are global to
all non-root users combined. We obviously don't do this on the compute nodes.

We also monitor system utilisation with Nagios and will intervene if needed.
Before we had cgroups in place I very occasionally had to run a pkill -u baduser
and lock the offending user out temporarily until the situation had been
explained to them.

Any questions please let me know.

Sean



===== How to configure Cgroups locally on a system =====

This is a step-by-step guide to configuring Cgroups locally on a system.

==== 1. Install the libraries to control Cgroups and to enforce it via PAM ====

<code bash>$ yum install libcgroup libcgroup-pam</code>

==== 2. Load the Cgroups module on PAM ====

<code bash>
$ echo "session    required    pam_cgroup.so" >> /etc/pam.d/login
$ echo "session    required    pam_cgroup.so" >> /etc/pam.d/password-auth-ac
$ echo "session    required    pam_cgroup.so" >> /etc/pam.d/system-auth-ac
</code>
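The appends above are not idempotent: running them twice leaves duplicate PAM
lines. A minimal sketch of a guarded append (the helper name
''add_pam_cgroup'' is made up for illustration; the real targets are the files
under ''/etc/pam.d''):

<code bash>
# Append the pam_cgroup line only if the module is not already referenced.
# add_pam_cgroup is a hypothetical helper, not part of libcgroup.
add_pam_cgroup() {
  local file="$1"
  local line="session    required    pam_cgroup.so"
  grep -qF "pam_cgroup.so" "$file" || echo "$line" >> "$file"
}

# Demo against a temporary file instead of a live PAM config:
tmp=$(mktemp)
add_pam_cgroup "$tmp"
add_pam_cgroup "$tmp"           # second call is a no-op
grep -c pam_cgroup.so "$tmp"    # -> 1
rm -f "$tmp"
</code>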

==== 3. Set the Cgroup limits and associate them to a user group ====

Add the following to ''/etc/cgconfig.conf'':
<code bash>
# cpuset.mems may be different in different architectures, e.g. in Parsons there
# is only "0".
group users {
  memory {
    memory.limit_in_bytes="4G";
    memory.memsw.limit_in_bytes="6G";
  }
  cpuset {
    cpuset.mems="0-1";
    cpuset.cpus="0-2";
  }
}
</code>

Note that the ''memory.memsw.limit_in_bytes'' limit is //inclusive// of the
''memory.limit_in_bytes'' limit. So in the above example, the limit is 4 GB of
RAM followed by a further 2 GB of swap. See:

[[https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html#proc-cpu_and_mem]]
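A quick sanity check of that arithmetic, in plain shell, with the two limits
from the config above:

<code bash>
# memsw is RAM + swap combined, so the swap allowance is the difference:
mem=$((4 * 1024 * 1024 * 1024))      # memory.limit_in_bytes       = 4G
memsw=$((6 * 1024 * 1024 * 1024))    # memory.memsw.limit_in_bytes = 6G
echo $(( (memsw - mem) / 1024 / 1024 / 1024 ))   # swap allowance in GB -> 2
</code>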

Set no limit for root and set limits for every other individual user:

<code bash>
$ echo "root    *      /">>/etc/cgrules.conf
$ echo "*   cpuset,memory    users">>/etc/cgrules.conf
</code>

Note also that the ''users'' cgroup defined above covers **all** users
(the * wildcard). So it is not a 4 GB RAM limit per user; it is a 4 GB RAM
limit in total, shared by every non-root user.
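Rules in ''/etc/cgrules.conf'' are matched top to bottom and the first match
wins, so additional accounts can be exempted by listing them before the
wildcard. A sketch (''opsadmin'' is a hypothetical account, not from the
original config):

<code bash>
# /etc/cgrules.conf -- first matching rule is applied
root        *                 /
opsadmin    *                 /       # hypothetical extra exempt user
*           cpuset,memory     users
</code>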

==== 4. Start the daemon that manages Cgroups configuration and set it to start on boot ====

<code bash>
$ /etc/init.d/cgconfig start
$ chkconfig cgconfig on
</code>
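To confirm that ''cgconfig'' actually mounted the hierarchies, check
''/proc/mounts'' on the live system (e.g. ''grep cgroup /proc/mounts''). A
small sketch of what to look for, using a hypothetical sample line as it might
appear on a RHEL 6 box:

<code bash>
# Hypothetical /proc/mounts entry for the memory hierarchy:
sample='cgroup /cgroup/memory cgroup rw,relatime,memory 0 0'

# Field 3 is the filesystem type; field 2 is the mount point.
echo "$sample" | awk '$3 == "cgroup" { print $2 }'   # -> /cgroup/memory
</code>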





On Thu, Feb 09, 2017 at 05:12:12AM -0800, John Hearns wrote:

> Does anyone have a good suggestion for this problem?
> 
> On a cluster I am implementing I noticed a user is running a code on 16 
> cores, on one of the login nodes, outside the batch system.
> What are the accepted techniques to combat this? Other than applying a LART, 
> if you all know what this means.
> 
> On one system I set up a year or so ago I was asked to implement a shell 
> timeout, so if the user was idle for 30 minutes they would be logged out.
> This actually is quite easy to set up as I recall.
> I guess in this case as the user is connected to a running process then they 
> are not 'idle'.

-- 
Sean McGrath M.Sc

Systems Administrator
Trinity Centre for High Performance and Research Computing
Trinity College Dublin

[email protected]

https://www.tcd.ie/
https://www.tchpc.tcd.ie/

+353 (0) 1 896 3725
