On the cluster I've been managing we had a solution with pam_script that chose two random cores for each user and bound their session to those cores (a second session from the same user reused the same cores). I think it's quite a good solution, since 1) a user is not able to take all of the server's resources, and 2) the probability that two users are bound to the same resources is reduced (so one user is less likely to affect the others). It can be tuned by changing the two cores to whatever number is optimal for the login node's resources and the number of users logged in.
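To make the idea concrete, here is a minimal Python sketch of the selection logic. It is not the original pam_script hook: the names `pick_cores` and `bind_session` and the in-memory `assignments` dict are hypothetical, and a real hook would have to persist the mapping between logins.

```python
import os
import random


def pick_cores(user, assignments, n_cores, k=2, rng=random):
    """Return the cores assigned to `user`, picking k random ones on first login.

    `assignments` is a hypothetical user -> cores mapping; a later session
    from the same user gets the cores chosen for the first one.
    """
    if user not in assignments:
        k = min(k, n_cores)  # never ask for more cores than the node has
        assignments[user] = sorted(rng.sample(range(n_cores), k))
    return assignments[user]


def bind_session(pid, cores):
    """Pin process `pid` to `cores` (Linux only).

    Processes started from it afterwards inherit the same affinity mask.
    """
    os.sched_setaffinity(pid, cores)
```

A session hook could call `bind_session(os.getpid(), pick_cores(user, assignments, os.cpu_count()))` before starting the user's shell. Raising `k` is the tuning knob mentioned above.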
In addition to this we had a simple cron job that notified the admins when a user process had used more than 2 minutes of CPU time and the difference between its real time and its CPU time was small.

cheers,
Marcin

2017-02-09 20:01 GMT+01:00 Ryan Novosielski:

> I have used ulimits in the past to limit users to 768MB of RAM per
> process. This seemed to be enough to run anything they were actually
> supposed to be running. I would use cgroups on a more modern (this was
> RHEL5).
>
> A related question: we used cgroups on a CentOS 6 system, but then
> switched our accounts to private user groups as opposed to a more general
> "hpcusers" group. It doesn't seem like there is a way to use cgroups on a
> secondary group, or any other easy way to do this. The setup was that the
> main user group was limited to "most" of the machine and users were limited
> to some percentage of the most. With users not sharing any group, this
> stopped working. Anyone know of an alternative (I guess doing it based on
> excluding system users and applying limits to everyone else, but this seems
> hamfisted).
>
> --
> Ryan Novosielski
> Sr. Technologist - 973/972.0922 (2x0922)
> Office of Advanced Research Computing - MSB C630, Newark
> Rutgers, the State University of NJ - RBHS Campus
>
> On Feb 9, 2017, at 13:05, Ole Holm Nielsen wrote:
>
> We limit the cpu times in /etc/security/limits.conf so that user processes
> have a maximum of 10 minutes. It doesn't eliminate the problem completely,
> but it's fairly effective on users who misunderstood the role of login
> nodes.
>
> On Thu, Feb 9, 2017 at 6:38 PM +0100, "Jason Bacon" wrote:
>
>> We simply make it impossible to run computational software on the head
>> nodes.
>>
>> 1. No scientific software packages are installed on the local disk.
>> 2. Our NFS-mounted application directory is mounted with noexec.
>>
>> Regards,
>>
>> Jason
>>
>> On 02/09/17 07:09, John Hearns wrote:
>> >
>> > Does anyone have a good suggestion for this problem?
>> >
>> > On a cluster I am implementing I noticed a user is running a code on
>> > 16 cores, on one of the login nodes, outside the batch system.
>> >
>> > What are the accepted techniques to combat this? Other than applying a
>> > LART, if you all know what this means.
>> >
>> > On one system I set up a year or so ago I was asked to implement a
>> > shell timeout, so if the user was idle for 30 minutes they would be
>> > logged out.
>> >
>> > This actually is quite easy to set up as I recall.
>> >
>> > I guess in this case as the user is connected to a running process
>> > then they are not ‘idle’.
>> >
>> > Any views or opinions presented in this email are solely those of the
>> > author and do not necessarily represent those of the company.
>> > Employees of XMA Ltd are expressly required not to make defamatory
>> > statements and not to infringe or authorise any infringement of
>> > copyright or any other legal right by email communications. Any such
>> > communication is contrary to company policy and outside the scope of
>> > the employment of the individual concerned. The company will not
>> > accept any liability in respect of such communication, and the
>> > employee responsible will be personally liable for any damages or
>> > other liability arising. XMA Limited is registered in England and
>> > Wales (registered no. 2051703). Registered Office: Wilford Industrial
>> > Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP
>>
>> --
>> Earth is a beta site.
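
For the cron check mentioned at the top of this message (more than 2 minutes of CPU time, with real time close to CPU time), a rough Python sketch of the filtering step could look like the following. It is not the original job. The `Proc` record, the function names and the 0.8 ratio used to decide that the difference is "small" are all assumptions, and the input is assumed to be whitespace-separated lines of user, pid, elapsed seconds, CPU seconds and command.

```python
from typing import NamedTuple


class Proc(NamedTuple):
    user: str
    pid: int
    elapsed: int  # wall-clock seconds since the process started
    cpu: int      # accumulated CPU seconds
    command: str


def parse_ps_line(line):
    """Parse one 'user pid elapsed cpu command' line into a Proc."""
    user, pid, elapsed, cpu, command = line.split(None, 4)
    return Proc(user, int(pid), int(elapsed), int(cpu), command.strip())


def is_suspect(p, min_cpu=120, min_ratio=0.8):
    """True if the process used over min_cpu seconds of CPU and was busy
    for at least min_ratio of its lifetime (the ratio is an assumed
    stand-in for 'real time and CPU time are close')."""
    return p.cpu > min_cpu and p.cpu >= min_ratio * p.elapsed


def suspects(ps_output, **thresholds):
    """Return the processes in ps_output that look like compute jobs."""
    procs = (parse_ps_line(l) for l in ps_output.splitlines() if l.strip())
    return [p for p in procs if is_suspect(p, **thresholds)]
```

If your ps supports the `etimes` and `cputimes` columns, the input could come from something like `ps -eo user=,pid=,etimes=,cputimes=,comm=`. The cron job would then mail the list returned by `suspects()` to the admins, after skipping system accounts.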
