We actually run CentOS 6 as well, and haven't seen this problem, though
maybe our users haven't done anything as untoward as yours. We do have a
bunch of bioinformatics code (including Java) so I thought we would have
seen the worst cases.

On Thu, Aug 29, 2019 at 10:50:27AM -0400, Mike Serkov wrote:
> Load average indeed. The thing is that if we have a parallel process bound
> to one core, the kernel scheduler has to constantly switch its threads
> between the running and sleeping states and perform context switches, which
> creates some overhead on the system itself. Imagine you have a 64-CPU box,
> each core runs such a job, and every job spawns 64 threads (which is a usual
> case, as many tools just make a system call to identify the number of CPUs
> they can use by default). In both cases, with affinity forced and without,
> it is not a good situation. With affinity enforced, in extreme cases we had
> nodes simply freeze, especially when heavy I/O was also involved (probably
> because of the overhead on the kernel scheduler). That was on RHEL6; maybe
> on modern kernels it is much better. All I want to say is that unlike memory
> limits with cgroups, where you are actually sure the process can't allocate
> more, with cpusets it is a bit different. Users can still run as many
> parallel processes as they want. They are limited to a number of physical
> CPUs, but it may still affect the node and other jobs.
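> 
> As an illustration only (OMP_NUM_THREADS and MKL_NUM_THREADS assume
> OpenMP/MKL-style tools, "my_tool" is just a placeholder, and NSLOTS is the
> slot count Grid Engine exports to the job script), the usual job-side
> mitigation is to pin the thread count to the granted slots instead of
> letting the tool autodetect every core:
> 
>   # in the job script: use the granted slots, not the detected core count
>   export OMP_NUM_THREADS=$NSLOTS
>   export MKL_NUM_THREADS=$NSLOTS
>   my_tool --threads "$NSLOTS" input.dat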
> 
> Best regards,
> Mikhail Serkov 
> 
> > On Aug 29, 2019, at 10:20 AM, Skylar Thompson <skyl...@uw.edu> wrote:
> > 
> > Load average gets high if the job spawns more processes/threads than
> > allocated CPUs, but we haven't seen any problem with node instability. We
> > did have to remove np_load_avg from load_thresholds, though, to keep our
> > users from DoS'ing the cluster...
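> > 
> > For reference (the queue name and values below are just an example, not
> > necessarily our actual configuration), that is a queue-level change along
> > these lines:
> > 
> >   qconf -mq all.q
> >   # default is something like:  load_thresholds  np_load_avg=1.75
> >   # change it to ignore load average, e.g.:
> >   load_thresholds       NONE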
> > 
> >> On Thu, Aug 29, 2019 at 05:27:36AM -0400, Mike Serkov wrote:
> >> Also, something to keep in mind: cgroups will not solve this issue 
> >> completely. It is just affinity enforcement. If the job spawns multiple 
> >> threads and they are all active, the load average will still grow, along 
> >> with some other side effects, regardless of the affinity setting. On big 
> >> SMP boxes it may actually cause more instability. In any case, jobs should 
> >> be configured to use exactly the number of threads they request, and this 
> >> should be monitored.
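> >> 
> >> One quick way to spot offenders (just a sketch; "someuser" is a
> >> placeholder, and nlwp is the per-process thread count reported by
> >> procps ps) is to compare a job's thread count with the slots it was
> >> granted:
> >> 
> >>   # thread count (NLWP) per process for a user, largest first
> >>   ps -u someuser -o pid,nlwp,comm | sort -k2,2nr | head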
> >> 
> >> Best regards,
> >> Mikhail Serkov 
> >> 
> >>> On Aug 29, 2019, at 4:16 AM, Ondrej Valousek 
> >>> <ondrej.valou...@adestotech.com> wrote:
> >>> 
> >>> Also a quick note: cgroups are the way to _enforce_ CPU affinity.
> >>> For the vast majority of jobs, I would say a simple taskset 
> >>> configuration (i.e. something like "-l binding linear") would do 
> >>> as well.
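> >>> 
> >>> For example (just a sketch; the exact binding syntax differs between
> >>> Grid Engine versions, and job.sh is a placeholder), a core-bound
> >>> four-slot submission could look like:
> >>> 
> >>>   qsub -pe smp 4 -binding linear:4 job.sh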
> >>> 
> >>> 
> >>> From: Dietmar Rieder <dietmar.rie...@i-med.ac.at> 
> >>> Sent: Thursday, August 29, 2019 9:37 AM
> >>> To: users@gridengine.org; Ondrej Valousek 
> >>> <ondrej.valou...@adestotech.com>; users <users@gridengine.org>
> >>> Subject: Re: [gridengine users] limit CPU/slot resource to the number of 
> >>> reserved slots
> >>> 
> >>> Great, thanks so much!
> >>> 
> >>> Dietmar
> >>> 
> >>> Am 29. August 2019 09:05:35 MESZ schrieb Ondrej Valousek 
> >>> <ondrej.valou...@adestotech.com>:
> >>> Nope,
> >>> SoGE (as of 8.1.9) supports cgroups without any code changes; just add 
> >>> "USE_CGROUPS=yes" to the execd parameter list to make the shepherd use 
> >>> the cgroup cpuset controller.
> >>> My patch only extends it to support systemd, and hence the possibility 
> >>> to hard-enforce memory/CPU limits, etc.
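> >>> 
> >>> Concretely (a sketch; whether this goes in the global or a per-host
> >>> configuration depends on your setup), that is an execd_params entry in
> >>> the cluster configuration:
> >>> 
> >>>   qconf -mconf
> >>>   # append to the existing execd_params line, e.g.
> >>>   execd_params   USE_CGROUPS=yes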
> >>> Hth,
> >>> Ondrej
> >>> 
> >>> From: Daniel Povey <dpo...@gmail.com> 
> >>> Sent: Monday, August 26, 2019 10:12 PM
> >>> To: Dietmar Rieder <dietmar.rie...@i-med.ac.at>; Ondrej Valousek 
> >>> <ondrej.valou...@adestotech.com>; users <users@gridengine.org>
> >>> Subject: Re: [gridengine users] limit CPU/slot resource to the number of 
> >>> reserved slots
> >>> 
> >>> I don't think it's supported in Son of GridEngine.  Ondrej Valousek 
> >>> (cc'd) described in the first thread here
> >>> http://arc.liv.ac.uk/pipermail/sge-discuss/2019-August/thread.html
> >>> how he was able to implement it, but it required code changes, i.e. you 
> >>> would need to figure out how to build and install SGE from source, which 
> >>> is a task in itself.
> >>> 
> >>> Dan
> >>> 
> >>> 
> >>> On Mon, Aug 26, 2019 at 12:46 PM Dietmar Rieder 
> >>> <dietmar.rie...@i-med.ac.at> wrote:
> >>> Hi,
> >>> 
> >>> thanks for your reply. This sounds promising.
> >>> We are using Son of Grid Engine though. Can you point me to the right
> >>> docs to get cgroups enabled on the exec host (CentOS 7)? I must admit I
> >>> have no experience with cgroups.
> >>> 
> >>> Thanks again
> >>>  Dietmar
> >>> 
> >>>> On 8/26/19 4:03 PM, Skylar Thompson wrote:
> >>>> At least for UGE, you will want to use the CPU set integration, which 
> >>>> will
> >>>> assign the job to a cgroup that has one CPU per requested slot. Once you
> >>>> have cgroups enabled in the exec host OS, you can then set these options 
> >>>> in
> >>>> sge_conf:
> >>>> 
> >>>> cgroup_path=/cgroup
> >>>> cpuset=1
> >>>> 
> >>>> You can use this mechanism to have the m_mem_free request enforced as 
> >>>> well.
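> >>>> 
> >>>> For the memory side, the setting would be along these lines (the
> >>>> m_mem_free_hard parameter name is from memory; check sge_conf(5) for
> >>>> your UGE release):
> >>>> 
> >>>> cgroup_path=/cgroup
> >>>> cpuset=1
> >>>> m_mem_free_hard=1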
> >>>> 
> >>>>> On Mon, Aug 26, 2019 at 02:15:22PM +0200, Dietmar Rieder wrote:
> >>>>> Hi,
> >>>>> 
> >>>>> Maybe this is a stupid question, but I'd like to limit the usable
> >>>>> number of cores to the number of slots that were reserved for a job.
> >>>>> 
> >>>>> We often see that people reserve 1 slot, e.g. "qsub -pe smp 1 [...]",
> >>>>> but their program then runs in parallel on multiple cores. How can
> >>>>> this be prevented? Is it possible to ensure that, having reserved only
> >>>>> one slot, a process cannot utilize more than that?
> >>>>> 
> >>>>> I was told that this should be possible in Slurm (which we don't have,
> >>>>> and to which we don't want to switch currently).
> >>>>> 
> >>>>> Thanks
> >>>>>  Dietmar
> >>>> 
> >>> 
> >>> 
> >>> -- 
> >>> _________________________________________
> >>> D i e t m a r  R i e d e r, Mag.Dr.
> >>> Innsbruck Medical University
> >>> Biocenter - Institute of Bioinformatics
> >>> Email: dietmar.rie...@i-med.ac.at
> >>> Web:   http://www.icbi.at
> > 
> > 
> > 

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
