On Mon, Oct 7, 2013 at 9:25 AM, Edrisse Chermak <[email protected]> wrote:
>
> Dear Grid Engine developers and users,
>
> I would like to prevent jobs running when node memory is almost filled.
> Here is the typical situation I have:
>
> 'qhost':
> ================================================================
> HOSTNAME   ARCH       NCPU  LOAD   MEMTOT  MEMUSE
> ----------------------------------------------------------------
> global     -          -     -      -       -
> node1      linux-x64  64    12.00  52.4G   42.8G
> node2      linux-x64  64    24.00  52.4G   12.8G
> ===============================================================
>
> The memory on node1 is almost filled, so I would like the new job I'll
> launch to go on node2. (provided that the job I want to launch requires
> 2 CPUs and that I configured np_load_avg=0.80, so that CPU load doesn't
> matter here).
>

We have cases where we are constrained by memory rather than by CPU count.
We solved this with changes to the complex configuration. We made h_vmem
consumable, with a default tuned for our environment (with qconf -mc), and
then set per-node values for h_vmem equal to a couple of GB less than total
physical RAM (with qconf -me nodename or global).

We also aliased h_vmem to mem and got users to specify -l mem=xG on the
qsub command line when they need more than the default amount of memory.
Because the scheduler will not place a job on a node whose remaining h_vmem
is smaller than the request, this prevents us from overcommitting memory.

There may also be a way to accomplish this with RQS or by tweaking
load_formula.

Best,
Chris
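
For anyone who wants to try the consumable approach described above, here is a rough sketch of the steps. It is a dry run: it only prints the commands, so nothing on the cluster is changed. The 2G default, the 2 GB of headroom, the 8G request and the script name job.sh are assumptions made for illustration, not values from this thread. It also uses qconf -mattr as a non-interactive alternative to editing the host with qconf -me, and requests h_vmem by name rather than through the mem alias.

```shell
#!/bin/sh
# Dry-run sketch: prints Grid Engine commands instead of executing them.

# Step 1 (interactive): run `qconf -mc` and change the h_vmem line so that
# the "consumable" column is YES and "default" is a sensible per-job value.
# Assumed example (the 2G default is hypothetical):
#   #name   shortcut  type    relop  requestable  consumable  default  urgency
#   h_vmem  h_vmem    MEMORY  <=     YES          YES         2G       0

# Step 2: give each node a capacity slightly below its physical RAM.
TOTAL_GB=52      # MEMTOT from the qhost output above (52.4G), rounded down
HEADROOM_GB=2    # assumed reserve for the OS ("a couple of GB")
LIMIT="$((TOTAL_GB - HEADROOM_GB))G"

for node in node1 node2; do
    echo "qconf -mattr exec_host complex_values h_vmem=$LIMIT $node"
done

# Step 3: jobs that need more than the default request it explicitly.
# job.sh and 8G are placeholders.
echo "qsub -l h_vmem=8G job.sh"
```

With these numbers each node would advertise 50G of h_vmem. Using the qhost figures above, and assuming the memory in use on node1 came from jobs that requested it through h_vmem, node1 would have only a few GB left to hand out, so a new 8G request could only be placed on node2. Note that for parallel jobs a consumable set to YES is normally counted per slot, so the request should be sized accordingly.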
