Hi Jake,
You can do 'qhost -F h_vmem,mem_free,virtual_free'; that might be a
useful view for you.
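That gives you the standard qhost line for each host, followed by one
line per requested complex, along these lines (values lifted from your
output further down; h_vmem will only show up where it's actually
defined):

  compute-2-3             lx26-amd64     32  7.00   47.2G    3.4G 1000.0M   11.3M
      hl:mem_free=43.788G
      hc:virtual_free=4.000G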
In general, I've only ever used one of the three complexes above.
Which one(s) do you have defined for the execution hosts? e.g.
qconf -se compute-1-7
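The line to look at in that output is complex_values; a host with a
virtual_free consumable defined would show something like (the value
here is just an example):

  complex_values        virtual_free=48G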
h_vmem will map to 'ulimit -v'.
mem_free just tracks 'free'.
virtual_free I'm not sure about; I'd have to search the mailing list
archives.
I recommend you just use one of those three complexes. If you want to
set a hard memory limit for jobs, use h_vmem. If you just want to give
the scheduler a hint, use mem_free; it will use the instantaneous
mem_free level during job scheduling (well, the lower of the consumable
mem_free, if you have that defined, and the actual current mem_free).
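In qsub terms (job.sh is just a placeholder):

  qsub -l h_vmem=24G job.sh      # hard limit, enforced via 'ulimit -v'
  qsub -l mem_free=24G job.sh    # hint: host must report >= 24G free at dispatch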
What is the compelling reason to use virtual_free? I guess it includes
swap? (Your qhost -F output below bears that out: virtual_total =
mem_total + swap_total.)
Regards,
Alex
On 12/7/12 2:31 AM, Jake Carroll wrote:
Hi all.
We've got some memory allocation/contention issues that our users are
complaining about. Many say they can't get their jobs to run because of
memory resource issues.
An example:
scheduling info:
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-12.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-6.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-10.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-11.local" because it offers only hc:virtual_free=2.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-9.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-1.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-0.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-4.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-14.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-8.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-6.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-2.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-5.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-3.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-7.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-5.local" because it offers only hc:virtual_free=5.000G
Another example, of a user whose job is running successfully:
hard resource_list: mem_free=100G
mail_list: xyz
notify: FALSE
job_name: mlmassoc_GRMi
stdout_path_list: NONE:NONE:/commented.out
jobshare: 0
env_list:
script_file: /commented.out
usage 1: cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519, vmem=3.379G, maxvmem=4.124G
If I look at the qhost output:
[root@cluster ~]# qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-0-0             lx26-amd64     24  6.49   94.6G    5.5G     0.0     0.0
compute-0-1             lx26-amd64     24 10.71   94.6G    5.9G     0.0     0.0
compute-0-10            lx26-amd64     24  6.09   94.6G    5.1G     0.0     0.0
compute-0-11            lx26-amd64     24  6.10   94.6G    5.5G     0.0     0.0
compute-0-12            lx26-amd64     24  6.12   94.6G    8.1G     0.0     0.0
compute-0-13            lx26-amd64     24  8.41   94.6G    5.3G     0.0     0.0
compute-0-14            lx26-amd64     24  7.32   94.6G    7.6G     0.0     0.0
compute-0-15            lx26-amd64     24 10.42   94.6G    6.3G     0.0     0.0
compute-0-2             lx26-amd64     24  9.67   94.6G    5.5G     0.0     0.0
compute-0-3             lx26-amd64     24  7.17   94.6G    5.5G     0.0     0.0
compute-0-4             lx26-amd64     24  6.13   94.6G    4.0G  996.2M   27.5M
compute-0-5             lx26-amd64     24  6.36   94.6G    5.4G     0.0     0.0
compute-0-6             lx26-amd64     24  6.35   94.6G    6.4G     0.0     0.0
compute-0-7             lx26-amd64     24  8.08   94.6G    6.0G     0.0     0.0
compute-0-8             lx26-amd64     24  6.12   94.6G    8.4G     0.0     0.0
compute-0-9             lx26-amd64     24  6.12   94.6G    5.9G     0.0     0.0
compute-1-0             lx26-amd64     80 30.13  378.7G   36.2G     0.0     0.0
compute-1-1             lx26-amd64     80 28.93  378.7G   21.8G  996.2M  168.1M
compute-1-2             lx26-amd64     80 29.84  378.7G   23.2G  996.2M   46.8M
compute-1-3             lx26-amd64     80 27.03  378.7G   24.4G  996.2M   39.3M
compute-1-4             lx26-amd64     80 28.05  378.7G   23.2G  996.2M  122.0M
compute-1-5             lx26-amd64     80 27.47  378.7G   23.5G  996.2M  161.4M
compute-1-6             lx26-amd64     80 25.07  378.7G   25.6G  996.2M   91.5M
compute-1-7             lx26-amd64     80 26.98  378.7G   22.8G  996.2M  115.9M
compute-2-0             lx26-amd64     32 11.03   47.2G    2.6G 1000.0M   67.1M
compute-2-1             lx26-amd64     32  8.35   47.2G    3.7G 1000.0M   11.4M
compute-2-2             lx26-amd64     32 10.10   47.2G    1.7G 1000.0M  126.5M
compute-2-3             lx26-amd64     32  7.02   47.2G    3.4G 1000.0M   11.3M
So, it would seem to me we've got _plenty_ of actual resources free,
but our virtual_free complex seems to be doing something
funny/misguided? I'm worried that our virtual_free complex might
actually be doing more harm than good here.
Here is an example of some qhost -F output on two different node types:
compute-2-3             lx26-amd64     32  7.00   47.2G    3.4G 1000.0M   11.3M
hl:arch=lx26-amd64
hl:num_proc=32.000000
hl:mem_total=47.187G
hl:swap_total=999.992M
hl:virtual_total=48.163G
hl:load_avg=7.000000
hl:load_short=7.000000
hl:load_medium=7.000000
hl:load_long=7.060000
hl:mem_free=43.788G
hl:swap_free=988.703M
hc:virtual_free=4.000G
hl:mem_used=3.398G
hl:swap_used=11.289M
hl:virtual_used=3.409G
hl:cpu=6.400000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=2.000000
hl:m_core=16.000000
hl:np_load_avg=0.218750
hl:np_load_short=0.218750
hl:np_load_medium=0.218750
hl:np_load_long=0.220625
compute-1-7             lx26-amd64     80 27.83  378.7G   22.8G  996.2M  115.9M
hl:arch=lx26-amd64
hl:num_proc=80.000000
hl:mem_total=378.652G
hl:swap_total=996.207M
hl:virtual_total=379.624G
hl:load_avg=27.830000
hl:load_short=29.050000
hl:load_medium=27.830000
hl:load_long=27.360000
hl:mem_free=355.814G
hl:swap_free=880.266M
hc:virtual_free=13.000G
hl:mem_used=22.838G
hl:swap_used=115.941M
hl:virtual_used=22.951G
hl:cpu=33.600000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=4.000000
hl:m_core=40.000000
hl:np_load_avg=0.347875
hl:np_load_short=0.363125
hl:np_load_medium=0.347875
hl:np_load_long=0.342000
Our virtual_free complex is designated as a memory complex with
relation <=, is requestable, is set as a consumable, and has a default
of 2.
I guess what I'd like to aim for is some sane memory management and a
way of setting up some "rules" for my users so they can allocate
sensible amounts of RAM that actually reflect what the hosts/execution
nodes are capable of.
I've got (unfortunately!) three types of nodes in the one queue. One
type has 384GB of RAM. One type has 96GB of RAM. One type has 48GB of RAM.
Are my users just expecting too much? Are there some caps/resource
limits I should put in place to manage expectations, or should I simply
invest in some "big memory" nodes for really large jobs and make a
separate highmem.q for such tasks (rough sketch below)? You'll see
above that some users have asked for 100GB via the mem_free complex.
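Something like this is what I have in mind, if it helps frame the
question (names and values are just illustrative):

  # put the 384GB hosts in their own hostgroup, then a queue on top of it
  qconf -ahgrp @highmem     # hostlist: compute-1-0 ... compute-1-7
  qconf -aq highmem.q       # hostlist: @highmem

  # and/or cap per-user requests cluster-wide with a resource quota set
  qconf -arqs               # e.g.  limit users {*} to h_vmem=96G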
Thoughts/experiences/ideas?
Thanks for your time, all.
--JC