Hi all.
We've got some memory allocation/contention issues that our users are complaining
about. Many are saying they can't get their jobs to run because their memory
resource requests can't be satisfied.
An example:
scheduling info:
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-12.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-6.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-10.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-11.local" because it offers only hc:virtual_free=2.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-9.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-1.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-0.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-4.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-14.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-8.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-6.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-2.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-5.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-3.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-7.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-5.local" because it offers only hc:virtual_free=5.000G
Another example, from a user whose job is running successfully:
hard resource_list: mem_free=100G
mail_list: xyz
notify: FALSE
job_name: mlmassoc_GRMi
stdout_path_list: NONE:NONE:/commented.out
jobshare: 0
env_list:
script_file: /commented.out
usage 1: cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519,
vmem=3.379G, maxvmem=4.124G
If I look at the qhost output:
[root@cluster ~]# qhost
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global - - - - - - -
compute-0-0 lx26-amd64 24 6.49 94.6G 5.5G 0.0 0.0
compute-0-1 lx26-amd64 24 10.71 94.6G 5.9G 0.0 0.0
compute-0-10 lx26-amd64 24 6.09 94.6G 5.1G 0.0 0.0
compute-0-11 lx26-amd64 24 6.10 94.6G 5.5G 0.0 0.0
compute-0-12 lx26-amd64 24 6.12 94.6G 8.1G 0.0 0.0
compute-0-13 lx26-amd64 24 8.41 94.6G 5.3G 0.0 0.0
compute-0-14 lx26-amd64 24 7.32 94.6G 7.6G 0.0 0.0
compute-0-15 lx26-amd64 24 10.42 94.6G 6.3G 0.0 0.0
compute-0-2 lx26-amd64 24 9.67 94.6G 5.5G 0.0 0.0
compute-0-3 lx26-amd64 24 7.17 94.6G 5.5G 0.0 0.0
compute-0-4 lx26-amd64 24 6.13 94.6G 4.0G 996.2M 27.5M
compute-0-5 lx26-amd64 24 6.36 94.6G 5.4G 0.0 0.0
compute-0-6 lx26-amd64 24 6.35 94.6G 6.4G 0.0 0.0
compute-0-7 lx26-amd64 24 8.08 94.6G 6.0G 0.0 0.0
compute-0-8 lx26-amd64 24 6.12 94.6G 8.4G 0.0 0.0
compute-0-9 lx26-amd64 24 6.12 94.6G 5.9G 0.0 0.0
compute-1-0 lx26-amd64 80 30.13 378.7G 36.2G 0.0 0.0
compute-1-1 lx26-amd64 80 28.93 378.7G 21.8G 996.2M 168.1M
compute-1-2 lx26-amd64 80 29.84 378.7G 23.2G 996.2M 46.8M
compute-1-3 lx26-amd64 80 27.03 378.7G 24.4G 996.2M 39.3M
compute-1-4 lx26-amd64 80 28.05 378.7G 23.2G 996.2M 122.0M
compute-1-5 lx26-amd64 80 27.47 378.7G 23.5G 996.2M 161.4M
compute-1-6 lx26-amd64 80 25.07 378.7G 25.6G 996.2M 91.5M
compute-1-7 lx26-amd64 80 26.98 378.7G 22.8G 996.2M 115.9M
compute-2-0 lx26-amd64 32 11.03 47.2G 2.6G 1000.0M 67.1M
compute-2-1 lx26-amd64 32 8.35 47.2G 3.7G 1000.0M 11.4M
compute-2-2 lx26-amd64 32 10.10 47.2G 1.7G 1000.0M 126.5M
compute-2-3 lx26-amd64 32 7.02 47.2G 3.4G 1000.0M 11.3M
So it would seem to me that we've got plenty of actual resources free, but our
virtual_free complex seems to be doing something funny/misguided. I'm worried it
might actually be doing more harm than good here.
Here is an example of some qhost -F output on two different node types:
compute-2-3 lx26-amd64 32 7.00 47.2G 3.4G 1000.0M 11.3M
hl:arch=lx26-amd64
hl:num_proc=32.000000
hl:mem_total=47.187G
hl:swap_total=999.992M
hl:virtual_total=48.163G
hl:load_avg=7.000000
hl:load_short=7.000000
hl:load_medium=7.000000
hl:load_long=7.060000
hl:mem_free=43.788G
hl:swap_free=988.703M
hc:virtual_free=4.000G
hl:mem_used=3.398G
hl:swap_used=11.289M
hl:virtual_used=3.409G
hl:cpu=6.400000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=2.000000
hl:m_core=16.000000
hl:np_load_avg=0.218750
hl:np_load_short=0.218750
hl:np_load_medium=0.218750
hl:np_load_long=0.220625
compute-1-7 lx26-amd64 80 27.83 378.7G 22.8G 996.2M 115.9M
hl:arch=lx26-amd64
hl:num_proc=80.000000
hl:mem_total=378.652G
hl:swap_total=996.207M
hl:virtual_total=379.624G
hl:load_avg=27.830000
hl:load_short=29.050000
hl:load_medium=27.830000
hl:load_long=27.360000
hl:mem_free=355.814G
hl:swap_free=880.266M
hc:virtual_free=13.000G
hl:mem_used=22.838G
hl:swap_used=115.941M
hl:virtual_used=22.951G
hl:cpu=33.600000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=4.000000
hl:m_core=40.000000
hl:np_load_avg=0.347875
hl:np_load_short=0.363125
hl:np_load_medium=0.347875
hl:np_load_long=0.342000
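As I understand it, that hc:virtual_free figure is the remaining headroom on the
host-level consumable, which is seeded from complex_values on each exec host and
decremented by the virtual_free each running job requests (or by the default, for
jobs that don't request it). I can check and adjust it per host with something
like:

qconf -se compute-2-3    (show the exec host config, including complex_values)
qconf -me compute-2-3    (edit it, e.g. complex_values  virtual_free=47G)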
Our virtual_free complex is designated as a memory complex, relation <=, is
requestable, is set as a consumable, and has a default of 2.
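Spelled out the way qconf -sc shows it, I believe the entry looks roughly like
this (urgency column omitted):

#name          shortcut  type    relop  requestable  consumable  default
virtual_free   vf        MEMORY  <=     YES          YES         2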
I guess what I'd like to aim for is some sane memory management and a way of
setting up some "rules" for my users, so they can request sensible amounts of
RAM that reflect what the hosts/execution nodes are actually capable of.
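I've been wondering whether a resource quota set (qconf -arqs) is the right tool
for that kind of rule. A rough sketch of what I have in mind, capping how much
virtual_free any single user can claim on any one host (the 40G figure is just a
placeholder):

{
   name         vf_per_user_per_host
   description  "Cap the virtual_free a single user can consume on one host"
   enabled      TRUE
   limit        users {*} hosts {*} to virtual_free=40G
}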
I've got (unfortunately!) three types of nodes in the one queue: one type has
384GB of RAM, one has 96GB, and one has 48GB.
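If nothing else, I'd probably start by grouping them into hostgroups so the
types can be addressed separately, e.g. (names are just examples, and I'd create
@96gb and @384gb the same way):

qconf -ahgrp @48gb
   group_name  @48gb
   hostlist    compute-2-0 compute-2-1 compute-2-2 compute-2-3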
Are my users just expecting too much? Are there caps/resource limits I should
put in place to manage expectations, or should I simply invest in some "big
memory" nodes for really large jobs and make a separate highmem.q for such tasks?
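If we did go the separate-queue route, I imagine highmem.q (added with
qconf -aq highmem.q) would look roughly like this, with the hostgroup and values
only illustrative:

qname            highmem.q
hostlist         @384gb
slots            80
complex_values   virtual_free=370G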
You'll see above that some users have already tried asking for 100GB via the
mem_free complex.
Thoughts/experiences/ideas?
Thanks for your time, all.
--JC
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users