Hi all.
We've got some memory allocation/contention issues that our users are complaining
about. Many are saying they can't get their jobs to run because their memory
resource requests can't be satisfied.
An example:
scheduling info:
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-12.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-6.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-10.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-11.local" because it offers only hc:virtual_free=2.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-9.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-1.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-3.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-0.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-4.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-14.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-8.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-6.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-2-2.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-5.local" because it offers only hc:virtual_free=4.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-3.local" because it offers only hc:virtual_free=5.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-0-7.local" because it offers only hc:virtual_free=12.000G
(-l h_vmem=24G,virtual_free=24G) cannot run at host
"compute-1-5.local" because it offers only hc:virtual_free=5.000G
Another example, from a user whose job is running successfully:
hard resource_list: mem_free=100G
mail_list: xyz
notify: FALSE
job_name: mlmassoc_GRMi
stdout_path_list: NONE:NONE:/commented.out
jobshare: 0
env_list:
script_file: /commented.out
usage 1: cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519,
vmem=3.379G, maxvmem=4.124G
If I look at the qhost output:
[root@cluster ~]# qhost
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global - - - - - - -
compute-0-0 lx26-amd64 24 6.49 94.6G 5.5G 0.0 0.0
compute-0-1 lx26-amd64 24 10.71 94.6G 5.9G 0.0 0.0
compute-0-10 lx26-amd64 24 6.09 94.6G 5.1G 0.0 0.0
compute-0-11 lx26-amd64 24 6.10 94.6G 5.5G 0.0 0.0
compute-0-12 lx26-amd64 24 6.12 94.6G 8.1G 0.0 0.0
compute-0-13 lx26-amd64 24 8.41 94.6G 5.3G 0.0 0.0
compute-0-14 lx26-amd64 24 7.32 94.6G 7.6G 0.0 0.0
compute-0-15 lx26-amd64 24 10.42 94.6G 6.3G 0.0 0.0
compute-0-2 lx26-amd64 24 9.67 94.6G 5.5G 0.0 0.0
compute-0-3 lx26-amd64 24 7.17 94.6G 5.5G 0.0 0.0
compute-0-4 lx26-amd64 24 6.13 94.6G 4.0G 996.2M 27.5M
compute-0-5 lx26-amd64 24 6.36 94.6G 5.4G 0.0 0.0
compute-0-6 lx26-amd64 24 6.35 94.6G 6.4G 0.0 0.0
compute-0-7 lx26-amd64 24 8.08 94.6G 6.0G 0.0 0.0
compute-0-8 lx26-amd64 24 6.12 94.6G 8.4G 0.0 0.0
compute-0-9 lx26-amd64 24 6.12 94.6G 5.9G 0.0 0.0
compute-1-0 lx26-amd64 80 30.13 378.7G 36.2G 0.0 0.0
compute-1-1 lx26-amd64 80 28.93 378.7G 21.8G 996.2M 168.1M
compute-1-2 lx26-amd64 80 29.84 378.7G 23.2G 996.2M 46.8M
compute-1-3 lx26-amd64 80 27.03 378.7G 24.4G 996.2M 39.3M
compute-1-4 lx26-amd64 80 28.05 378.7G 23.2G 996.2M 122.0M
compute-1-5 lx26-amd64 80 27.47 378.7G 23.5G 996.2M 161.4M
compute-1-6 lx26-amd64 80 25.07 378.7G 25.6G 996.2M 91.5M
compute-1-7 lx26-amd64 80 26.98 378.7G 22.8G 996.2M 115.9M
compute-2-0 lx26-amd64 32 11.03 47.2G 2.6G 1000.0M 67.1M
compute-2-1 lx26-amd64 32 8.35 47.2G 3.7G 1000.0M 11.4M
compute-2-2 lx26-amd64 32 10.10 47.2G 1.7G 1000.0M 126.5M
compute-2-3 lx26-amd64 32 7.02 47.2G 3.4G 1000.0M 11.3M
So it would seem to me that we've got plenty of actual resources free, but our
virtual_free complex seems to be doing something funny/misguided. I'm worried it
might actually be doing more harm than good here.
Here is an example of some qhost -F output on two different node types:
compute-2-3 lx26-amd64 32 7.00 47.2G 3.4G 1000.0M 11.3M
hl:arch=lx26-amd64
hl:num_proc=32.000000
hl:mem_total=47.187G
hl:swap_total=999.992M
hl:virtual_total=48.163G
hl:load_avg=7.000000
hl:load_short=7.000000
hl:load_medium=7.000000
hl:load_long=7.060000
hl:mem_free=43.788G
hl:swap_free=988.703M
hc:virtual_free=4.000G
hl:mem_used=3.398G
hl:swap_used=11.289M
hl:virtual_used=3.409G
hl:cpu=6.400000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=2.000000
hl:m_core=16.000000
hl:np_load_avg=0.218750
hl:np_load_short=0.218750
hl:np_load_medium=0.218750
hl:np_load_long=0.220625
compute-1-7 lx26-amd64 80 27.83 378.7G 22.8G 996.2M 115.9M
hl:arch=lx26-amd64
hl:num_proc=80.000000
hl:mem_total=378.652G
hl:swap_total=996.207M
hl:virtual_total=379.624G
hl:load_avg=27.830000
hl:load_short=29.050000
hl:load_medium=27.830000
hl:load_long=27.360000
hl:mem_free=355.814G
hl:swap_free=880.266M
hc:virtual_free=13.000G
hl:mem_used=22.838G
hl:swap_used=115.941M
hl:virtual_used=22.951G
hl:cpu=33.600000
hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
hl:m_socket=4.000000
hl:m_core=40.000000
hl:np_load_avg=0.347875
hl:np_load_short=0.363125
hl:np_load_medium=0.347875
hl:np_load_long=0.342000
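As I understand it, that hc:virtual_free figure is the remaining headroom on the
host-level consumable, which is seeded from complex_values on each exec host and
decremented by the virtual_free each running job requests (or by the default, for
jobs that don't request it). I can check and adjust it per host with something
like:

qconf -se compute-2-3    (show the exec host config, including complex_values)
qconf -me compute-2-3    (edit it, e.g. complex_values  virtual_free=47G)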
Our virtual_free complex is designated as a memory complex, relation <=, is
requestable, is set as a consumable, and has a default of 2.
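Spelled out the way qconf -sc shows it, I believe the entry looks roughly like
this (urgency column omitted):

#name          shortcut  type    relop  requestable  consumable  default
virtual_free   vf        MEMORY  <=     YES          YES         2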
I guess what I'd like to aim for is some sane memory management and a way of
setting up some "rules" for my users, so they can request sensible amounts of
RAM that reflect what the hosts/execution nodes are actually capable of.
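I've been wondering whether a resource quota set (qconf -arqs) is the right tool
for that kind of rule. A rough sketch of what I have in mind, capping how much
virtual_free any single user can claim on any one host (the 40G figure is just a
placeholder):

{
   name         vf_per_user_per_host
   description  "Cap the virtual_free a single user can consume on one host"
   enabled      TRUE
   limit        users {*} hosts {*} to virtual_free=40G
}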
I've got (unfortunately!) three types of nodes in the one queue: one type has
384GB of RAM, one has 96GB, and one has 48GB.
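If nothing else, I'd probably start by grouping them into hostgroups so the
types can be addressed separately, e.g. (names are just examples, and I'd create
@96gb and @384gb the same way):

qconf -ahgrp @48gb
   group_name  @48gb
   hostlist    compute-2-0 compute-2-1 compute-2-2 compute-2-3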
Are my users just expecting too much? Are there caps/resource limits I should
put in place to manage expectations, or should I simply invest in some "big
memory" nodes for really large jobs and make a separate highmem.q for such tasks?
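If we did go the separate-queue route, I imagine highmem.q (added with
qconf -aq highmem.q) would look roughly like this, with the hostgroup and values
only illustrative:

qname            highmem.q
hostlist         @384gb
slots            80
complex_values   virtual_free=370G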
You'll see above that some users have already tried asking for 100GB via the
mem_free complex.
Thoughts/experiences/ideas?
Thanks for your time, all.
--JC
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users