Hi Jake,

You can do 'qhost -F h_vmem,mem_free,virtual_free', that might be a useful view for you.

In general, I've only ever used one of the three complexes above.

Which one(s) do you have defined for the execution hosts? e.g.
qconf -se compute-1-7

h_vmem will map to 'ulimit -v'
mem_free just tracks 'free'
virtual_free I'm not sure about; I'd have to search the mailing list archives.
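
As a quick sanity check (assuming h_vmem is actually enforced on your hosts, which I believe happens via setrlimit), you can watch that mapping from inside a job; check_vmem.sh here is just a scratch script:

echo 'ulimit -v' > check_vmem.sh
qsub -cwd -l h_vmem=1G check_vmem.sh
# the job's output file should report roughly 1048576 (kB),
# i.e. the 1G request applied as the process's virtual memory limit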

I recommend you pick just one of those three complexes. If you want to set a hard memory limit for jobs, use h_vmem. If you only want to give the scheduler a hint, use mem_free: it uses the instantaneous mem_free level at scheduling time (more precisely, the lower of the consumable mem_free, if you have that defined, and the actual current mem_free).
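
In qsub terms the difference is just which complex you request. A minimal sketch (job.sh stands in for any job script):

# hard, kernel-enforced cap -- allocations beyond 24G fail and the job dies
qsub -l h_vmem=24G job.sh

# scheduling hint only -- consulted at dispatch time, nothing enforced at runtime
qsub -l mem_free=24G job.sh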

What is the compelling reason to use virtual_free? I guess it includes swap?

Regards,
Alex


On 12/7/12 2:31 AM, Jake Carroll wrote:
Hi all.

We've got some memory allocation/memory contention issues our users are
complaining about. Many are saying they can't get their jobs to run
because of memory resource issues.

An example:

scheduling info:
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-3.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-12.local" because it offers only hc:virtual_free=12.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-6.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-10.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-11.local" because it offers only hc:virtual_free=2.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-9.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-1.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-3.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-0.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-4.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-14.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-8.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-6.local" because it offers only hc:virtual_free=5.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-2.local" because it offers only hc:virtual_free=12.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-5.local" because it offers only hc:virtual_free=4.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-3.local" because it offers only hc:virtual_free=5.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-7.local" because it offers only hc:virtual_free=12.000G
             (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-5.local" because it offers only hc:virtual_free=5.000G

Another example, from a user whose job is running successfully:

hard resource_list:         mem_free=100G
mail_list:                  xyz
notify:                     FALSE
job_name:                   mlmassoc_GRMi
stdout_path_list:           NONE:NONE:/commented.out
jobshare:                   0
env_list:
script_file:                /commented.out
usage    1:                 cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519, vmem=3.379G, maxvmem=4.124G

If I look at the qhost outputs:

[root@cluster ~]# qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-0-0             lx26-amd64     24  6.49   94.6G    5.5G     0.0     0.0
compute-0-1             lx26-amd64     24 10.71   94.6G    5.9G     0.0     0.0
compute-0-10            lx26-amd64     24  6.09   94.6G    5.1G     0.0     0.0
compute-0-11            lx26-amd64     24  6.10   94.6G    5.5G     0.0     0.0
compute-0-12            lx26-amd64     24  6.12   94.6G    8.1G     0.0     0.0
compute-0-13            lx26-amd64     24  8.41   94.6G    5.3G     0.0     0.0
compute-0-14            lx26-amd64     24  7.32   94.6G    7.6G     0.0     0.0
compute-0-15            lx26-amd64     24 10.42   94.6G    6.3G     0.0     0.0
compute-0-2             lx26-amd64     24  9.67   94.6G    5.5G     0.0     0.0
compute-0-3             lx26-amd64     24  7.17   94.6G    5.5G     0.0     0.0
compute-0-4             lx26-amd64     24  6.13   94.6G    4.0G  996.2M   27.5M
compute-0-5             lx26-amd64     24  6.36   94.6G    5.4G     0.0     0.0
compute-0-6             lx26-amd64     24  6.35   94.6G    6.4G     0.0     0.0
compute-0-7             lx26-amd64     24  8.08   94.6G    6.0G     0.0     0.0
compute-0-8             lx26-amd64     24  6.12   94.6G    8.4G     0.0     0.0
compute-0-9             lx26-amd64     24  6.12   94.6G    5.9G     0.0     0.0
compute-1-0             lx26-amd64     80 30.13  378.7G   36.2G     0.0     0.0
compute-1-1             lx26-amd64     80 28.93  378.7G   21.8G  996.2M  168.1M
compute-1-2             lx26-amd64     80 29.84  378.7G   23.2G  996.2M   46.8M
compute-1-3             lx26-amd64     80 27.03  378.7G   24.4G  996.2M   39.3M
compute-1-4             lx26-amd64     80 28.05  378.7G   23.2G  996.2M  122.0M
compute-1-5             lx26-amd64     80 27.47  378.7G   23.5G  996.2M  161.4M
compute-1-6             lx26-amd64     80 25.07  378.7G   25.6G  996.2M   91.5M
compute-1-7             lx26-amd64     80 26.98  378.7G   22.8G  996.2M  115.9M
compute-2-0             lx26-amd64     32 11.03   47.2G    2.6G 1000.0M   67.1M
compute-2-1             lx26-amd64     32  8.35   47.2G    3.7G 1000.0M   11.4M
compute-2-2             lx26-amd64     32 10.10   47.2G    1.7G 1000.0M  126.5M
compute-2-3             lx26-amd64     32  7.02   47.2G    3.4G 1000.0M   11.3M

So, it would seem to me we've got _plenty_ of actual resources free, but
our virtual_free complex seems to be doing something funny/misguided?

I'm worried that our virtual_free complex might actually be doing more
harm than good here.

Here is an example of some qhost -F output on two different node types:

compute-2-3             lx26-amd64     32  7.00   47.2G    3.4G 1000.0M   11.3M
    hl:arch=lx26-amd64
    hl:num_proc=32.000000
    hl:mem_total=47.187G
    hl:swap_total=999.992M
    hl:virtual_total=48.163G
    hl:load_avg=7.000000
    hl:load_short=7.000000
    hl:load_medium=7.000000
    hl:load_long=7.060000
    hl:mem_free=43.788G
    hl:swap_free=988.703M
    hc:virtual_free=4.000G
    hl:mem_used=3.398G
    hl:swap_used=11.289M
    hl:virtual_used=3.409G
    hl:cpu=6.400000
    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
    hl:m_socket=2.000000
    hl:m_core=16.000000
    hl:np_load_avg=0.218750
    hl:np_load_short=0.218750
    hl:np_load_medium=0.218750
    hl:np_load_long=0.220625

compute-1-7             lx26-amd64     80 27.83  378.7G   22.8G  996.2M  115.9M
    hl:arch=lx26-amd64
    hl:num_proc=80.000000
    hl:mem_total=378.652G
    hl:swap_total=996.207M
    hl:virtual_total=379.624G
    hl:load_avg=27.830000
    hl:load_short=29.050000
    hl:load_medium=27.830000
    hl:load_long=27.360000
    hl:mem_free=355.814G
    hl:swap_free=880.266M
    hc:virtual_free=13.000G
    hl:mem_used=22.838G
    hl:swap_used=115.941M
    hl:virtual_used=22.951G
    hl:cpu=33.600000
    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
    hl:m_socket=4.000000
    hl:m_core=40.000000
    hl:np_load_avg=0.347875
    hl:np_load_short=0.363125
    hl:np_load_medium=0.347875
    hl:np_load_long=0.342000

Our virtual_free complex is defined as a MEMORY type with relation <=; it is
requestable, set as a consumable, and has a default of 2.
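
For reference, I believe the qconf -mc entry looks roughly like this (the shortcut and urgency columns are from memory and may differ; I'm assuming the default of 2 means 2G):

#name           shortcut  type    relop  requestable  consumable  default  urgency
virtual_free    vf        MEMORY  <=     YES          YES         2G       0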

I guess what I'd like to aim for is some sane memory management: a way of
setting up some "rules" for my users so they can allocate sensible amounts
of RAM that reflect what the hosts/execution nodes are actually capable of.

I've got (unfortunately!) three types of nodes in the one queue. One
type has 384GB of RAM. One type has 96GB of RAM. One type has 48GB of RAM.
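
If we keep a consumable like virtual_free, I presume it should at least be sized to each host class, something like this (hypothetical values matching the nominal RAM):

qconf -mattr exechost complex_values virtual_free=384G compute-1-0
qconf -mattr exechost complex_values virtual_free=96G compute-0-0
qconf -mattr exechost complex_values virtual_free=48G compute-2-0
# ...repeated for every host in each class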

Are my users just expecting too much? Are there some caps/resource limits I
should put in place to manage expectations, or should I simply invest in
some "big memory" nodes for really large jobs and make a separate
highmem.q for such tasks? You'll see above that some users have tried
asking for 100GB via the mem_free complex.
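
(If a separate queue is the way to go, I assume it's roughly a matter of:

qconf -ahgrp @highmem    # put the 384GB nodes into a host group
qconf -aq highmem.q      # then set "hostlist @highmem" in the queue definition
)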

Thoughts/experiences/ideas?

Thanks for your time, all.

--JC
