On 07.12.2012 at 11:31, Jake Carroll wrote:

> Hi all.
>
> We've got some memory allocation/memory contention issues our users are
> complaining about. Many are saying they can't get their jobs to run
> because of memory resource issues.
>
> An example:
>
> scheduling info:
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-3.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-12.local" because it offers only hc:virtual_free=12.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-6.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-10.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-11.local" because it offers only hc:virtual_free=2.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-9.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-1.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-3.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-0.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-4.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-14.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-8.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-6.local" because it offers only hc:virtual_free=5.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-2.local" because it offers only hc:virtual_free=12.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-5.local" because it offers only hc:virtual_free=4.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-3.local" because it offers only hc:virtual_free=5.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-7.local" because it offers only hc:virtual_free=12.000G
>   (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-5.local" because it offers only hc:virtual_free=5.000G
>
> Another example, of a user whose job is successfully running:
>
>   hard resource_list:   mem_free=100G
>   mail_list:            xyz
>   notify:               FALSE
>   job_name:             mlmassoc_GRMi
>   stdout_path_list:     NONE:NONE:/commented.out
>   jobshare:             0
>   env_list:
>   script_file:          /commented.out
>   usage    1:           cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519, vmem=3.379G, maxvmem=4.124G
>
> If I look at the qhost outputs:
>
> [root@cluster ~]# qhost
> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global                  -               -     -       -       -       -       -
> compute-0-0             lx26-amd64     24  6.49   94.6G    5.5G     0.0     0.0
> compute-0-1             lx26-amd64     24 10.71   94.6G    5.9G     0.0     0.0
> compute-0-10            lx26-amd64     24  6.09   94.6G    5.1G     0.0     0.0
> compute-0-11            lx26-amd64     24  6.10   94.6G    5.5G     0.0     0.0
> compute-0-12            lx26-amd64     24  6.12   94.6G    8.1G     0.0     0.0
> compute-0-13            lx26-amd64     24  8.41   94.6G    5.3G     0.0     0.0
> compute-0-14            lx26-amd64     24  7.32   94.6G    7.6G     0.0     0.0
> compute-0-15            lx26-amd64     24 10.42   94.6G    6.3G     0.0     0.0
> compute-0-2             lx26-amd64     24  9.67   94.6G    5.5G     0.0     0.0
> compute-0-3             lx26-amd64     24  7.17   94.6G    5.5G     0.0     0.0
> compute-0-4             lx26-amd64     24  6.13   94.6G    4.0G  996.2M   27.5M
> compute-0-5             lx26-amd64     24  6.36   94.6G    5.4G     0.0     0.0
> compute-0-6             lx26-amd64     24  6.35   94.6G    6.4G     0.0     0.0
> compute-0-7             lx26-amd64     24  8.08   94.6G    6.0G     0.0     0.0
> compute-0-8             lx26-amd64     24  6.12   94.6G    8.4G     0.0     0.0
> compute-0-9             lx26-amd64     24  6.12   94.6G    5.9G     0.0     0.0
> compute-1-0             lx26-amd64     80 30.13  378.7G   36.2G     0.0     0.0
> compute-1-1             lx26-amd64     80 28.93  378.7G   21.8G  996.2M  168.1M
> compute-1-2             lx26-amd64     80 29.84  378.7G   23.2G  996.2M   46.8M
> compute-1-3             lx26-amd64     80 27.03  378.7G   24.4G  996.2M   39.3M
> compute-1-4             lx26-amd64     80 28.05  378.7G   23.2G  996.2M  122.0M
> compute-1-5             lx26-amd64     80 27.47  378.7G   23.5G  996.2M  161.4M
> compute-1-6             lx26-amd64     80 25.07  378.7G   25.6G  996.2M   91.5M
> compute-1-7             lx26-amd64     80 26.98  378.7G   22.8G  996.2M  115.9M
> compute-2-0             lx26-amd64     32 11.03   47.2G    2.6G 1000.0M   67.1M
> compute-2-1             lx26-amd64     32  8.35   47.2G    3.7G 1000.0M   11.4M
> compute-2-2             lx26-amd64     32 10.10   47.2G    1.7G 1000.0M  126.5M
> compute-2-3             lx26-amd64     32  7.02   47.2G    3.4G 1000.0M   11.3M
>
> So, it would seem to me we've got plenty of actual resources free, but our
> virtual_free complex seems to be doing something funny/misguided?
>
> I'm worried that our virtual_free complex might actually be doing more harm
> than good here.
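A consumable like virtual_free is debited by what each running job requests (or by the complex's default when nothing is requested), not by what the job actually uses, so hc:virtual_free can sit far below mem_free on a nearly idle host. A back-of-the-envelope sketch of the bookkeeping on compute-2-3, assuming its complex_values was seeded with roughly the 48G it shows as virtual_total (the 44G "booked" figure is illustrative, standing for whatever the running jobs there have requested in total):

   capacity set in complex_values     48G
   booked by running jobs            -44G   (illustrative sum of requests/defaults)
   --------------------------------------
   remaining hc:virtual_free           4G   <- what the scheduler reports,
                                               independent of hl:mem_free=43.788G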
Can you check for the jobs on one or two of these nodes what was requested
for virtual_free (`qstat -j <job_id>` should list it)? I assume the output of:

$ qhost -F virtual_free

shows only the output you sent above.

-- Reuti

> Here is an example of some qhost -F output on two different node types:
>
> compute-2-3             lx26-amd64     32  7.00   47.2G    3.4G 1000.0M   11.3M
>    hl:arch=lx26-amd64
>    hl:num_proc=32.000000
>    hl:mem_total=47.187G
>    hl:swap_total=999.992M
>    hl:virtual_total=48.163G
>    hl:load_avg=7.000000
>    hl:load_short=7.000000
>    hl:load_medium=7.000000
>    hl:load_long=7.060000
>    hl:mem_free=43.788G
>    hl:swap_free=988.703M
>    hc:virtual_free=4.000G
>    hl:mem_used=3.398G
>    hl:swap_used=11.289M
>    hl:virtual_used=3.409G
>    hl:cpu=6.400000
>    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
>    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
>    hl:m_socket=2.000000
>    hl:m_core=16.000000
>    hl:np_load_avg=0.218750
>    hl:np_load_short=0.218750
>    hl:np_load_medium=0.218750
>    hl:np_load_long=0.220625
>
> compute-1-7             lx26-amd64     80 27.83  378.7G   22.8G  996.2M  115.9M
>    hl:arch=lx26-amd64
>    hl:num_proc=80.000000
>    hl:mem_total=378.652G
>    hl:swap_total=996.207M
>    hl:virtual_total=379.624G
>    hl:load_avg=27.830000
>    hl:load_short=29.050000
>    hl:load_medium=27.830000
>    hl:load_long=27.360000
>    hl:mem_free=355.814G
>    hl:swap_free=880.266M
>    hc:virtual_free=13.000G
>    hl:mem_used=22.838G
>    hl:swap_used=115.941M
>    hl:virtual_used=22.951G
>    hl:cpu=33.600000
>    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
>    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
>    hl:m_socket=4.000000
>    hl:m_core=40.000000
>    hl:np_load_avg=0.347875
>    hl:np_load_short=0.363125
>    hl:np_load_medium=0.347875
>    hl:np_load_long=0.342000
>
> Our virtual_free complex is designated as a memory complex, relation <=,
> is requestable, is set as a consumable, and has a default of 2.
>
> I guess what I'd like to aim for is some sane memory management and a way
> of setting up some "rules" for my users so they can allocate sensible
> amounts of RAM that reflect what the hosts/execution nodes are really
> capable of.
>
> I've got (unfortunately!) three types of nodes in the one queue. One type
> has 384GB of RAM. One type has 96GB of RAM. One type has 48GB of RAM.
>
> Are my users just expecting too much? Are there some caps/resource limits
> I should put in place to manage expectations, or should I simply invest in
> some "big memory" nodes for really large jobs and make a separate
> highmem.q for such tasks? You'll see above that some users have tried
> asking for 100GB where the mem_free complex is used.
>
> Thoughts/experiences/ideas?
>
> Thanks for your time, all.
>
> --JC
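For what it's worth, a minimal sketch of one way to line the bookkeeping up with reality; the 2G default, the 47G capacity and the host chosen below are illustrative values, not a prescription. First, note that a bare default of "2" on a MEMORY-type complex means 2 bytes, not 2 gigabytes, so the default is worth giving an explicit unit:

$ qconf -mc
#name          shortcut  type    relop requestable consumable default  urgency
virtual_free   vf        MEMORY  <=    YES         YES        2G       0
# default=2G (illustrative): what a job is debited when it requests nothing

Then seed each execution host's consumable with roughly its physical RAM, so that each of the three node types advertises what it can really hold:

$ qconf -me compute-2-3
...
complex_values        virtual_free=47G
# illustrative: just under this host's mem_total=47.187G
...

Optionally, make every job carry an explicit request by putting a cluster-wide default into $SGE_ROOT/<cell>/common/sge_request:

-l virtual_free=2G

Beyond that, jobs that genuinely need very large memory can be kept off the small nodes with either a separate queue (e.g. a highmem.q restricted to the 384GB hosts) or a resource quota set that caps virtual_free per user.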
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users