On 12.12.2012 at 02:17, Jake Carroll wrote:

> Cool.
>
> Thanks for the response guys. See in line:
>
>
> On 12/12/12 6:45 AM, "Reuti" <[email protected]> wrote:
>
>> On 11.12.2012 at 21:32, Gowtham wrote:
>>
>>> I second Alex's thoughts. In all our clusters, we only use h_vmem
>>
>> The difference is that virtual_free is only guidance for SGE, but
>> h_vmem will also be enforced. Which one to prefer depends on the
>> working style of the users/groups. If only one group is using a
>> cluster, I prefer virtual_free, as they are checking their results
>> and the accuracy of their memory requests, but with many groups in a
>> cluster, enforcing h_vmem might be more suitable to avoid
>> oversubscription.
>
> That is indeed our situation. It's very much a multi-tenancy environment
> [probably about 50 or 60 users and 10 groups therein]. So, to that end,
> should I enable/allow users to make h_vmem requestable and set it as a
> consumable?
Yes, and you also need to attach an initial value to each exechost, set to
its installed physical RAM.

-- Reuti

> Cheers.
>
> --JC
>
>
>>
>> -- Reuti
>>
>>
>>> (to indicate the hard cap per job) and mem_free (a suggestion to
>>> the scheduler as to which node the job should be started on).
>>>
>>> Best regards,
>>> g
>>>
>>> --
>>> Gowtham
>>> Information Technology Services
>>> Michigan Technological University
>>>
>>> (906) 487/3593
>>> http://www.it.mtu.edu/
>>>
>>>
>>> On Tue, 11 Dec 2012, Alex Chekholko wrote:
>>>
>>> | Hi Jake,
>>> |
>>> | You can do 'qhost -F h_vmem,mem_free,virtual_free', that might be a
>>> | useful view for you.
>>> |
>>> | In general, I've only ever used one of the three complexes above.
>>> |
>>> | Which one(s) do you have defined for the execution hosts? e.g.
>>> | qconf -se compute-1-7
>>> |
>>> | h_vmem will map to 'ulimit -v'
>>> | mem_free just tracks 'free'
>>> | virtual_free I'm not sure, I'd have to search the mailing list
>>> | archives.
>>> |
>>> | I recommend you just use one of those three complexes. If you want
>>> | to set a hard memory limit for jobs, use h_vmem. If you want to just
>>> | suggest to the scheduler, use mem_free; it will use the current
>>> | instantaneous mem_free level during job scheduling (well, the lower
>>> | of the consumable mem_free (if you have that defined) and the actual
>>> | current mem_free).
>>> |
>>> | What is the compelling reason to use virtual_free? I guess it
>>> | includes swap?
>>> |
>>> | Regards,
>>> | Alex
>>> |
>>> |
>>> | On 12/7/12 2:31 AM, Jake Carroll wrote:
>>> | > Hi all.
>>> | >
>>> | > We've got some memory allocation/memory contention issues our users
>>> | > are complaining about. Many are saying they can't get their jobs to
>>> | > run because of memory resource issues.
>>> | >
>>> | > An example:
>>> | >
>>> | > scheduling info:
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-3.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-12.local" because it offers only hc:virtual_free=12.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-6.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-10.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-11.local" because it offers only hc:virtual_free=2.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-9.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-1.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-3.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-0.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-4.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-14.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-8.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-6.local" because it offers only hc:virtual_free=5.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-2-2.local" because it offers only hc:virtual_free=12.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-5.local" because it offers only hc:virtual_free=4.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-3.local" because it offers only hc:virtual_free=5.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-0-7.local" because it offers only hc:virtual_free=12.000G
>>> | > (-l h_vmem=24G,virtual_free=24G) cannot run at host "compute-1-5.local" because it offers only hc:virtual_free=5.000G
>>> | >
>>> | > Another example, of a user whose job is successfully running:
>>> | >
>>> | > hard resource_list:         mem_free=100G
>>> | > mail_list:                  xyz
>>> | > notify:                     FALSE
>>> | > job_name:                   mlmassoc_GRMi
>>> | > stdout_path_list:           NONE:NONE:/commented.out
>>> | > jobshare:                   0
>>> | > env_list:
>>> | > script_file:                /commented.out
>>> | > usage    1:                 cpu=2:08:09:22, mem=712416.09719 GBs, io=0.59519, vmem=3.379G, maxvmem=4.124G
>>> | >
>>> | > If I look at the qhost outputs:
>>> | >
>>> | > [root@cluster ~]# qhost
>>> | > HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>> | > -------------------------------------------------------------------------------
>>> | > global                  -               -     -       -       -       -       -
>>> | > compute-0-0             lx26-amd64     24  6.49   94.6G    5.5G     0.0     0.0
>>> | > compute-0-1             lx26-amd64     24 10.71   94.6G    5.9G     0.0     0.0
>>> | > compute-0-10            lx26-amd64     24  6.09   94.6G    5.1G     0.0     0.0
>>> | > compute-0-11            lx26-amd64     24  6.10   94.6G    5.5G     0.0     0.0
>>> | > compute-0-12            lx26-amd64     24  6.12   94.6G    8.1G     0.0     0.0
>>> | > compute-0-13            lx26-amd64     24  8.41   94.6G    5.3G     0.0     0.0
>>> | > compute-0-14            lx26-amd64     24  7.32   94.6G    7.6G     0.0     0.0
>>> | > compute-0-15            lx26-amd64     24 10.42   94.6G    6.3G     0.0     0.0
>>> | > compute-0-2             lx26-amd64     24  9.67   94.6G    5.5G     0.0     0.0
>>> | > compute-0-3             lx26-amd64     24  7.17   94.6G    5.5G     0.0     0.0
>>> | > compute-0-4             lx26-amd64     24  6.13   94.6G    4.0G  996.2M   27.5M
>>> | > compute-0-5             lx26-amd64     24  6.36   94.6G    5.4G     0.0     0.0
>>> | > compute-0-6             lx26-amd64     24  6.35   94.6G    6.4G     0.0     0.0
>>> | > compute-0-7             lx26-amd64     24  8.08   94.6G    6.0G     0.0     0.0
>>> | > compute-0-8             lx26-amd64     24  6.12   94.6G    8.4G     0.0     0.0
>>> | > compute-0-9             lx26-amd64     24  6.12   94.6G    5.9G     0.0     0.0
>>> | > compute-1-0             lx26-amd64     80 30.13  378.7G   36.2G     0.0     0.0
>>> | > compute-1-1             lx26-amd64     80 28.93  378.7G   21.8G  996.2M  168.1M
>>> | > compute-1-2             lx26-amd64     80 29.84  378.7G   23.2G  996.2M   46.8M
>>> | > compute-1-3             lx26-amd64     80 27.03  378.7G   24.4G  996.2M   39.3M
>>> | > compute-1-4             lx26-amd64     80 28.05  378.7G   23.2G  996.2M  122.0M
>>> | > compute-1-5             lx26-amd64     80 27.47  378.7G   23.5G  996.2M  161.4M
>>> | > compute-1-6             lx26-amd64     80 25.07  378.7G   25.6G  996.2M   91.5M
>>> | > compute-1-7             lx26-amd64     80 26.98  378.7G   22.8G  996.2M  115.9M
>>> | > compute-2-0             lx26-amd64     32 11.03   47.2G    2.6G 1000.0M   67.1M
>>> | > compute-2-1             lx26-amd64     32  8.35   47.2G    3.7G 1000.0M   11.4M
>>> | > compute-2-2             lx26-amd64     32 10.10   47.2G    1.7G 1000.0M  126.5M
>>> | > compute-2-3             lx26-amd64     32  7.02   47.2G    3.4G 1000.0M   11.3M
>>> | >
>>> | > So, it would seem to me we've got _plenty_ of actual resources free,
>>> | > but our virtual_free complex seems to be doing something
>>> | > funny/misguided?
>>> | >
>>> | > I'm worried that our virtual_free complex might actually be doing
>>> | > more harm than good here.
>>> | >
>>> | > Here is an example of some qhost -F output on two different node types:
>>> | >
>>> | > compute-2-3             lx26-amd64     32  7.00   47.2G    3.4G 1000.0M   11.3M
>>> | >    hl:arch=lx26-amd64
>>> | >    hl:num_proc=32.000000
>>> | >    hl:mem_total=47.187G
>>> | >    hl:swap_total=999.992M
>>> | >    hl:virtual_total=48.163G
>>> | >    hl:load_avg=7.000000
>>> | >    hl:load_short=7.000000
>>> | >    hl:load_medium=7.000000
>>> | >    hl:load_long=7.060000
>>> | >    hl:mem_free=43.788G
>>> | >    hl:swap_free=988.703M
>>> | >    hc:virtual_free=4.000G
>>> | >    hl:mem_used=3.398G
>>> | >    hl:swap_used=11.289M
>>> | >    hl:virtual_used=3.409G
>>> | >    hl:cpu=6.400000
>>> | >    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
>>> | >    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT
>>> | >    hl:m_socket=2.000000
>>> | >    hl:m_core=16.000000
>>> | >    hl:np_load_avg=0.218750
>>> | >    hl:np_load_short=0.218750
>>> | >    hl:np_load_medium=0.218750
>>> | >    hl:np_load_long=0.220625
>>> | >
>>> | > compute-1-7             lx26-amd64     80 27.83  378.7G   22.8G  996.2M  115.9M
>>> | >    hl:arch=lx26-amd64
>>> | >    hl:num_proc=80.000000
>>> | >    hl:mem_total=378.652G
>>> | >    hl:swap_total=996.207M
>>> | >    hl:virtual_total=379.624G
>>> | >    hl:load_avg=27.830000
>>> | >    hl:load_short=29.050000
>>> | >    hl:load_medium=27.830000
>>> | >    hl:load_long=27.360000
>>> | >    hl:mem_free=355.814G
>>> | >    hl:swap_free=880.266M
>>> | >    hc:virtual_free=13.000G
>>> | >    hl:mem_used=22.838G
>>> | >    hl:swap_used=115.941M
>>> | >    hl:virtual_used=22.951G
>>> | >    hl:cpu=33.600000
>>> | >    hl:m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
>>> | >    hl:m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT
>>> | >    hl:m_socket=4.000000
>>> | >    hl:m_core=40.000000
>>> | >    hl:np_load_avg=0.347875
>>> | >    hl:np_load_short=0.363125
>>> | >    hl:np_load_medium=0.347875
>>> | >    hl:np_load_long=0.342000
>>> | >
>>> | > Our virtual_free complex is designated as a memory complex, relation <=,
>>> | > is requestable, is set as a consumable, and has a default of 2.
>>> | >
>>> | > I guess what I'd like to aim for is some sane memory management and a
>>> | > way of setting up some "rules" for my users so they can allocate
>>> | > sensible amounts of RAM that reflect what the hosts/execution nodes
>>> | > are really capable of.
>>> | >
>>> | > I've got (unfortunately!) three types of nodes in the one queue. One
>>> | > type has 384GB of RAM. One type has 96GB of RAM. One type has 48GB of RAM.
>>> | >
>>> | > Are my users just expecting too much? Are there some caps/resource
>>> | > limits I should put in place to manage expectations, or should I simply
>>> | > invest in some "big memory" nodes for really large jobs and make a
>>> | > separate highmem.q for such tasks? You'll see above that some users
>>> | > have tried asking for 100GB via the mem_free complex.
>>> | >
>>> | > Thoughts/experiences/ideas?
>>> | >
>>> | > Thanks for your time, all.
>>> | >
>>> | > --JC

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
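
To make the advice in this thread concrete, here is a minimal sketch of the
setup Reuti and Alex describe: h_vmem defined as a requestable consumable and
seeded on each exechost with its installed RAM. The default value, the example
host, the script name and the memory figures are illustrative, not taken from
this thread; see complex(5) and qconf(1) for the exact syntax of your SGE
version.

  # 1. In "qconf -mc", mark h_vmem as requestable and consumable
  #    (columns: name  shortcut  type  relop  requestable  consumable  default  urgency):
  h_vmem    h_vmem    MEMORY    <=    YES    YES    1G    0

  # 2. Attach an initial value equal to the node's physical RAM to every
  #    exechost, e.g. for one of the 96 GB nodes (opens the host config in $EDITOR):
  qconf -me compute-0-0
  #    ...and set:   complex_values   h_vmem=94G

  # 3. Jobs then request memory at submit time; the scheduler decrements the
  #    host's h_vmem consumable, and the same value is enforced as the job's
  #    address-space limit ('ulimit -v', as Alex notes above):
  qsub -l h_vmem=24G myjob.sh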
