On 08.02.2013 at 16:22, Brett Taylor wrote:

> So you're saying that setting h_vmem in the queue definition and not
> specifying it at submission time will limit the memory usage.

Correct.

> Specifying it at submission time will subtract it from the complex but won't
> actually limit the memory usage?

No. Specifying it at submission time will subtract it from the complex and
actually limit the memory usage to this value.
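
For illustration, with the request from your earlier mail further down in this
thread (the script name is only a placeholder):

    qsub -pe smp 24 -l h_vmem=3.5G benchmark.sh

This should reserve 3.5G per slot, i.e. 24 x 3.5G = 84G, from the host's 142G
h_vmem complex while the job runs, and at the same time the requested value is
enforced as a hard limit on the memory usage, which is why the fast runs stayed
right around 3.5G.
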
-- Reuti

> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Friday, February 08, 2013 6:51 AM
> To: Brett Taylor
> Cc: [email protected]
> Subject: Re: [gridengine users] h_vmem not actually restricting memory usage?
>
> On 07.02.2013 at 23:24, Brett Taylor wrote:
>
>> Maybe I didn't describe things as clearly as I should have. Or maybe I just
>> don't understand your response.
>>
>> My second queue is really only there for "emergencies", when someone needs
>> to run something small but the main queue is filled up for days. So right
>> now it is accounting for the slots and the memory as I want: I have 142G in
>> total, and once that 142G is spoken for, no more jobs can be assigned to
>> that host, whether they are in the main queue or the secondary queue. But
>> my issue is that GE no longer seems to actually place a physical limit on
>> the scripts that are running. At one time, I was able to say
>> `-pe smp 24 -l h_vmem=3.5G` and my script would stay right around 3.5G and
>> finish in 19 hours. At other times, the exact same script with the exact
>> same variables at submission would try to use 35+G (I am pretty sure these
>> are per core/slot numbers) and take 3 days to complete, which is the same
>> as if I had no h_vmem settings at all.
>>
>> So I guess the boiled-down version of my question is: would dropping back
>> to only one queue, setting the complex config to JOB, then setting the
>> queue config to "h_vmem 142G"
>
> Unless you request h_vmem in the job submission, a setting of h_vmem in the
> queue configuration will put an upper limit on the job's memory usage (and
> on the possible request per job), but it won't be withdrawn automatically
> from the complex.
>
>> fix the issue with my script and get it back to the ~19 hour speed?
>
> I can't make any reliable statement as I don't know your application in
> detail.
>
> -- Reuti
>
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Thursday, February 07, 2013 4:58 PM
>> To: Brett Taylor
>> Cc: [email protected]
>> Subject: Re: [gridengine users] h_vmem not actually restricting memory usage?
>>
>> On 07.02.2013 at 21:42, Brett Taylor wrote:
>>
>>> Hello,
>>>
>>> I've been testing out the h_vmem settings for a while now, and currently I
>>> have this setup:
>>>
>>> Exec host
>>>     complex_values    slots=36,h_vmem=142G
>>> high_priority.q
>>>     h_vmem            INFINITY
>>
>> This is not the consumable complex per se. You would also add it to the
>> queue definition's complex_values to have a consumable per queue instance.
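
A minimal sketch of what that could look like, with the caveat that the
84G/58G split of the host's 142G between the two queues is only an
illustration, not something prescribed in this thread: in the queue
configuration opened by "qconf -mq high_priority.q" one would add

    complex_values        h_vmem=84G

and likewise h_vmem=58G for low_priority.q, while the existing
"h_vmem INFINITY" line remains the (non-consumable) upper limit on what a job
in that queue may use or request.
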
>>>     slots             24
>>>     priority          0
>>> low_priority.q
>>>     h_vmem            INFINITY
>>>     priority          18
>>>     slots             12
>>> qconf -sc
>>>     #name    shortcut  type    relop  requestable  consumable  default  urgency
>>>     h_vmem   h_vmem    MEMORY  <=     YES          YES         3.95G    0
>>>
>>> I know that there has been discussion of a bug with respect to setting the
>>> complex to JOB, which is why I settled on this configuration a few months
>>> ago in order to have two queues without oversubscribing the memory.
>>> However, this doesn't seem to actually limit the memory usage during run
>>> time, like I have seen GE do before.
>>>
>>> I have one script that I have been using to benchmark my cluster and
>>> figure out the queue stats. It runs tophat and bowtie, and my metrics for
>>> knowing whether the memory is being limited are the "Max vmem:" and
>>> "Wallclock time:" stats. If the memory isn't limited and I submit the job
>>> using 24 cores, I'll see "Max vmem: 35.342G" and a wallclock time around
>>> 2:20:00:00. When I was able to limit the vmem, I saw stats more like
>>> "Wallclock Time = 19:51:49 ... Max vmem = 3.932G". As you can see, 19
>>> hours is a lot quicker than 2 days.
>>
>> You mean it sounds like the application is confused by too much memory?
>>
>> -- Reuti
>>
>>> I don't have definitive proof, but I think changing to JOB and setting a
>>> limit in the queue definition, instead of INFINITY, might restore the
>>> actual runtime limit. But then I wouldn't be able to have two queues the
>>> way I have them now. I'd like to test this myself, but my tiny cluster is
>>> full at the moment. Can anyone confirm these settings for me?
>>>
>>> Thanks,
>>> Brett
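
For reference, the JOB setting discussed above lives in the consumable column
of the complex definition. A sketch of the edited line, reusing the values
from the qconf -sc output quoted earlier (and keeping in mind the bug reports
around JOB consumables that Brett mentions):

    $ qconf -mc      (change the consumable column for h_vmem)
    #name    shortcut  type    relop  requestable  consumable  default  urgency
    h_vmem   h_vmem    MEMORY  <=     YES          JOB         3.95G    0

With consumable set to JOB, the requested h_vmem is debited from the complex
once per job instead of once per slot, so a "-pe smp 24 -l h_vmem=84G" request
would reserve 84G rather than 24 x 84G.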
