On 08.02.2013 at 16:22, Brett Taylor wrote:

> So you're saying that setting h_vmem in the queue definition and not
> specifying it at submission time will limit the memory usage.

Correct.

> Specifying it at submission time will subtract it from the complex but won't
> actually limit the memory usage?

No. Specifying it at submission time will subtract it from the complex and
actually limit the memory usage to this value.
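
For illustration, with the request from your earlier mail further down in this
thread (the script name is only a placeholder):

    qsub -pe smp 24 -l h_vmem=3.5G benchmark.sh

This should reserve 3.5G per slot, i.e. 24 x 3.5G = 84G, from the host's 142G
h_vmem complex while the job runs, and at the same time the requested value is
enforced as a hard limit on the memory usage, which is why the fast runs stayed
right around 3.5G.
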
-- Reuti

> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Friday, February 08, 2013 6:51 AM
> To: Brett Taylor
> Cc: [email protected]
> Subject: Re: [gridengine users] h_vmem not actually restricting memory usage?
>
> On 07.02.2013 at 23:24, Brett Taylor wrote:
>
>> Maybe I didn't describe things as clearly as I should have. Or maybe I just
>> don't understand your response.
>>
>> My second queue is really only there for "emergencies", when someone needs
>> to run something small but the main queue is filled up for days. So right
>> now it is accounting for the slots and the memory as I want: I have 142G in
>> total, and once that 142G is spoken for, no more jobs can be assigned to
>> that host, whether they are in the main queue or the secondary queue. But
>> my issue is that GE no longer seems to actually place a physical limit on
>> the scripts that are running. At one time, I was able to say
>> `-pe smp 24 -l h_vmem=3.5G` and my script would stay right around 3.5G and
>> finish in 19 hours. At other times, the exact same script with the exact
>> same variables at submission would try to use 35+G (I am pretty sure these
>> are per core/slot numbers) and take 3 days to complete, which is the same
>> as if I had no h_vmem settings at all.
>>
>> So I guess the boiled-down version of my question is: would dropping back
>> to only one queue, setting the complex config to JOB, then setting the
>> queue config to "h_vmem 142G"
>
> Unless you request h_vmem in the job submission, a setting of h_vmem in the
> queue configuration will put an upper limit on the job's memory usage (and
> on the possible request per job), but it won't be withdrawn automatically
> from the complex.
>
>> fix the issue with my script and get it back to the ~19 hour speed?
>
> I can't make any reliable statement as I don't know your application in
> detail.
>
> -- Reuti
>
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Thursday, February 07, 2013 4:58 PM
>> To: Brett Taylor
>> Cc: [email protected]
>> Subject: Re: [gridengine users] h_vmem not actually restricting memory usage?
>>
>> On 07.02.2013 at 21:42, Brett Taylor wrote:
>>
>>> Hello,
>>>
>>> I've been testing out the h_vmem settings for a while now, and currently I
>>> have this setup:
>>>
>>> Exec host
>>>     complex_values    slots=36,h_vmem=142G
>>> high_priority.q
>>>     h_vmem            INFINITY
>>
>> This is not the consumable complex per se. You would also add it to the
>> queue definition's complex_values to have a consumable per queue instance.
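
A minimal sketch of what that could look like, with the caveat that the
84G/58G split of the host's 142G between the two queues is only an
illustration, not something prescribed in this thread: in the queue
configuration opened by "qconf -mq high_priority.q" one would add

    complex_values        h_vmem=84G

and likewise h_vmem=58G for low_priority.q, while the existing
"h_vmem INFINITY" line remains the (non-consumable) upper limit on what a job
in that queue may use or request.
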
>>>     slots             24
>>>     priority          0
>>> low_priority.q
>>>     h_vmem            INFINITY
>>>     priority          18
>>>     slots             12
>>> qconf -sc
>>>     #name    shortcut  type    relop  requestable  consumable  default  urgency
>>>     h_vmem   h_vmem    MEMORY  <=     YES          YES         3.95G    0
>>>
>>> I know that there has been discussion of a bug with respect to setting the
>>> complex to JOB, which is why I settled on this configuration a few months
>>> ago in order to have two queues without oversubscribing the memory.
>>> However, this doesn't seem to actually limit the memory usage during run
>>> time, like I have seen GE do before.
>>>
>>> I have one script that I have been using to benchmark my cluster and
>>> figure out the queue stats. It runs tophat and bowtie, and my metrics for
>>> knowing whether the memory is being limited are the "Max vmem:" and
>>> "Wallclock time:" stats. If the memory isn't limited and I submit the job
>>> using 24 cores, I'll see "Max vmem: 35.342G" and a wallclock time around
>>> 2:20:00:00. When I was able to limit the vmem, I saw stats more like
>>> "Wallclock Time = 19:51:49 ... Max vmem = 3.932G". As you can see, 19
>>> hours is a lot quicker than 2 days.
>>
>> You mean it sounds like the application is confused by too much memory?
>>
>> -- Reuti
>>
>>> I don't have definitive proof, but I think changing to JOB and setting a
>>> limit in the queue definition, instead of INFINITY, might restore the
>>> actual runtime limit. But then I wouldn't be able to have two queues the
>>> way I have them now. I'd like to test this myself, but my tiny cluster is
>>> full at the moment. Can anyone confirm these settings for me?
>>>
>>> Thanks,
>>> Brett
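
For reference, the JOB setting discussed above lives in the consumable column
of the complex definition. A sketch of the edited line, reusing the values
from the qconf -sc output quoted earlier (and keeping in mind the bug reports
around JOB consumables that Brett mentions):

    $ qconf -mc      (change the consumable column for h_vmem)
    #name    shortcut  type    relop  requestable  consumable  default  urgency
    h_vmem   h_vmem    MEMORY  <=     YES          JOB         3.95G    0

With consumable set to JOB, the requested h_vmem is debited from the complex
once per job instead of once per slot, so a "-pe smp 24 -l h_vmem=84G" request
would reserve 84G rather than 24 x 84G.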
