On 08.04.2014, at 16:45, Fan Dong wrote:
>
> On 04/08/2014 10:41 AM, Reuti wrote:
>> On 08.04.2014, at 16:34, Fan Dong wrote:
>>
>>> On 04/08/2014 10:17 AM, Reuti wrote:
>>>> On 08.04.2014, at 16:02, Fan Dong wrote:
>>>>
>>>>> Thanks for the help. I guess part of my original question was: will
>>>>> h_vmem help the scheduler hold off a job if the node does not have
>>>>> enough h_vmem left?
>>>>>
>>>>> Say we have:
>>>>> • a consumable h_vmem (qconf -mc) with a default value of 4 GB,
>>>>> • the exec hosts h1 and h2, both with h_vmem = 32 GB (qconf -me),
>>>>> • the queue a.q configured with an h_vmem of 18 GB (qconf -mq).
>>>>>
>>>>> What happens if a user sends 3 jobs to a.q, assuming there are more than
>>>>> two slots on each of the hosts? Will
>>>>> • all 3 jobs get to run simultaneously?
>>>> Yes. 4 GB times 3 will fit into the available 32 GB.
>>>>
>>> Then what is the use of the h_vmem setup in the queue? h_vmem has the
>>> value of 18 GB in a.q; how does that come into play?
>> It is the maximum a user can request per job.
>>
>> They get 4 GB by default, but they can request more - up to 18 GB for a
>> particular job. In case they request more than 18 GB, the job will never
>> start.
>>
>> Nevertheless, the overall consumption of memory will be restricted by the
>> definition on the host level, i.e. all jobs in total on an exechost will
>> never exceed 32 GB.
> Excellent! But just to double-check: if a user does not explicitly request
> h_vmem with qsub -l in the submission script, the default 4 GB will be used.
> Is that correct?
As long as the "h_vmem" complex is assigned to each exechost with a suitable value: yes.

-- Reuti

>
>> -- Reuti
>>
>>> Shouldn't the h_vmem in the queue override the default global consumable
>>> value? You suggested earlier that the h_vmem attached to the queue is
>>> enforced per job, but your calculation "4 GB times 3" seems to ignore the
>>> h_vmem in the queue. Could you please clarify? Thank you.
>>>
>>>> In case the user requests more memory, like 18 GB for each of them, it
>>>> will be different of course.
>>>>
>>>> -- Reuti
>>>>
>>>>> • or does one job have to be held off? (Because h_vmem on each of the
>>>>> hosts would decrease to 32 - 18 = 14 GB, not enough for the third job.)
>>>>>
>>>>> On 04/07/2014 11:29 AM, Reuti wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 07.04.2014, at 17:10, Fan Dong wrote:
>>>>>>
>>>>>>> I am a little confused about the consumable h_vmem setup on the node
>>>>>>> and the queue. Let's say we have one queue, called a.q, spanning two
>>>>>>> hosts, h1 and h2. h1 has 32 GB of RAM and h2 has 128 GB.
>>>>>>>
>>>>>>> I attached h_vmem to both hosts, using the value of the actual
>>>>>>> physical RAM,
>>>>>> You defined this value with `qconf -me ...` => "complex_values"?
>>>>>>
>>>>>>> also a.q has a default h_vmem value of 18 GB, which is the peak memory
>>>>>>> usage of the job.
>>>>>> Yes, the setting in the queue is per job, while in the exechost
>>>>>> definition it's across all jobs.
>>>>>>
>>>>>>> Here is how I understand the way h_vmem works. When the first job in
>>>>>>> a.q is sent to node h1, the h_vmem on the node will decrease to
>>>>>>> 32 - 18 = 14 GB,
>>>>>> Did you make the "h_vmem" complex consumable in `qconf -mc`? What is
>>>>>> the default value specified there for it?
>>>>>>
>>>>>> You checked with `qhost -F h_vmem` and the values are not right?
>>>>>>
>>>>>>> and the h_vmem attached to the queue will make sure that the job won't
>>>>>>> use more than 18 GB of memory. When the second job comes in, it will
>>>>>>> be sent to node h2 because there is not enough h_vmem left on node h1.
>>>>>> ...as the value was subtracted on the host level.
>>>>>>
>>>>>>> I am not sure if I am correct about h_vmem, as I have the impression
>>>>>>> that h_vmem won't stop jobs from being sent to a node but virtual_free
>>>>>>> does. Any suggestions?
>>>>>> Keep in mind that "h_vmem" is a hard limit, while "virtual_free" is a
>>>>>> hint for SGE on how to distribute jobs and still allows a job to
>>>>>> consume more than requested. It depends on the workflow which one fits
>>>>>> best.
>>>>>>
>>>>>> -- Reuti

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
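[Editor's note: for readers following along, a minimal sketch of the setup discussed in this thread. The non-interactive `qconf -mattr` calls are assumed shortcuts equivalent to the interactive `qconf -mc` / `qconf -me` / `qconf -mq` edits the thread mentions; host names h1/h2, queue a.q, and the 4 GB / 32 GB / 128 GB / 18 GB values come from the thread itself.]

# 1) Make h_vmem a consumable complex with a per-job default of 4G.
#    "qconf -mc" opens an editor; the h_vmem line should end up roughly as:
#
#    h_vmem  h_vmem  MEMORY  <=  YES  YES  4G  0
#
qconf -sc | grep h_vmem          # verify the current definition

# 2) Book the total memory available on the host level
#    (equivalent to editing "complex_values" in "qconf -me <host>"):
qconf -mattr exechost complex_values h_vmem=32G  h1
qconf -mattr exechost complex_values h_vmem=128G h2

# 3) Cap what a single job may request in a.q at 18G
#    (equivalent to editing the h_vmem field in "qconf -mq a.q"):
qconf -mattr queue h_vmem 18G a.q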
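[Editor's note: and a sketch of how one might verify the bookkeeping Reuti describes. The job ID 4711 and the script name job.sh are placeholders, not taken from the thread.]

# Remaining bookable h_vmem per host; "hc:" marks a host-level consumable.
# Each running job subtracts its request (the 4G default unless overridden).
qhost -F h_vmem

# What a particular job actually requested (job ID is illustrative):
qstat -j 4711 | grep h_vmem

# Request more than the default, up to the 18G per-job cap of a.q:
qsub -q a.q -l h_vmem=18G job.sh

With 18 GB requests, only one such job fits on a 32 GB host at a time (32 - 18 = 14 GB left), which is exactly the hold-off scenario discussed above; with the 4 GB default, all three jobs fit.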
