Thanks for the help. I guess part of my original question was 'will
h_vmem help the scheduler to hold off the job if the node does not have
enough h_vmem left?'
Say, we have
* a consumable h_vmem (qconf -mc) with default value 4GB,
* the exec hosts h1 and h2 both have h_vmem = 32GB (qconf -me),
* the queue a.q is configured with 18GB h_vmem (qconf -mq).
What happens if a user sends 3 jobs to a.q, assuming there are more than
two slots on each of the hosts? Will
* all 3 jobs get to run simultaneously,
* or will one job have to be held off (because h_vmem on each of
the hosts will decrease to 32-18=14GB, not enough for the third job)?
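To make the bookkeeping I have in mind concrete, here is a minimal sketch in Python. It is only a toy first-fit model, not SGE's actual scheduler: the function name `place_jobs` is made up for illustration, and it assumes that every job debits one fixed amount from the host's h_vmem. Which amount is really debited (the 18GB from the queue or the 4GB default from `qconf -mc`) is exactly what I am unsure about, so it is a parameter here.

```python
def place_jobs(host_capacity_gb, per_job_gb, n_jobs):
    """Toy first-fit placement with a host-level consumable.

    host_capacity_gb: dict mapping host name -> h_vmem capacity in GB
    per_job_gb: amount debited from the host per job (an assumption)
    Returns (placements, waiting, remaining):
      placements: host chosen for each started job, in order
      waiting: number of jobs that could not be started
      remaining: h_vmem left on each host afterwards
    """
    remaining = dict(host_capacity_gb)  # copy, so the input is not modified
    placements = []
    waiting = 0
    for _ in range(n_jobs):
        for host in remaining:
            if remaining[host] >= per_job_gb:
                remaining[host] -= per_job_gb  # debit the consumable
                placements.append(host)
                break
        else:
            # no host has enough of the consumable left
            waiting += 1
    return placements, waiting, remaining


hosts = {"h1": 32, "h2": 32}
print(place_jobs(hosts, 18, 3))  # case: 18GB debited per job
print(place_jobs(hosts, 4, 3))   # case: 4GB default debited per job
```

In this model, debiting 18GB per job starts one job on each host and holds the third, while debiting 4GB per job lets all three start.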
On 04/07/2014 11:29 AM, Reuti wrote:
> Hi,
> On 07.04.2014 at 17:10, Fan Dong wrote:
>> I am a little confused about the consumable h_vmem setup on the node and the
>> queue. Let's say we have one queue, called a.q, that spans two hosts, h1 and h2. h1
>> has 32GB of RAM and h2 has 128GB.
>> I attached h_vmem to both hosts, using the value of the actual physical RAM,
> You defined this value via `qconf -me ...` => "complex_values"?
>> also a.q has a default h_vmem value of 18GB, which is the peak memory usage of
>> the job.
> Yes, the setting in the queue is per job, while in the exechost definition it's
> across all jobs.
>> Here is how I understand the way h_vmem works. When the first job in a.q is
>> sent to node h1, the h_vmem on the node will decrease to 32-18=14GB,
> Did you make the "h_vmem" complex consumable in `qconf -mc`? What is the
> default value specified there for it?
> If you check with `qhost -F h_vmem`, are the values not right?
>> the h_vmem attached to the queue will make sure that the job won't use more than
>> 18GB of memory. When the second job comes in, it will be sent to node h2 because there
>> is not enough h_vmem left on node h1.
> ...as the value was subtracted at the host level.
>> I am not sure if I am correct about h_vmem, as I have the impression that h_vmem
>> won't stop jobs from being sent to a node but virtual_free does. Any
>> suggestions?
> Keep in mind that "h_vmem" is a hard limit, while "virtual_free" is only a hint to
> SGE for distributing jobs: a job is allowed to consume more than it requested.
> Which one fits best depends on the workflow.
> -- Reuti
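The distinction Reuti draws between the two resources can be modeled roughly as below. This is a toy illustration in Python, not SGE code: the function `job_outcome` and its return strings are hypothetical, and it simply assumes a job that requested a certain amount and then peaks at some other amount.

```python
def job_outcome(resource, requested_gb, peak_usage_gb):
    """Toy model of a hard limit versus a scheduling hint.

    resource: "h_vmem" (treated as a hard limit) or
              "virtual_free" (treated as a placement hint only)
    Returns a short description of what happens to the job in this model.
    """
    over = peak_usage_gb > requested_gb
    if resource == "h_vmem":
        # Hard limit: the job cannot go past what it requested.
        return "stopped at limit" if over else "completes"
    if resource == "virtual_free":
        # Hint: used for placement, but usage beyond the request is allowed.
        return "completes, exceeding request" if over else "completes"
    raise ValueError(f"unknown resource: {resource}")


print(job_outcome("h_vmem", 18, 20))
print(job_outcome("virtual_free", 18, 20))
```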