On 19.10.2012 at 20:58, Alex Chekholko wrote:

> qhost values seem fine:
> 
> ...
> scg3-0-11               lx26-amd64     32 27.15   63.0G   38.3G    9.8G  393.6M
> scg3-0-12               lx26-amd64     32 27.36   63.0G   38.7G    9.8G   33.6M
> scg3-0-13               lx26-amd64     32 22.61   63.0G   24.4G    9.8G   31.5M
> ...
> 
> When I submit a job as myself with such a memory request, it doesn't get 
> dispatched, just sits in 'qw'.

And:

qhost -F h_vmem
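
To spot hosts with a negative value without reading the whole listing, something like the following could be used. It is only a sketch: it assumes the resource lines have the `hc:h_vmem=-80.000G` form seen in the qstat output quoted further down (and that qhost -F prints the same form), it treats the upper-case suffixes as powers of 1024, and the function name is made up for illustration.

```python
import re

# Resource lines such as "hc:h_vmem=-80.000G"; the format is taken from the
# qstat output quoted in this thread and assumed to be the same for qhost -F.
RESOURCE_RE = re.compile(r"hc:h_vmem=(-?\d+(?:\.\d+)?)([KMGT]?)")

# Assumption: upper-case suffixes are binary multipliers.
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def negative_h_vmem(text):
    """Return every h_vmem value found in text that is below zero, in bytes."""
    values = []
    for number, unit in RESOURCE_RE.findall(text):
        value = float(number) * UNITS[unit]
        if value < 0:
            values.append(value)
    return values
```

The output of the command would be passed in as one string, e.g. captured with subprocess or read from a file.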

Was the limit defined only after the job had already started?
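
For reference, the arithmetic behind a negative value can be sketched like this. It assumes that the reported hc:h_vmem is simply the host's configured h_vmem minus the requests of the jobs running there, each counted once per job since the consumable is set to JOB. The function names are only for illustration.

```python
def remaining_consumable(capacity_g, job_requests_g):
    """Host capacity minus the per-job debits, all in gigabytes."""
    return capacity_g - sum(job_requests_g)


def implied_debit(capacity_g, reported_g):
    """Total amount that must have been debited to arrive at the reported value."""
    return capacity_g - reported_g


# A single 120G request against the 60G host limit already goes negative.
after_one_job = remaining_consumable(60, [120])

# The reported -80G with a 60G limit implies 140G debited in total.
total_debited = implied_debit(60, -80)
```

Under that assumption the two running jobs together account for 140G on a 60G host. The scheduler should not allow that at dispatch time, but it could happen if the jobs were already running when the limit was added.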

-- Reuti


> Regards,
> Alex
> 
> On 10/18/12 7:41 PM, Rayson Ho wrote:
>> Alex,
>> 
>> Can you run qhost and see if the memory value is also negative?
>> If it is, then this bug was fixed in any release of OGS/GE.
>> 
>> Rayson
>> 
>> 
>> 
>> On Thu, Oct 18, 2012 at 6:53 PM, Alex Chekholko <[email protected]> wrote:
>>> Hi,
>>> 
>>> Running Rocks 6, so whatever GE version is included there.
>>> 
>>> h_vmem is set consumable and per job, 4G default:
>>> 
>>> -bash-4.1$ qconf -sc |grep h_vmem
>>> h_vmem              h_vmem     MEMORY      <=    YES         JOB 4G       0
>>> 
>>> each exec host has an h_vmem attribute set:
>>> -bash-4.1$ qconf -se scg3-0-11 |grep h_vmem
>>> complex_values        slots=16,h_vmem=60G
>>> 
>>> pe "shm" is defined;
>>> -bash-4.1$ qconf -sp shm
>>> pe_name            shm
>>> slots              999
>>> user_lists         NONE
>>> xuser_lists        NONE
>>> start_proc_args    NONE
>>> stop_proc_args     NONE
>>> allocation_rule    $pe_slots
>>> control_slaves     FALSE
>>> job_is_first_task  TRUE
>>> urgency_slots      min
>>> accounting_summary FALSE
>>> 
>>> A user is submitting a job with '-pe shm -l h_vmem=120G', and it's getting
>>> dispatched to a host that has h_vmem=60G defined.  How is that possible?
>>> 
>>> And qstat reports negative h_vmem values, e.g.:
>>> -bash-4.1$ qstat -f -u '*' -F h_vmem
>>> ...
>>> [email protected]          BIP   0/16/16        12.12    lx26-amd64
>>>         hc:h_vmem=-80.000G
>>>   88866 0.50500 mCSRR57762 yxl          r     10/18/2012 09:17:21     1
>>>   89094 0.60500 G_ordermar elisaz       r     10/18/2012 15:03:39    15
>>> ...
>>> 
>>> Maybe the sgeexecd needs to be cycled for the setting to take effect?  I can
>>> try that next.
>>> 
>>> Regards,
>>> --
>>> Alex Chekholko [email protected]
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

