Can you remove them temporarily? I have seen cases where the "unknown 
resource" error suddenly popped up and then vanished again just as suddenly. 
My conclusion was that it was somehow connected to RQS.

-- Reuti

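A minimal sketch of how the RQS could be taken out temporarily and put back afterwards, assuming a POSIX shell on an SGE admin host. The function names, the QCONF override and the backup directory are hypothetical helpers for illustration; the qconf options used are -srqs (show a resource quota set), -drqs (delete it) and -Arqs (add one from a file).

```shell
#!/bin/sh
# Hypothetical helpers, not part of SGE itself.
# QCONF can be overridden, e.g. QCONF="echo qconf", for a dry run.
QCONF="${QCONF:-qconf}"
# Assumed location for the saved RQS definitions.
RQS_BACKUP_DIR="${RQS_BACKUP_DIR:-/tmp}"

# Save the definition of the RQS named $1, then delete it.
# The delete is skipped if the definition could not be saved.
rqs_remove() {
    $QCONF -srqs "$1" > "$RQS_BACKUP_DIR/$1.rqs" || return 1
    $QCONF -drqs "$1"
}

# Re-add the RQS named $1 from the file written by rqs_remove.
rqs_restore() {
    $QCONF -Arqs "$RQS_BACKUP_DIR/$1.rqs"
}
```

With the two sets from this thread, that would be `rqs_remove limit_for_interns` and `rqs_remove limit_slots`, then repeating the failing `qsub -l mem_free=5G -w v ....` check, and finally `rqs_restore` for each name.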

> On 23.01.2015 at 00:16, Ilya M <[email protected]> wrote:
> 
> There are two RQS; one of them is disabled:
> 
> {
>   name         limit_for_interns
>   description  "limit to max 5 GPU jobs per intern."
>   enabled      TRUE
>   limit        users {int1,int2} hosts @gpu to slots=5
> }
> {
>   name         limit_slots
>   description  NONE
>   enabled      FALSE
>   limit        hosts {@gpu} to slots=2
> }
> 
> 
> -------- Original Message --------
> Subject: Re: [gridengine users] Cannot request resource if it is a load value of memory type: SGE reports it as unknown resource
> From: Reuti <[email protected]>
> To: Ilya <[email protected]>
> Date: 1/21/15, 16:12
>> Hi,
>> 
>> On 22.01.2015 at 00:52, Ilya wrote:
>> 
>>> Something happened to the SGE (6.2u5) installation, which had been running 
>>> fine for many months: users can no longer submit resource requests for 
>>> load values of memory type, e.g.
>>> 
>>> qsub -l mem_free=5G -w v .... produces the following output:
>>> 
>>> cannot run in queue "gpu.q@gpu038" because job requests unknown resource (mem_free)
>>> 
>>> The resource is available, though, when querying for it:
>>> qhost -F mem_free -h gpu038
>>> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>> -------------------------------------------------------------------------------
>>> global                  -               -     -       -       -       -       -
>>> gpu038                  lx24-amd64     16  2.11  126.1G   15.7G    4.0G     0.0
>>>    Host Resource(s):      hl:mem_free=110.416G
>>> 
>>> 
>>> This was first reported by a user when he tried to request a custom "hl" 
>>> resource. However, it now appears that all "hl" resources of type "memory" 
>>> show this behavior. Integer "hl" resources are OK.
>> Do you have any RQS in place?
>> 
>> -- Reuti
>> 
>> 
>>> I bounced qmaster between master and shadow-master a couple of times, but 
>>> it did not resolve the problem.
>>> 
>>> Additionally, after I added MONITOR=1 to the scheduler's configuration, the 
>>> file $SGE_ROOT/$SGE_CELL/common/schedule contains only colons:
>>> ::::::::
>>> ::::::::
>>> ::::::::
>>> 
>>> Any ideas?
>>> 
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users

