Can you remove them temporarily? I have seen cases where the "unknown resource" error suddenly popped up and just as suddenly vanished again. My conclusion was that it was somehow connected to RQS.
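
A minimal sketch of removing an RQS temporarily and restoring it afterwards, using the `qconf` options `-srqs` (show), `-drqs` (delete) and `-Arqs` (add from file). The function names, the `DRY_RUN` switch and the backup directory are assumptions made for illustration and are not part of SGE. By default the commands are only printed, so nothing changes until `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: back up a resource quota set (RQS), delete it, and restore it later.
# rqs_remove, rqs_restore, DRY_RUN and RQS_BACKUP_DIR are hypothetical names;
# only the qconf options (-srqs, -drqs, -Arqs) come from SGE itself.

# Save the named RQS to a file, then delete it from the configuration.
rqs_remove() {
    name=$1
    dir="${RQS_BACKUP_DIR:-/tmp/rqs_backup}"
    file="$dir/$name.rqs"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: only show what would be executed.
        echo "qconf -srqs $name > $file"
        echo "qconf -drqs $name"
    else
        # Delete only if the backup was written successfully.
        mkdir -p "$dir" &&
        qconf -srqs "$name" > "$file" &&
        qconf -drqs "$name"
    fi
}

# Re-add the RQS from the file written by rqs_remove.
rqs_restore() {
    name=$1
    dir="${RQS_BACKUP_DIR:-/tmp/rqs_backup}"
    file="$dir/$name.rqs"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "qconf -Arqs $file"
    else
        qconf -Arqs "$file"
    fi
}

# Example (dry run):
#   rqs_remove limit_for_interns
#   ... retry the failing request, e.g. qsub -l mem_free=5G -w v ...
#   rqs_restore limit_for_interns
```

Setting `enabled FALSE` through `qconf -mrqs` may be a gentler alternative, but removing the rule completely takes the RQS out of the picture for the test.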
-- Reuti

> On 23.01.2015 at 00:16, Ilya M <[email protected]> wrote:
>
> There are two RQS, one is disabled:
>
> {
>    name         limit_for_interns
>    description  "limit to max 5 GPU jobs per intern."
>    enabled      TRUE
>    limit        users {int1,int2} hosts @gpu to slots=5
> }
> {
>    name         limit_slots
>    description  NONE
>    enabled      FALSE
>    limit        hosts {@gpu} to slots=2
> }
>
>
> -------- Original Message --------
> Subject: Re: [gridengine users] Cannot request resource if it is a load value
> of memory type: SGE reports it as unknown resource
> From: Reuti <[email protected]>
> To: Ilya <[email protected]>
> Date: 1/21/15, 16:12
>> Hi,
>>
>> On 22.01.2015 at 00:52, Ilya wrote:
>>
>>> Something happened to the SGE (6.2u5) that had been running fine for many
>>> months, and users can no longer put resource requests for load values if
>>> they are of memory type, e.g.
>>>
>>> qsub -l mem_free=5G -w v .... produces the following output:
>>>
>>> cannot run in queue "gpu.q@gpu038" because job requests unknown resource
>>> (mem_free)
>>>
>>> The resource is available, though, when querying for it:
>>> qhost -F mem_free -h gpu038
>>> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>> -------------------------------------------------------------------------------
>>> global                  -               -     -       -       -       -       -
>>> gpu038                  lx24-amd64     16  2.11  126.1G   15.7G    4.0G     0.0
>>>     Host Resource(s):      hl:mem_free=110.416G
>>>
>>>
>>> This was first reported by a user when he tried to request custom "hl"
>>> resource. However, it now appears that all "hl" resources of type "memory"
>>> show this behavior. Integer "hl" are OK.
>> Do you have any RQS in place?
>>
>> -- Reuti
>>
>>
>>> I bounced qmaster between master and shadow-master a couple of times, but
>>> it did not resolve the problem.
>>>
>>> Additionally, when I added MONITOR=1 to scheduler's configuration, the file
>>> $SGE_ROOT/$SGE_CELL/common/schedule contains only colons:
>>> ::::::::
>>> ::::::::
>>> ::::::::
>>>
>>> Any ideas?
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
