On Tue, 16 Jun 2015 16:36:07 +0000 Notorious Biggles <notoriousbigg...@gmail.com> wrote:
> The situation I have now is that whilst all my Cluster Queues as shown in
> qstat and in qmon (divided up as short, medium and long) are still there, the
> Queue Instances have disappeared for everything except the long queue. I
> tried to modify the Cluster Queues for say the short queue and all the
> hostlists were present as I'd expect. In qmon, it just shows broken queues as
> all zeros - zero in use, zero avail, zero total, zero in error, CQLOA of
> -NA-. I dug about in the filesystem to see if I'd lost files, but the
> spool/qinstance/medium/nodexx.cluster type files are all present and readable
> - just seems like GE is ignoring them (although I'm not sure if loss of them
> would have caused this behaviour).
>
> I found by messing around that if I cloned the short Cluster Queue via qmon
> to create a short2 queue, it would populate the Queue Instances correctly and
> I'd have my usual number of total slots and the short2 queue appeared to work
> fine and dandy.
>
> So my questions:
> - Any ideas why my GE lost the Queue Instances?
> - Is there an easier way to get them back? (Not that cloning a Cluster Queue
> is difficult, but if there's a more "correct" way to do it, then I'd rather
> know.)
> - Is there a qconf equivalent of qmon's Clone button?
>
> I'm a bit out of my depth with this and my google-fu seems to be letting me
> down.

I don't know but I'm somewhat suspicious of the code dealing with hostgroups
in 6.2u3:
https://arc.liv.ac.uk/trac/SGE/ticket/1439

If you are using hostgroups you could try creating copy hostgroups and
switching the queues to point at them. This would mean you could keep the
more user visible queue names the same.

--
William Hay <w....@ucl.ac.uk>
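
A rough sketch of the copy-hostgroup approach described above, and of a qconf-based stand-in for qmon's Clone button, might look like the following. It is only an illustration under assumptions: the names @shorthosts, @shorthosts2, short and short2 are hypothetical, and the commands need to be run as a Grid Engine manager on a host where qconf works. It relies on dumping an object with `qconf -shgrp` or `qconf -sq`, changing its name line, and adding it back from a file with `qconf -Ahgrp` or `qconf -Aq`.

```shell
#!/bin/sh
# Sketch only: all hostgroup and queue names used below are hypothetical.

# Rewrite the group_name line of a hostgroup definition (as printed by
# `qconf -shgrp`) read from stdin. $1 = new hostgroup name.
rename_hgrp() {
    sed "s/^group_name[[:space:]].*/group_name $1/"
}

# Rewrite the qname line of a cluster queue definition (as printed by
# `qconf -sq`) read from stdin. $1 = new queue name.
rename_queue() {
    sed "s/^qname[[:space:]].*/qname $1/"
}

# Copy hostgroup $1 to a new hostgroup named $2.
copy_hgrp() {
    tmp=$(mktemp) || return 1
    qconf -shgrp "$1" | rename_hgrp "$2" > "$tmp" && qconf -Ahgrp "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Copy cluster queue $1 to a new cluster queue named $2.
copy_queue() {
    tmp=$(mktemp) || return 1
    qconf -sq "$1" | rename_queue "$2" > "$tmp" && qconf -Aq "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example usage (not executed here):
#   copy_hgrp @shorthosts @shorthosts2
#   qconf -mattr queue hostlist @shorthosts2 short   # point the queue at the copy
#   copy_queue short short2                          # rough Clone equivalent
```

Note that `qconf -mattr queue hostlist ...` replaces the queue's hostlist only. If the queue configuration carries per-hostgroup overrides (for example in slots), those entries would still name the old hostgroup and would need editing separately, for instance with `qconf -mq`.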
_______________________________________________ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users