On 06.02.2012 at 10:55, William Hay wrote:

> We discovered a host whose InfiniBand connection was playing up.
> Our normal procedure for this is to remove the host from the
> hostgroups it is normally in and add it to a hostgroup associated with
> queues that only accept single-node jobs (i.e. serial jobs and PEs with
> an allocation method of $pe_slots) until we can investigate. However,
> rather than removing it from our parallel queues, this mysteriously
> caused queue instances for every configured queue in the cluster to
> appear on the host (as evidenced by the output of qstat -f).
> i) Neither the hostgroup nor the host is referenced, directly or
> indirectly, from any queues bar the single-node ones, AFAICT.

This sounds as if one hostgroup is included in another one. Is that not
the case?
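
To illustrate the kind of nesting meant here, below is a minimal sketch in Python. It assumes the hostgroup definitions have already been collected into a dictionary; the group names, host names and the helper groups_containing are hypothetical and not taken from the cluster in question. On a live system, qconf -shgrp_tree and qconf -shgrp_resolved should give the same information directly.

```python
def groups_containing(host, hostgroups):
    """Return every hostgroup that contains `host`, directly or via nesting."""
    def resolve(group, seen):
        # Expand one hostgroup into its set of plain host names.
        if group in seen:  # guard against circular includes
            return set()
        seen = seen | {group}
        hosts = set()
        for member in hostgroups.get(group, []):
            if member.startswith("@"):
                # A member starting with "@" is itself a hostgroup.
                hosts |= resolve(member, seen)
            else:
                hosts.add(member)
        return hosts

    return sorted(g for g in hostgroups if host in resolve(g, frozenset()))


# Hypothetical configuration: @single is nested inside @allhosts.
hostgroups = {
    "@single": ["node042"],
    "@parallel": ["node001", "node002"],
    "@allhosts": ["@parallel", "@single"],
}

print(groups_containing("node042", hostgroups))  # ['@allhosts', '@single']
```

If a queue's hostlist refers to a group such as @allhosts in this sketch, every host added to @single would pick up that queue as well.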

-- Reuti


> ii) Removing the host from the hostgroup causes the queue instances to
> disappear. Adding it back causes them to reappear.
> iii) Other hosts in this hostgroup have only the queues they are supposed to.
> 
> William
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users


