On 06.02.2012 at 10:55, William Hay wrote:

> We discovered a host where the infiniband connection was playing up.
> Our normal procedure for this is to remove the host from the
> hostgroups it is normally in and add it to a hostgroup associated with
> queues that only accept single node jobs (ie serial jobs and PEs with
> an allocation method of $pe_slots) until we can investigate. However
> rather than removing it from our parallel queues this mysteriously
> caused queue instances for every configured queue in the cluster to
> appear on the host (as evidenced by the output of qstat -f).
>
> i) Neither the hostgroup nor the host are referenced directly or
> indirectly from any queues bar the single node ones AFAICT.

This sounds as if one hostgroup is included in another one. Is that not the case?

-- Reuti

> ii) Removing the host from the hostgroup causes the queue instances to
> disappear. Adding it back causes them to reappear.
> iii) Other hosts in this hostgroup have only the queues they are supposed to.
>
> William
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
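
The nesting question raised above can be checked mechanically. The following is only a minimal sketch: it assumes the hostgroup definitions have already been collected into a dictionary (for instance by hand from the output of `qconf -shgrpl` and `qconf -shgrp <group>`). The function name `groups_containing` and all group and host names in the sample data are hypothetical and are not taken from the cluster discussed here.

```python
def groups_containing(host, hostgroups):
    """Return every hostgroup that contains `host`, directly or via nesting.

    `hostgroups` maps a group name (e.g. "@single_node") to its list of
    members; a member starting with "@" is treated as a nested hostgroup.
    """
    def contains(group, seen):
        if group in seen:  # guard against circular inclusion
            return False
        seen.add(group)
        for member in hostgroups.get(group, []):
            if member == host:
                return True
            if member.startswith("@") and contains(member, seen):
                return True
        return False

    return {group for group in hostgroups if contains(group, set())}


# Hypothetical definitions, for illustration only.
hostgroups = {
    "@single_node": ["node17"],
    "@infiniband": ["node01", "node02"],
    "@allhosts": ["@infiniband", "@single_node"],
}

print(sorted(groups_containing("node17", hostgroups)))
# → ['@allhosts', '@single_node']
```

If a group such as the hypothetical `@allhosts` shows up in the result and is used in the host list of every queue, that would be consistent with queue instances for all queues appearing on the moved host.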
