I've just renamed some of our hosts and moved them from one hostgroup
to another in order to clearly separate nodes with and without GPUs.

However, one node has grown some queues for which I can see no
explanation.  Here is an example:
[root@admin03 queues]# qconf -sq Agatha|grep hostlist
hostlist              @LO-a @HI-a @INT-W
[root@admin03 queues]# qconf -shgrp '@LO-a'
group_name @LO-a
hostlist node-a03.data.legion.ucl.ac.uk node-a04.data.legion.ucl.ac.uk \
         node-a05.data.legion.ucl.ac.uk node-a06.data.legion.ucl.ac.uk \
         node-a09.data.legion.ucl.ac.uk node-a10.data.legion.ucl.ac.uk \
         node-a11.data.legion.ucl.ac.uk node-a12.data.legion.ucl.ac.uk \
         node-a13.data.legion.ucl.ac.uk node-a14.data.legion.ucl.ac.uk \
         node-a15.data.legion.ucl.ac.uk node-a16.data.legion.ucl.ac.uk \
         node-a17.data.legion.ucl.ac.uk node-a18.data.legion.ucl.ac.uk \
         node-a19.data.legion.ucl.ac.uk node-a20.data.legion.ucl.ac.uk \
         node-a21.data.legion.ucl.ac.uk node-a22.data.legion.ucl.ac.uk \
         node-a23.data.legion.ucl.ac.uk node-a24.data.legion.ucl.ac.uk \
         node-a25.data.legion.ucl.ac.uk node-a26.data.legion.ucl.ac.uk \
         node-a27.data.legion.ucl.ac.uk node-a28.data.legion.ucl.ac.uk \
         node-a29.data.legion.ucl.ac.uk node-a30.data.legion.ucl.ac.uk \
         node-a31.data.legion.ucl.ac.uk node-a32.data.legion.ucl.ac.uk
[root@admin03 queues]# qconf -shgrp '@HI-a'
group_name @HI-a
hostlist node-a33.data.legion.ucl.ac.uk node-a36.data.legion.ucl.ac.uk \
         node-a37.data.legion.ucl.ac.uk node-a38.data.legion.ucl.ac.uk \
         node-a39.data.legion.ucl.ac.uk node-a40.data.legion.ucl.ac.uk \
         node-a41.data.legion.ucl.ac.uk node-a42.data.legion.ucl.ac.uk \
         node-a44.data.legion.ucl.ac.uk node-a45.data.legion.ucl.ac.uk \
         node-a46.data.legion.ucl.ac.uk node-a47.data.legion.ucl.ac.uk \
         node-a48.data.legion.ucl.ac.uk node-a49.data.legion.ucl.ac.uk \
         node-a50.data.legion.ucl.ac.uk node-a51.data.legion.ucl.ac.uk \
         node-a52.data.legion.ucl.ac.uk
[root@admin03 queues]# qconf -shgrp '@INT-W'
group_name @INT-W
hostlist usertest01.data.legion.ucl.ac.uk usertest02.data.legion.ucl.ac.uk \
         usertest03.data.legion.ucl.ac.uk usertest04.data.legion.ucl.ac.uk \
         usertest05.data.legion.ucl.ac.uk usertest06.data.legion.ucl.ac.uk \
         usertest07.data.legion.ucl.ac.uk usertest08.data.legion.ucl.ac.uk \
         usertest09.data.legion.ucl.ac.uk usertest10.data.legion.ucl.ac.uk \
         usertest11.data.legion.ucl.ac.uk
[root@admin03 queues]# qstat -f |grep Agatha@node-301
[email protected] BIP   0/0/12         0.00     lx26-amd64    d

As you can see from the above, the hostlist for the Agatha queue lists three
hostgroups, none of which contains node-301.  Yet node-301 has an Agatha queue
(with no running jobs in it, so it isn't just a temporary holdover).
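
The membership check above was done by eye. It could also be scripted, roughly as in the sketch below. The helper name hgrp_lists_host is hypothetical (it is not part of Grid Engine), and it only inspects text in the format that qconf -shgrp prints, so any hostgroups nested inside other hostgroups would not be expanded.

```shell
# Hypothetical helper: reads "qconf -shgrp" style output on stdin and succeeds
# if any word in it is the given short host name, either bare or followed by
# a domain (e.g. "node-301" or "node-301.some.domain").
hgrp_lists_host() {
    # Remove the line-continuation backslashes, split into one word per line,
    # then look for a word that matches the short name.
    tr -d '\\' | tr -s ' \t' '\n\n' | grep -q -E "^$1(\..*)?\$"
}

# Possible use on the cluster (assumes qconf is on the PATH):
#   for g in @LO-a @HI-a @INT-W; do
#       qconf -shgrp "$g" | hgrp_lists_host node-301 && echo "$g lists node-301"
#   done
```

If the loop prints nothing, none of the three hostgroups names the host directly, which matches what the listings show.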

The host didn't run this queue under its old name either.

I've disabled all the spurious queues (other settings should stop jobs from
running inappropriately anyway) but I'd still like to know why they appeared and
how to get rid of them.

William