Hi all,

I have what seems like a straightforward problem.

This is on Open Grid Scheduler 2011.11...

h_vmem is configured as consumable:
# qconf -sc | grep  h_vmem
h_vmem h_vmem MEMORY <= YES JOB 4G 0

The exec host is configured to have 46G of h_vmem:
# qconf -se scg1-4-8 | grep h_vmem
complex_values        h_vmem=46G,slots=12

The user requests 16G of h_vmem in his job:
# qstat -f -j 480039 |grep h_vmem
hard resource_list:         h_vmem=16G



But the scheduler puts 11 of these jobs on the same node, driving the host's h_vmem negative!

# qhost -j -F h_vmem
...

scg1-4-8 linux-x64 24 - 47.3G - 9.8G -
    Host Resource(s):      hc:h_vmem=-130.000G
480039 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480041 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480042 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480043 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480044 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480045 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480046 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480047 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480048 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480049 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
480050 0.50074 a_STL002-O shinlin r 01/08/2013 20:21:18 standard@s MASTER
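
As a sanity check, the negative value is consistent with every one of the 11 running jobs being debited 16G against the 46G host limit, so the accounting itself appears to work and only the admission decision looks wrong. A rough sketch of the arithmetic (the variable names are just for illustration; the numbers come from the output above, and the consumable is per-JOB, so slots should not matter):

```shell
# Values copied from the qconf/qstat/qhost output above.
host_limit_g=46    # complex_values h_vmem=46G on scg1-4-8
per_job_g=16       # hard resource_list h_vmem=16G
running_jobs=11    # jobs 480039 and 480041..480050

# What hc:h_vmem should read if every job is debited once.
remaining_g=$(( host_limit_g - running_jobs * per_job_g ))

# How many such jobs should fit on the host at all (integer division).
max_jobs=$(( host_limit_g / per_job_g ))

echo "expected hc:h_vmem = ${remaining_g}G, jobs that should fit = ${max_jobs}"
```

So I would have expected at most 2 of these jobs on the node, not 11.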



How do I go about troubleshooting this? It had been working fine for months and only started doing this a few days ago.
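
One thing that might help narrow it down is checking whether other hosts are oversubscribed too. A minimal sketch, assuming the `qhost -F h_vmem` output always has the same layout as the excerpt above (host line starting in column one, resource line indented beneath it); `find_oversubscribed` is a made-up helper name, not a Grid Engine command:

```shell
# Hypothetical helper: reads `qhost -F h_vmem` output on stdin and prints
# each host whose remaining consumable h_vmem (hc:h_vmem) is negative.
find_oversubscribed() {
  awk '
    /^[^ ]/        { host = $1 }        # unindented line: remember the host name
    /hc:h_vmem=-/  { print host, $NF }  # indented resource line with a negative value
  '
}

# Intended usage on the qmaster (not run here):
#   qhost -F h_vmem | find_oversubscribed
```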

I had the same problem earlier. As I understand it, there is a known bug triggered by either multiple queue instances on a host or multiple consumable requests in a job, but neither applies here: the host has only this one queue instance, and the job requests only this one complex.

Suggestions?

Regards,
--
Alex Chekholko [email protected]