Hi all,

This is still happening to me, running the latest OGS:

scg3-0-2 linux-x64 32 14.72 63.0G 5.7G 9.8G 26.4M
    Host Resource(s):      hc:h_vmem=12.375G
scg3-0-20 linux-x64 32 16.12 63.0G 3.5G 9.8G 23.5M
    Host Resource(s):      hc:h_vmem=-23.906G
scg3-0-21 linux-x64 32 13.95 63.0G 8.8G 9.8G 19.5M
    Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-22 linux-x64 32 13.21 63.0G 5.0G 9.8G 24.1M
    Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-23 linux-x64 32 12.81 63.0G 8.4G 9.8G 27.8M
    Host Resource(s):      hc:h_vmem=1.000G
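
In case it helps anyone compare notes, here is a minimal sketch (Python) for picking out the hosts whose h_vmem consumable has gone negative from output like the above. The names oversubscribed_hosts and SAMPLE are made up for illustration. It assumes the layout shown here (a host line followed by an indented "Host Resource(s)" line) and treats the K/M/G/T suffixes as 1024-based multipliers.

```python
import re

# Matches a resource line such as "    Host Resource(s):      hc:h_vmem=-23.906G"
RESOURCE_RE = re.compile(r"hc:h_vmem=(-?\d+(?:\.\d+)?)([KMGT]?)")

# Assumption: uppercase suffixes are powers of 1024.
MULTIPLIER = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def oversubscribed_hosts(qhost_output):
    """Return {hostname: remaining h_vmem in bytes} for hosts below zero."""
    negative = {}
    current_host = None
    for line in qhost_output.splitlines():
        match = RESOURCE_RE.search(line)
        if match and current_host is not None:
            value = float(match.group(1)) * MULTIPLIER[match.group(2)]
            if value < 0:
                negative[current_host] = value
        elif line.strip() and not line[0].isspace():
            # Unindented, non-empty line: the first field is the hostname.
            current_host = line.split()[0]
    return negative


# Two hosts copied from the output above, used only as sample input.
SAMPLE = """\
scg3-0-2 linux-x64 32 14.72 63.0G 5.7G 9.8G 26.4M
    Host Resource(s):      hc:h_vmem=12.375G
scg3-0-20 linux-x64 32 16.12 63.0G 3.5G 9.8G 23.5M
    Host Resource(s):      hc:h_vmem=-23.906G
"""

if __name__ == "__main__":
    for host, remaining in oversubscribed_hosts(SAMPLE).items():
        print(f"{host}: {remaining / 1024**3:.3f}G")
```

The same function could be fed the captured text of 'qhost -F h_vmem' instead of the sample string. It only reports which hosts are below zero, not why.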

Is there anything I can do to diagnose this issue?

On 10/31/12 3:19 PM, Dave Love wrote:
> Alex Chekholko <[email protected]> writes:
>
>> Hi Reuti,
>>
>> Thanks for your response, here's the output of 'qhost -F h_vmem'.
>> I am not sure how to interpret the negative values here either.
>
> You can get over-subscription of hosts from contributions to parallel
> job resources from multiple queues, but there's also at least one bug
> producing such symptoms.  If I recall correctly, Reuti has some
> diagnosis in the issue tracker, to do with multiple resource requests.
> An RQS with a dynamic limit may work around it.


--
Alex Chekholko [email protected] 347-401-4860
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
