Hi all,
I am still seeing negative h_vmem values on some hosts, running the latest OGS:

scg3-0-2   linux-x64  32  14.72  63.0G  5.7G  9.8G  26.4M
    Host Resource(s):      hc:h_vmem=12.375G
scg3-0-20  linux-x64  32  16.12  63.0G  3.5G  9.8G  23.5M
    Host Resource(s):      hc:h_vmem=-23.906G
scg3-0-21  linux-x64  32  13.95  63.0G  8.8G  9.8G  19.5M
    Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-22  linux-x64  32  13.21  63.0G  5.0G  9.8G  24.1M
    Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-23  linux-x64  32  12.81  63.0G  8.4G  9.8G  27.8M
    Host Resource(s):      hc:h_vmem=1.000G

Is there anything I can do to diagnose this issue?
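
To at least keep track of which hosts go negative over time, a small script along the following lines could be run against saved 'qhost -F h_vmem' output. This is only a rough sketch: the function name find_oversubscribed is made up for illustration, it assumes the output layout shown above (a host line followed by a "Host Resource(s)" line), and it assumes the uppercase K/M/G/T suffixes are binary multipliers.

```python
import re

# Multipliers for the uppercase memory suffixes (assumed to be binary units).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Matches the consumable h_vmem value as printed on a "Host Resource(s)" line.
_RESOURCE_RE = re.compile(r"hc:h_vmem=(-?\d+(?:\.\d+)?)([KMGT]?)")


def find_oversubscribed(qhost_text):
    """Return {hostname: remaining h_vmem in bytes} for hosts below zero.

    Hypothetical helper, not part of Grid Engine.
    """
    negative = {}
    current_host = None
    for line in qhost_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = _RESOURCE_RE.search(stripped)
        if match:
            value = float(match.group(1)) * _UNITS[match.group(2)]
            if value < 0 and current_host is not None:
                negative[current_host] = value
            continue
        fields = stripped.split()
        # A host line carries hostname, arch, ncpu, ...; a lone wrapped
        # column such as "26.4M" has a single field and is skipped.
        if len(fields) >= 3:
            current_host = fields[0]
    return negative
```

Fed the output above, this would flag scg3-0-20, scg3-0-21 and scg3-0-22. It only reports the symptom; it says nothing about which jobs or queues pushed the value below zero.
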

On 10/31/12 3:19 PM, Dave Love wrote:
> Alex Chekholko <[email protected]> writes:
>
>> Hi Reuti,
>>
>> Thanks for your response, here's the output of 'qhost -F h_vmem'.
>> I am not sure how to interpret the negative values here either.
>
> You can get over-subscription of hosts from contributions to parallel
> job resources from multiple queues, but there's also at least one bug
> producing such symptoms. If I recall correctly, Reuti has some
> diagnosis in the issue tracker, to do with multiple resource requests.
> An RQS with a dynamic limit may work around it.

--
Alex Chekholko [email protected] 347-401-4860