On 06/02/13 10:24, Mark Dixon wrote:
On Tue, 5 Feb 2013, Orlando Richards wrote:
...
The bug has popped up a few times in the past on the mailing lists, but
we haven't found any real resolution - other than people switching from
fair share to functional share (which we are loath to do).
Any advice on how we could set about tackling this bug?
...
Hi Orlando,
Great to hear you're game :)
Are you after advice on debugging gridengine, or pointers for this bug
specifically?
Hi Mark,
Both - plus open to anyone offering up a solution where I have to do
nothing ;-)
I've had a go at digging through the code, but couldn't really make head
nor tail of it - no doubt in large part due to my not being much of a
coder :( Any pointers to get me bootstrapped would be most welcome.
At the moment, I'm trying to get a reproducible test case together to
allow for useful debugging - basic tests (sleep 60s) don't show an
obvious triggering of the issue, so I'm moving on to more complicated
tasks. Certainly, the issue does seem to create orders-of-magnitude
differences in reported usage. Current offenders include BLAST jobs (run
by our Biology users) - which are fairly memory heavy.
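
As a rough illustration of a test job that sits between "sleep 60" and a
real BLAST run, something like the Python sketch below could be submitted
instead. It is only a sketch under assumptions: the function name
synthetic_load and its parameters mem_mb and cpu_seconds are made up for
this example, and there is no evidence yet that memory or CPU load is what
triggers the bug. It simply holds a block of memory and burns CPU for a
known amount of time, so the usage the scheduler reports can be compared
against values that are known in advance.

```python
import time


def synthetic_load(mem_mb=64, cpu_seconds=2.0):
    """Hold mem_mb MiB of memory while burning about cpu_seconds of CPU.

    Hypothetical helper for building a reproducible test job; the names
    and defaults are illustrative assumptions, not part of Grid Engine.
    """
    # Allocate the block and write to one byte per 4 KiB so the pages are
    # actually touched rather than only reserved.
    block = bytearray(mem_mb * 1024 * 1024)
    for offset in range(0, len(block), 4096):
        block[offset] = 1

    # Spin until this process has consumed the requested CPU time.
    # process_time() counts CPU time of the process, not wall-clock time.
    start = time.process_time()
    while time.process_time() - start < cpu_seconds:
        pass

    return len(block), time.process_time() - start


if __name__ == "__main__":
    held, used = synthetic_load()
    print(f"held {held} bytes, used {used:.2f}s CPU")
```

Running the same script with a range of mem_mb and cpu_seconds values, and
comparing the printed figures with the usage recorded for each job, might
help narrow down which kind of load (if either) produces the inflated
numbers.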
--
Dr Orlando Richards
Information Services
IT Infrastructure Division
Unix Section
Tel: 0131 650 4994
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.