On 06/02/13 10:24, Mark Dixon wrote:
On Tue, 5 Feb 2013, Orlando Richards wrote:
...
The bug has popped up a few times in the past on the mailing lists, but
we haven't found any real resolution - other than people switching from
fair share to functional share (which we are loath to do).

Any advice on how we could set about tackling this bug?
...

Hi Orlando,

Great to hear you're game :)

Are you after advice on debugging gridengine, or pointers for this bug
specifically?

Hi Mark,

Both - plus open to anyone offering up a solution where I have to do nothing ;-)

I've had a go at digging through the code, but couldn't really make head nor tail of it - no doubt in large part due to my not being much of a coder :( Any pointers to get me bootstrapped would be most welcome.

At the moment, I'm trying to put together a reproducible test case to allow for useful debugging - basic tests (sleep 60s) don't obviously trigger the issue, so I'm moving on to more complicated tasks. Certainly, the issue does seem to create orders-of-magnitude differences in reported usage. Current offenders include BLAST jobs (run by our Biology users) - which are fairly memory heavy.

--
   Dr Orlando Richards
  Information Services
IT Infrastructure Division
       Unix Section
    Tel: 0131 650 4994

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
