On Dec 14, 2011, at 1:51 PM, Oleksandr Moskalenko wrote:
> It seems that even under very light load (i.e. just a couple of long-running
> jobs present) the job runner slowly but steadily (1-3 MB per sec) grows its
> memory consumption until it's killed by the Linux OOM killer.
> My job runner config: http://pastebin.com/vMWDHAQm
> I'm currently restarting the runner out of crontab when it is killed by OOM,
> but it's not a sensible solution by any means.
> I wonder if anyone encountered this and how it was solved.
I believe this is a leak either in pbs_python or libtorque.so. I haven't yet
been able to track down the culprit, so in the meantime, we simply restart the
job runner process once it reaches a specified amount of memory usage.
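
For illustration only, here is a minimal sketch of that kind of memory
watchdog, meant to be run periodically (e.g. from cron). It is not the script
we actually use. The pid file path, the 2 GB limit, and the restart command in
the usage comment are all hypothetical placeholders, and it assumes a Linux
/proc filesystem.

```python
import re
import subprocess


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status.

    Returns None if no VmRSS line is present (e.g. for kernel threads).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the resident set size (kB) of a running process from /proc."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vmrss_kb(handle.read())


def over_limit(rss_kb, limit_kb):
    """True only when a memory reading exists and exceeds the limit."""
    return rss_kb is not None and rss_kb > limit_kb


def check_and_restart(pid_file, limit_kb, restart_cmd):
    """Restart the job runner if its resident memory exceeds limit_kb.

    pid_file and restart_cmd are site-specific assumptions; restart_cmd is
    an argument list passed to subprocess.check_call.
    Returns True if a restart was triggered.
    """
    with open(pid_file) as handle:
        pid = int(handle.read().strip())
    if over_limit(read_rss_kb(pid), limit_kb):
        subprocess.check_call(restart_cmd)
        return True
    return False


# Hypothetical usage; adjust every value to the local setup:
# check_and_restart("/path/to/runner.pid", 2 * 1024 * 1024,
#                   ["/path/to/restart_runner.sh"])
```

Unlike restarting after the OOM killer has already fired, this restarts the
runner at a threshold you choose, before the machine runs out of memory.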
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at: