I know this is an old topic. I'm catching up on months' worth of mailing list mail right now.

On 09/17/2017 09:09 PM, Christopher Samuel wrote:
> On 15/09/17 04:45, Prentice Bisbal wrote:
>
>> I'm happy to announce that I finally found the cause of this problem: numad.
>
> Very interesting, it sounds like it was migrating processes onto a
> single core over time!  Anything diagnostic in its log?

That's exactly what it was doing. No, I did not see anything diagnostic in the log files, but some of the numad documentation I read at the time stated that numad should not be enabled for large multi-core jobs that use a lot of memory, such as DB servers and HPC jobs.
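
For anyone who wants to check whether the processes of a job have piled up on one core, here is a minimal sketch, assuming a Linux node with /proc mounted. It is not what I ran at the time. It reads the "processor" field (field 39 of /proc/<pid>/stat, per proc(5)), which is the CPU a task last ran on. The function names and the min_tasks threshold are hypothetical, chosen only for illustration.

```python
from collections import Counter


def last_cpu(stat_text):
    """Return the CPU a task last ran on (field 39 of /proc/<pid>/stat)."""
    # Field 2 (the command name) may contain spaces or parentheses,
    # so split on the last ')' before splitting on whitespace.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so field 39 is fields[36].
    return int(fields[36])


def cpu_histogram(pids):
    """Count how many of the given PIDs last ran on each CPU."""
    counts = Counter()
    for pid in pids:
        try:
            with open(f"/proc/{pid}/stat") as f:
                counts[last_cpu(f.read())] += 1
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            # The process exited or is not readable; skip it.
            continue
    return counts


def looks_collapsed(counts, min_tasks=2):
    """True if at least min_tasks tasks were seen and all share one CPU."""
    return sum(counts.values()) >= min_tasks and len(counts) == 1
```

Calling cpu_histogram() with the PIDs of one job's ranks every few seconds and watching whether the histogram shrinks to a single CPU would show the kind of migration described above. A single sample proves little, since the field only records where each task last ran. For threaded codes, the per-thread files under /proc/<pid>/task/<tid>/stat have the same layout.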

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf