The virtual machine running the LUV server was killed by the kernel OOM killer 4 
hours ago.  I didn't immediately notice because the VM running my Jabber 
server (which notifies me of system problems) was also killed.

When we had the last problem I converted the virtual machines from Xen to KVM.  
With KVM the VMs are regular Linux processes and they share the same memory as 
regular processes.  So if another process allocates too much RAM then it may 
cause KVM memory allocation to fail.  Also if the entire system runs out of 
RAM the kernel may stupidly decide to kill the KVM instance instead of 
something else.
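
One common way to make the kernel less likely to pick a KVM instance is to lower
its OOM score adjustment through /proc/<pid>/oom_score_adj, which accepts values
from -1000 (never kill) to 1000.  The following is only a minimal sketch of that
idea, not something that was done on this server.  The helper names and the
"qemu" process name fragment are assumptions for illustration; the process name
of a KVM guest depends on the distribution.  Lowering the value needs root.

```python
from pathlib import Path


def find_pids_by_name(name_fragment, proc_root="/proc"):
    """Return the sorted PIDs whose command name contains name_fragment."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # the process exited while we were scanning
        if name_fragment in comm:
            pids.append(int(entry.name))
    return sorted(pids)


def protect_from_oom(pid, score_adj=-1000, proc_root="/proc"):
    """Write score_adj to the oom_score_adj file of the given process."""
    if not -1000 <= score_adj <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    Path(proc_root, str(pid), "oom_score_adj").write_text(f"{score_adj}\n")


if __name__ == "__main__":
    # "qemu" is an assumed name fragment; check what your KVM processes are called.
    for pid in find_pids_by_name("qemu"):
        protect_from_oom(pid)
        print(f"set oom_score_adj=-1000 for PID {pid}")
```

The setting is per process and is lost when the VM is restarted, so it would
have to be reapplied whenever a guest is started.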

I think that part of the problem was that BOINC was configured to use up to 
90% of system RAM.  That was an OK setting for a Xen server where the Dom0 had 
nothing of note running other than BOINC and the virtual machines had RAM 
reserved.  When running KVM this wasn't a suitable setting.

I've configured BOINC to use only 40% of RAM and increased the swap size.
This shouldn't happen again.
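
As a rough sketch of what such a BOINC limit can look like (not necessarily how
it was set here): the BOINC client reads local overrides from a file named
global_prefs_override.xml in its data directory, and the RAM limits are the
ram_max_used_busy_pct and ram_max_used_idle_pct elements.  The function name
below is hypothetical, and applying the 40% limit to both the busy and idle
cases is an assumption.

```python
import xml.etree.ElementTree as ET


def build_boinc_ram_override(ram_pct):
    """Return the text of a global_prefs_override.xml that caps BOINC RAM use."""
    if not 0 < ram_pct <= 100:
        raise ValueError("ram_pct must be greater than 0 and at most 100")
    root = ET.Element("global_preferences")
    # Assumption: use the same limit whether or not the machine is in use.
    for tag in ("ram_max_used_busy_pct", "ram_max_used_idle_pct"):
        ET.SubElement(root, tag).text = f"{ram_pct:.1f}"
    return ET.tostring(root, encoding="unicode") + "\n"


if __name__ == "__main__":
    # Print the XML; save it into the BOINC data directory to use it.
    print(build_boinc_ram_override(40), end="")
```

The client has to be told to reread the file (for example with
"boinccmd --read_global_prefs_override") or be restarted before the new limit
takes effect.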

Also I'm going to move the Jabber server to the host (what was the Dom0 under 
Xen) so that if the VMs die then I can still get alerts.
