Mike Fischer <fischer+o...@lavielle.com> writes:

> I have been observing occasional bouts of high load averages on
> several servers I administer and I am trying to find the cause. (I
> monitor these machines so that I can implement corrective measures in
> case of any malicious or abnormal activity. I think this is benign,
> but I’d still like to find the cause.)
>
> Once the high load average starts, only a reboot seems to (temporarily) 
> return the values to their normal levels.
>
> The actual CPU usage (as measured by vmstat) stays low even if the load 
> average is elevated.
>
> The servers are VMs running on a VMware host (ESXi). This was seen with
> OpenBSD 7.3 and 7.4 amd64.
>
> I cannot determine anything inside the VM that causes this. There
> seems to be no correlation to pfstat(8) graphs, log entries, known
> events, or anything else I can determine. Restarting all of the rc.d
> services never made any difference.
>
> Could this be caused by something on the VMware host machine? (The
> host seems to be operating at limit regarding RAM for example. But the
> VM is only using the normal percentage of its allocated RAM — way
> below 100% and very constant usage, no swap.)
>
> How can I further debug this, keeping in mind that these are production
> machines and experimentation is limited to benign things that don’t cause
> outages?
>

Can you share a dmesg from one of the 7.4 VMs? The output of `vmstat -iz`
might help narrow it down to a stuck interrupt. Also, try running
systat(1) and observe things as they happen.
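
If it helps, here is a rough sketch of how two `vmstat -iz` snapshots
could be compared to spot an interrupt source whose counter is climbing
much faster than the rest. It assumes the usual three-column layout
(name, total, rate) and that Python is available somewhere to run it on
the saved output; the function names and the sample numbers are made up
for illustration and are not from a real machine.

```python
import re


def parse_vmstat_i(text):
    """Map interrupt source name -> cumulative total.

    Assumes each data line looks like "<name> <total> <rate>".
    Lines that do not end in two integers (header, blanks) are skipped.
    """
    totals = {}
    for line in text.splitlines():
        # Match from the right so names containing spaces still work.
        m = re.match(r"^(.*\S)\s+(\d+)\s+(\d+)\s*$", line)
        if m is None:
            continue
        totals[m.group(1).strip()] = int(m.group(2))
    return totals


def busiest_interrupts(before, after, seconds):
    """Return (name, interrupts per second) pairs, highest rate first."""
    old = parse_vmstat_i(before)
    new = parse_vmstat_i(after)
    rates = {}
    for name, total in new.items():
        if name == "Total":
            continue  # summary row, not an interrupt source
        rates[name] = (total - old.get(name, 0)) / seconds
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical snapshots taken 10 seconds apart (illustrative values only).
SAMPLE_BEFORE = """\
interrupt                       total     rate
irq0/clock                    1000000      100
irq114/em0                      50000        5
irq97/mpi0                      20000        2
Total                         1070000      107
"""

SAMPLE_AFTER = """\
interrupt                       total     rate
irq0/clock                    1001000      100
irq114/em0                      50050        5
irq97/mpi0                     520000       51
Total                         1571050      156
"""

if __name__ == "__main__":
    for name, rate in busiest_interrupts(SAMPLE_BEFORE, SAMPLE_AFTER, 10):
        print(f"{name:<24} {rate:>12.1f}/s")
```

To feed it real data, something like `vmstat -iz > before.txt`, a short
wait, then `vmstat -iz > after.txt` while the load average is elevated
should do. The rate column that vmstat prints is an average since boot,
so a source that only recently went wild can look harmless there; the
difference between two snapshots shows what is happening right now.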
