Forgive my ignorance in resurrecting this, but isn't the primary purpose of a VM to partition system resources?

Is this "noisy neighbor" problem a side effect of using VMs, or just bad VM management on the part of the host? This whole Cloud idea seems pretty pointless if a single VM is able to consume CPU time in the same way a normal process does. You might as well just give everyone a user account and let them fight over a shared pool of RAM.


Just curious whether the path forward in a situation like this is to blame the configuration or the technology itself. Would this be any different on another hypervisor, for example?


On 4/25/19 1:17 PM, VY wrote:
Hi Aaron

Thanks for confirming.  I do not yet know how to troubleshoot in a Xen
environment, but now that you have dissected the data with me (which no
one on our team has done so far), I understand the situation.

There's not much I can do, and we do not have another place to migrate
this image.
Oh well....

thanks again!

-v


On Thu, Apr 25, 2019 at 1:05 PM Aaron Burt <[email protected]> wrote:

On 2019-04-25 10:54, VY wrote:
Yes, I love to learn as well.

This is the output to lscpu:
   Architecture:          x86_64
[...]
Hypervisor vendor:     Xen
Virtualization type:   full
Ah-hah.  You're in a VM, and I'll bet you have a "noisy neighbor."

The load average is:
load average: 464.68, 415.14, 416.96
which does not make sense at all.
Loadavg is just how many processes are waiting to use the CPU.
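
To put the numbers above in perspective, here is a minimal sketch that
divides each load average by the CPU count. On Linux the figure counts
runnable tasks plus tasks in uninterruptible sleep, so anything far
above 1.0 per CPU means a long queue. The function name `load_per_cpu`
is made up for illustration; the input mimics the first three fields of
/proc/loadavg, and the CPU count of 4 is taken from this thread.

```python
def load_per_cpu(loadavg_line: str, cpu_count: int) -> tuple[float, float, float]:
    """Parse a /proc/loadavg-style line and divide each average by the CPU count."""
    one, five, fifteen = (float(x) for x in loadavg_line.split()[:3])
    return (one / cpu_count, five / cpu_count, fifteen / cpu_count)


# The 1-, 5- and 15-minute averages quoted in this thread, on a 4-CPU VM.
sample = "464.68 415.14 416.96"
per_cpu = load_per_cpu(sample, 4)
print("load per CPU (1m, 5m, 15m):", [round(v, 1) for v in per_cpu])
```

On a live system the same function could be fed the contents of
/proc/loadavg and the value of os.cpu_count().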

The rest of TOP:
   Cpu(s): 51.3%us, 16.0%sy,  0.0%ni, 32.0%id,  0.0%wa,  0.0%hi,  0.4%si,  0.2%st

If I hit 1, it affects all 4 CPUs.
All good.  Very much looks like a "noisy neighbor" problem, which is
when another VM on the hypervisor is hogging all the CPU (or RAM) and
leaving you with no compute resources.  From your VM's perspective, it's
going at the rated clock speed, but time is going by REALLY FAST.
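
For anyone wanting to pull the individual fields out of that top summary
line, here is a rough sketch. The "st" field is steal time, which top
documents as time stolen from this VM by the hypervisor, so it is the
field usually checked first when a noisy neighbor is suspected. The
helper name `parse_top_cpu_line` is hypothetical, and the regex assumes
the older "51.3%us" layout shown in this thread; newer versions of top
print the fields differently.

```python
import re


def parse_top_cpu_line(line: str) -> dict[str, float]:
    """Map each field of top's Cpu(s) summary line to its percentage."""
    # Each field looks like "51.3%us": a number, a percent sign, a short name.
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


# The summary line from this thread, joined back onto one line.
line = "Cpu(s): 51.3%us, 16.0%sy,  0.0%ni, 32.0%id,  0.0%wa,  0.0%hi,  0.4%si,  0.2%st"
fields = parse_top_cpu_line(line)
print("steal:", fields["st"], "idle:", fields["id"])
```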

Can you elaborate on why
   apicid : 25
   initial apicid : 25
25 is a weird number?  From an earlier thread, is this simply a logical
ID?
Eh, sort of.  There should only be a couple APICs in the system.  And
usually it'll be pretty consistent.  But since it's a VM all bets are
off.

All the other systems are reporting this number as 4 and all of them
are having reasonable load.
They're on a different hypervisor machine, and probably a different
version of the hypervisor software.

I do not have root access nor sudo.  I want to try and find out why
the load is so high before I escalate and argue for more privilege.
When I brought this up with the responsible team, I was given a probable
cause -- there are other activities on the server hosting this VM, and
they are causing this issue.
So they already told you that you have a noisy neighbor.  All right
then.  Don't use that VM and get by on the 3 you have, or ask the team
to migrate your slow VM to a different hypervisor machine, or ask for a
new VM on a less loaded hypervisor machine.  But unless the noisy
neighbor calms down it sounds like your one sad VM isn't getting any
better.

Is this a customer-facing service?  If so, you should point this out to
your hosting team.

Good luck,
    Aaron
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
