Re: Strange amounts of paging reported in VM at 4:00 AM.

Rob van der Heij Fri, 01 Sep 2006 05:04:20 -0700

On 9/1/06, John Summerfield <[EMAIL PROTECTED]> wrote:

> Is a dormant Linux guest entirely dormant? I expect not, I think there
> are always daemons checking for timeouts, bits of the kernel running
> round checking whether it can reassign unused memory, whether there are
> buffers to flush, whether any hardware (including power supply) has
> failed yet.


You're right, there remains a certain amount of background noise that
you could call overhead. Once you have the 100 Hz timer off, you only
go back for requested wake-up calls by kernel threads and processes. I
did some work on the 2.4 kernel to identify and minimize those. I
could get down to some 50 timer interrupts per minute, but the neat
thing is that many of those aligned to the wall clock time, so you
could have periods of almost 5 seconds without timer interrupts (iirc
the 5-second interval was bdflush).
I was a bit surprised to find that different applications implement
their own timers again, based on a steady interrupt. When I looked, nscd
was running several timers of 1 and 2 seconds and apache used a 1-second
rhythm. Sure, it's not as bad as 10 ms, but it does limit scalability in
the end. And we also found that some middleware had implemented a 50 ms
(!) timer in a Java class to run its scheduling :-(  The developer I
talked to did not even understand my concerns about that.
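
To put those timer periods side by side, here is a rough Python sketch of the arithmetic. The function names are made up for illustration; the periods are the ones mentioned above (10 ms for the 100 Hz tick, 50 ms for the Java class, 1 and 2 seconds for nscd and apache, 5 seconds for bdflush). The second function assumes that a timer aligned to the wall clock fires on whole multiples of its period, which is a simplification of what the kernel really does.

```python
def wakeups_per_minute(period_ms):
    """Wake-ups per minute caused by one periodic timer."""
    return 60_000 / period_ms


def distinct_wakeups(periods_ms, window_ms=60_000):
    """Count distinct wake-up instants in a window, assuming every timer
    fires on whole multiples of its own period (wall-clock aligned)."""
    instants = set()
    for period in periods_ms:
        instants.update(range(period, window_ms + 1, period))
    return len(instants)


if __name__ == "__main__":
    timers = [("100 Hz tick", 10), ("Java class", 50),
              ("1 s timer", 1000), ("2 s timer", 2000), ("bdflush", 5000)]
    for label, period_ms in timers:
        print(f"{label:12s} {wakeups_per_minute(period_ms):6.0f} wake-ups per minute")

    # Unaligned, the 1 s, 2 s and 5 s timers could add up to 60 + 30 + 12
    # wake-ups per minute; aligned, they share the same instants.
    print(distinct_wakeups([1000, 2000, 5000]), "distinct wake-ups when aligned")
```

The point of the second function is only that aligned timers share wake-ups rather than adding to each other, which is why alignment leaves longer quiet periods.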

In this light, the 20-minute mark does not bother me much as far as its
impact on the dormant guest goes. But there's another effect due to the
continuous writing to disk. Not only will it cause subsequent activity
to flush dirty pages, it also causes the "hot spot" to slowly migrate
through the large virtual machine and cause associated paging in VM.

It really gets bad with active agents for monitoring or performance
data collection. I once measured that the "idle" agents of a security
compliance monitor (just sitting there waiting for the server to
request a scan) used 10 times what the entire idle server used without
them.
A data gatherer using "only" 1-2% of a CPU means that 50 idle servers
take up an entire CPU (plus whatever other resources they use when they
keep the archive on the individual servers). That made some folks
monitor only the production servers. That, however, is a bad idea
because you would have no way to explain excessive usage on your
development servers. For a large installation it is not unrealistic
that you would find more than 25% of your capacity used by servers
doing "nothing" or other things that could be avoided. That's sad when
you realize that you could have used that capacity to run your
business applications.
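
The data gatherer arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and the only inputs are the figures already quoted (1-2% of a CPU per idle server, 50 servers); nothing here is measured data.

```python
def cpus_consumed(servers, percent_per_server):
    """Total CPUs used when each server's agent takes the given percentage."""
    return servers * percent_per_server / 100


def servers_per_cpu(percent_per_server):
    """How many idle servers it takes to use up one whole CPU."""
    return 100 / percent_per_server


if __name__ == "__main__":
    for percent in (1, 2):
        print(f"{percent}% per server: 50 servers use "
              f"{cpus_consumed(50, percent):.1f} CPU, "
              f"one CPU is gone after {servers_per_cpu(percent):.0f} servers")
```

At the 2% end of the range, 50 idle servers account for one full CPU; at 1% it takes 100 of them.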

Rob
--
Rob van der Heij
Velocity Software, Inc
http://velocitysoftware.com/

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
