The short answer is that I am still seeing large system time hiccups in the
guests due to kscand in the guest scanning its active lists. I do see
better response for a KVM_MAX_PTE_HISTORY of 3 than with 4. (For
completeness I also tried a history of 2, but it performed worse than 3
which is no surprise given the meaning of it.)
I have been able to scratch out a simplistic program that stimulates
kscand activity similar to what is going on in my real guest (see
attached). The program requests a memory allocation, initializes it (to
get it backed) and then in a loop sweeps through the memory in chunks
similar to a program using parts of its memory here and there but
eventually accessing all of it.
Start the RHEL3/CentOS 3 guest with *2GB* of RAM (or more). The key is
using a fair amount of highmem. Start a couple of instances of the
attached program. For example, I've been using these 2:
memuser 768M 120 5 300
memuser 384M 300 10 600
Together these instances take up a little over 1GB of RAM and once initialized
consume very little CPU. On kvm they make kscand and kswapd go nuts
every 5-15 minutes. For comparison, I do not see the same behavior for
an identical setup running on esx 3.5.
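For reference, the three numbers after the size are t1, t2 and t3 as described in the comments of the attached program. Below is a rough sketch of the arithmetic the program does with them. It is not part of the attachment, and the helper names sweep_chunk and sweep_steps are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: bytes touched per step, mirroring the
 * attached program's "incr = len / t1 * t2" (integer division). */
static long long sweep_chunk(long long len, int t1, int t2)
{
    if (t1 && t2)
        return len / t1 * t2;
    return len;
}

/* Hypothetical helper: number of memset steps in one full sweep,
 * using the same loop-exit test as the attached program. */
static int sweep_steps(long long len, long long incr)
{
    long long start = 0;
    int steps = 0;

    assert(incr > 0);
    for (;;) {
        steps++;
        start += incr;
        if (start >= len || start + incr >= len)
            return steps;
    }
}
```

With these, "memuser 768M 120 5 300" works out to roughly 32MB touched per step and 24 steps per sweep, and "memuser 384M 300 10 600" to roughly 12.8MB per step and 30 steps per sweep.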
david
Avi Kivity wrote:
> Avi Kivity wrote:
>>
>> There are (at least) three options available:
>> - detect and special-case this scenario
>> - change the flood detector to be per page table instead of per vcpu
>> - change the flood detector to look at a list of recently used page
>> tables instead of the last page table
>>
>> I'm having a hard time trying to pick between the second and third
>> options.
>>
>
> The answer turns out to be "yes", so here's a patch that adds a pte
> access history table for each shadowed guest page-table. Let me know if
> it helps. Benchmarking a variety of workloads on all guests supported
> by kvm is left as an exercise for the reader, but I suspect the patch
> will either improve things all around, or can be modified to do so.
>
/* simple program to malloc memory, initialize it, and
* then repetitively use it to keep it active.
*/
#include <sys/time.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <libgen.h>
/* goal is to sweep memory every T1 sec by accessing a
* percentage at a time and sleeping T2 sec in between accesses.
* Once all the memory has been accessed, sleep for T3 sec
* before starting the cycle over.
*/
#define T1 180
#define T2 5
#define T3 300
const char *timestamp(void);
void usage(const char *prog) {
fprintf(stderr, "\nusage: %s memlen{M|K} [t1 t2 t3]\n", prog);
}
int main(int argc, char *argv[])
{
int len;
char *endp;
int factor, endp_len;
int start, incr;
int t1 = T1, t2 = T2, t3 = T3;
char *mem;
char c = 0;
if (argc < 2) {
usage(basename(argv[0]));
return 1;
}
/*
* determine memory to request
*/
len = (int) strtol(argv[1], &endp, 0);
factor = 1;
endp_len = strlen(endp);
if ((endp_len == 1) && ((*endp == 'M') || (*endp == 'm')))
factor = 1024 * 1024;
else if ((endp_len == 1) && ((*endp == 'K') || (*endp == 'k')))
factor = 1024;
else if (endp_len) {
fprintf(stderr, "invalid memory len.\n");
return 1;
}
len *= factor;
if (len == 0) {
fprintf(stderr, "memory len is 0.\n");
return 1;
}
/*
* convert times if given
*/
if (argc > 2) {
if (argc < 5) {
usage(basename(argv[0]));
return 1;
}
t1 = atoi(argv[2]);
t2 = atoi(argv[3]);
t3 = atoi(argv[4]);
}
/*
* amount of memory to sweep at one time
*/
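if (t1 && t2)
incr = len / t1 * t2;
else
incr = len;
/* clamp so a single step never writes past the allocation (t2 > t1) */
if (incr > len)
incr = len;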
mem = (char *) malloc(len);
if (mem == NULL) {
fprintf(stderr, "malloc failed\n");
return 1;
}
printf("memory allocated. initializing to 0\n");
memset(mem, 0, len);
start = 0;
printf("%s starting memory update.\n", timestamp());
while (1) {
c++;
if (c == 0x7f) c = 0;
memset(mem + start, c, incr);
start += incr;
if ((start >= len) || ((start + incr) >= len)) {
printf("%s scan complete. sleeping %d\n",
timestamp(), t3);
start = 0;
sleep(t3);
printf("%s starting memory update.\n", timestamp());
} else if (t2)
sleep(t2);
}
return 0;
}
const char *timestamp(void)
{
static char date[64];
struct timeval now;
struct tm ltime;
memset(date, 0, sizeof(date));
if (gettimeofday(&now, NULL) == 0)
{
if (localtime_r(&now.tv_sec, &ltime))
strftime(date, sizeof(date), "%m/%d %H:%M:%S", &ltime);
}
if (strlen(date) == 0)
strcpy(date, "unknown");
return date;
}