On Mon, Feb 23, 2009 at 01:33:05AM +0100, Aurelien Jarno wrote:
> Hi,
>
> Since kvm-81, I have noticed that GNU/kFreeBSD 32-bit guests are crashing
> under high load (during a compilation for example) with the following
> error message:
>
> | Fatal trap 12: page fault while in kernel mode
> | fault virtual address = 0x4
> | fault code = supervisor read, page not present
> | instruction pointer = 0x20:0xc0a4fc00
> | stack pointer = 0x28:0xe66d7a70
> | frame pointer = 0x28:0xe66d7a80
> | code segment = base 0x0, limit 0xfffff, type 0x1b
> | = DPL 0, pres 1, def32 1, gran 1
> | processor eflags = interrupt enabled, resume, IOPL = 0
> | current process = 24037 (bash)
> | trap number = 12
> | panic: page fault
> | Uptime: 4m7s
> | Cannot dump. No dump device defined.
> | Automatic reboot in 15 seconds - press a key on the console to abort
>
> I haven't tried yet with a plain FreeBSD guest, but I also expect it to
> crash given the kernel (version 7.1) is almost the same. A closer
> investigation has shown that the following commit is causing the
> problem:
>
> | commit 6364a3918cb5c28376849e7fca3e09bd66b859f3
> | Author: Marcelo Tosatti <[email protected]>
> | Date: Mon Dec 1 22:32:04 2008 -0200
> |
> | KVM: MMU: skip global pgtables on sync due to cr3 switch
> |
> | Skip syncing global pages on cr3 switch (but not on cr4/cr0). This is
> | important for Linux 32-bit guests with PAE, where the kmap page is
> | marked as global.
> |
> | Signed-off-by: Marcelo Tosatti <[email protected]>
> | Signed-off-by: Avi Kivity <[email protected]>
>
> As expected, loading the KVM module with oos_shadow=0 works around the
> problem. Please note that the guest is running in 32-bit mode, does not
> use PAE, and uses global pages. My host has an Intel Q9450 CPU, and the
> problem appears with both a 2.6.26 and a 2.6.28 64-bit kernel.
>
> Does anybody see any problem in this patch? How can I further
> debug the problem?

Aurelien,

Maybe there is a bug in the syncing code (e.g. not all global pages are
synced when the OS requests a global sync), or FreeBSD is "relying" on
invlpg or a cr3 write to sync global pages (remember that TLB entries can
be invalidated internally by the CPU).

If you want to debug it, I would suggest looping over all MMU pages in
mmu_sync_global, after the kvm_sync_page loop, and adding:

    WARN_ON(sp->unsync && sp->global);

If that fails, check whether the unsync and global flags mean what they
are supposed to.
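
To make the invariant concrete, here is a rough userspace model of that
check. It is only a sketch, not the KVM code: struct shadow_page,
model_sync_global and count_unsync_global are hypothetical names, and the
array stands in for the real list of MMU pages. The point is just that
after a global sync, no page should be left both unsync and global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shadow page; only the two flags matter here. */
struct shadow_page {
    bool unsync;  /* shadow may be stale relative to the guest page table */
    bool global;  /* page was marked as holding global translations */
};

/* Model of a global sync: every unsync global page gets synced. */
static void model_sync_global(struct shadow_page *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pages[i].global && pages[i].unsync)
            pages[i].unsync = false;
}

/*
 * The suggested check, as a counter instead of WARN_ON: how many pages
 * are still both unsync and global. Expected to be 0 after a global sync.
 */
static size_t count_unsync_global(const struct shadow_page *pages, size_t n)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if (pages[i].unsync && pages[i].global)
            bad++;
    return bad;
}
```

In the kernel, the counter would be replaced by the WARN_ON above, so that
any leftover page shows up as a backtrace in the host log.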

Sorry for the trouble, and thanks for the detailed report. I will take a
close look at it this week.