Hi Paul,

We've just seen another KVM bug with 3.8 on p7. It looks as if for some reason 
a bolted HTAB entry for the kernel got evicted.

[   16s] booting kvm ...
[   16s] /usr/bin/qemu-system-ppc64 -no-reboot -nographic -vga none -net none 
-enable-kvm -M pseries -cpu host -kernel /boot/vmlinux -initrd /boot/initrd 
-append root=/dev/vda panic=1 quiet no-kvmclock nmi_watchdog=0 rw elevator=noop 
console=hvc0 init=/.build/build -m 3072 -drive 
file=/obs/worker/root_1/root,if=virtio,cache=unsafe -drive 
file=/obs/worker/root_1/root,if=ide,index=0,cache=unsafe -drive 
file=/obs/worker/root_1.swap,if=virtio,cache=unsafe -smp 1
[   16s] 
[   16s] 
[   16s] SLOF 
**********************************************************************
[   16s] QEMU Starting
[   16s]  Build Date = Jun 10 2013 17:00:23
[   16s]  FW Version = git-f564e52f4418d308
[   16s]  Press "s" to enter Open Firmware.

[   16s] 
[   16s] C0000
[   16s] C0100
[   17s] C0120
[   17s] C0140
[   17s] C0200
[   17s] C0201
[   17s] C0220
[   17s] C0240
[   17s] C0260
[   17s] C0270
[   17s] C02E0
[   17s] C0300
[   17s] C0320
[   17s] C0360
[   17s] C0370
[   17s] C0371
[   17s] C0372
[   17s] C0373
[   17s] C0374
[   17s] C0390
[   17s] C03F0
[   17s] C0400
[   17s] C0480
[   17s] C04C0
[   17s] C04D0
[   17s] C0500
[   17s] Populating /vdevice methods
[   17s] Populating /vdevice/vty@71000000
[   18s] Populating /vdevice/nvram@71000001
[   18s] 
[   18s] NVRAM: size=65536, fetch=200E, store=200F
[   18s] Populating /vdevice/v-scsi@71000002
[   18s] VSCSI: Initializing
[   18s] VSCSI: Looking for devices
[   18s]   8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      1.5."
[   18s] C0580
[   18s] C05A0
[   18s] Populating /pci@800000020000000
[   18s]  Adapters on 0800000020000000
[   18s]                      00 0000 (D) : 1af4 1001    virtio [ block ]
[   18s]                      00 0800 (D) : 1af4 1001    virtio [ block ]
[   18s] C0600
[   18s] C0640
[   18s] C0690
[   18s] C06A0
[   18s] C06A8
[   18s] C06B0
[   18s] C06B8
[   18s] C06C0
[   18s] C06E0
[   18s] C0700
[   18s] C0800
[   18s] C0880
[   18s] No NVRAM common partition, re-initializing...
[   18s] C0890
[   18s] C08A0
[   19s] C08A8
[   19s] C08B0
[   19s] C08C0
[   19s] C08D0
[   19s] Using default console: /vdevice/vty@71000000
[   19s] C08E0
[   19s] C08E8
[   19s] Detected RAM kernel at 400000 (16185f0 bytes) C08FF
[   19s]      
[   19s]   Welcome to Open Firmware
[   19s] 
[   19s]   Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
[   19s]   This program and the accompanying materials are made available
[   19s]   under the terms of the BSD License available at
[   19s]   http://www.opensource.org/licenses/bsd-license.php
[   19s] 
[   19s] Booting from memory...
[   19s] OF stdout device is: /vdevice/vty@71000000
[   19s] Preparing to boot Linux version 3.8.0-2-default (geeko@buildhost) (gcc 
version 4.5.0 20100414 [gcc-4_5-branch revision 158342] (SUSE Linux) ) #1 SMP 
Wed Feb 20 02:54:06 UTC 2013 (e252f7f)
[   19s] Detected machine type: 0000000000000101
[   19s] Max number of cores passed to firmware: 1024 (NR_CPUS = 1024)
[   19s] Calling ibm,client-architecture-support... not implemented
[   19s] couldn't open /packages/elf-loader
[   19s] command line: root=/dev/vda panic=1 quiet no-kvmclock nmi_watchdog=0 
rw elevator=noop console=hvc0 init=/.build/build
[   19s] memory layout at init:
[   19s]   memory_limit : 0000000000000000 (16 MB aligned)
[   19s]   alloc_bottom : 0000000001a30000
[   19s]   alloc_top    : 0000000010000000
[   19s]   alloc_top_hi : 00000000c0000000
[   19s]   rmo_top      : 0000000010000000
[   19s]   ram_top      : 00000000c0000000
[   19s] instantiating rtas at 0x000000000dbf0000... done
[   19s] Querying for OPAL presence... not there.
[   19s] boot cpu hw idx 0
[   19s] copying OF device tree...
[   19s] Building dt strings...
[   19s] Building dt structure...
[   19s] Device tree strings 0x0000000001d40000 -> 0x0000000001d40774
[   19s] Device tree struct  0x0000000001d50000 -> 0x0000000001d60000
[   19s] Calling quiesce...
[   19s] returning from prom_init
[20500s] QEMU 1.5.0 monitor - type 'help' for more information
[20500s] (qemu)
[20505s] (qemu) info registers
[20505s] NIP 0000000000000410   LR 0000000000b31240 CTR 0000000000000000 XER 
0000000000000000
[20505s] MSR 8000000000001000 HID0 0000000000000000  HF 0000000000000000 idx 1
[20505s] TB 00000000 00000000 DECR 00000000
[20505s] GPR00 0000000000b31240 c00000000128bde0 0000000001288c40 
ffffffffffffffff
[20505s] GPR04 0000000000000000 c00000000128bcc0 4001438795007015 
0000000070001194
[20505s] GPR08 0000000000000000 0000000022000088 b000000000001032 
c000000000005d00
[20505s] GPR12 8000000040001032 c00000000ff20000 0000000000000000 
0000000000000000
[20505s] GPR16 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] GPR20 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] GPR24 4000000000000000 c000000000000000 c000000000000000 
0000000000f77870
[20505s] GPR28 c000000001132ae8 00000000c0000000 c000000000000000 
c0000000015bfa48
[20505s] CR 22000088  [ E  E  -  -  -  -  L  L  ]             RES 
ffffffffffffffff
[20505s] FPR00 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR04 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR08 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR12 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR16 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR20 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR24 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPR28 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[20505s] FPSCR 0000000000000000
[20505s]  SRR0 c000000000005d00  SRR1 8000000040001032    PVR 00000000003f0201 
VRSAVE 0000000000000000
[20505s] SPRG0 0000000000000000 SPRG1 c00000000ff20000  SPRG2 c00000000ff20000  
SPRG3 0000000000000000
[20505s] SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  
SPRG7 0000000000000000
[20505s]  CFAR 0000000000000000
[20914s] (qemu) pmemsave 0x1450820 100000 log

c000000001450820 b __log_buf
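
The physical address passed to pmemsave above, and the one used with xp further down, come from stripping the kernel's linear-mapping offset from the virtual address. A minimal sketch of that arithmetic, assuming the usual 64-bit powerpc PAGE_OFFSET of 0xc000000000000000 (the helper name is made up for illustration):

```python
# Assumption: the guest kernel uses the standard ppc64 linear mapping,
# where physical = virtual - PAGE_OFFSET for kernel text and data.
PAGE_OFFSET = 0xC000000000000000


def kernel_virt_to_phys(vaddr: int) -> int:
    """Hypothetical helper: linear-map virtual address -> guest physical."""
    if vaddr < PAGE_OFFSET:
        raise ValueError(f"{vaddr:#x} is below the kernel linear mapping")
    return vaddr - PAGE_OFFSET


# __log_buf, as reported by the symbol table
log_buf_phys = kernel_virt_to_phys(0xC000000001450820)
# instruction_access_common, the address found in SRR0
handler_phys = kernel_virt_to_phys(0xC000000000005D00)

print(f"pmemsave {log_buf_phys:#x} 100000 log")  # pmemsave 0x1450820 100000 log
print(f"xp /i {handler_phys:#x}")                # xp /i 0x5d00
```

This only holds for addresses in the linear mapping; vmalloc or module addresses would need a real page table walk.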

(gdb) x /i 0xc000000000005d00
   0xc000000000005d00 <instruction_access_common>:      andi.   r10,r12,16384
(qemu) xp /i 0x5d00
   0x0000000000005d00:  andi.   r10,r12,16384
(qemu) info tlb
   SLB    ESID                    VSID
   3      0xc000000008000000      0x0000c00838795000

So for some reason QEMU can still resolve the virtual address using the guest 
HTAB, but the CPU cannot. Otherwise the guest wouldn't take a 0x400 
(instruction storage) interrupt when accessing that page.
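
For reference, the MSR and SRR1 values from the register dump can be decoded as below. This is only a sketch: the bit positions are the MSR bits as I understand them from the Power ISA (and Linux's reg.h), and the 0x40000000 status bit is assumed to be the "no matching HPTE found" flag that an instruction storage interrupt reports in SRR1.

```python
# Subset of MSR bits relevant to this dump (assumed layout, see lead-in).
MSR_BITS = {
    "SF": 1 << 63,  # 64-bit mode
    "EE": 1 << 15,  # external interrupts enabled
    "PR": 1 << 14,  # problem (user) state
    "ME": 1 << 12,  # machine check enabled
    "IR": 1 << 5,   # instruction relocation on
    "DR": 1 << 4,   # data relocation on
    "RI": 1 << 1,   # recoverable interrupt
}

# Assumed ISI status bit in SRR1: translation not found in the hash table.
SRR1_ISI_NOHPTE = 0x40000000


def decode_msr(value: int) -> list:
    """Return the names of the known MSR bits set in value."""
    return [name for name, mask in MSR_BITS.items() if value & mask]


srr1 = 0x8000000040001032  # SRR1 from the dump
msr = 0x8000000000001000   # current MSR from the dump

print(decode_msr(srr1))              # ['SF', 'ME', 'IR', 'DR', 'RI']
print(bool(srr1 & SRR1_ISI_NOHPTE))  # True
print(decode_msr(msr))               # ['SF', 'ME']
```

If that reading is right, the fetch at SRR0 faulted in kernel mode with relocation on and no HPTE found, and the CPU is now sitting in the 0x400 vector in real mode, which matches NIP 0x410.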

Keep in mind that the same machine with the same command flow runs VMs just 
fine in most cases. It also runs about 20 VMs in parallel, and this bug only 
manifests in roughly every other run.

Any idea what's going on here?


Alex
