There is no CPUID feature bit for detecting the presence of
MSR_TEMPERATURE_TARGET.  So whatever is faking/masking the hardware
should either not advertise the thermal sensor CPUID bit or implement
MSR_TEMPERATURE_TARGET.  The other alternative is catching
the fault caused by reading an unsupported MSR.
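
To illustrate that last alternative, here is a minimal userspace sketch,
not kernel code.  It assumes a hypothetical fault-tolerant reader (the
msr_read_fn callback, which is not an existing OpenBSD interface) that
returns false where a bare rdmsr would trap.  The MSR numbers and bit
masks are assumptions modeled on the names in the quoted function and
should be checked against the real headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; verify against specialreg.h and Intel's MSR listing. */
#define MSR_THERM_STATUS		0x19c
#define MSR_TEMPERATURE_TARGET		0x1a2
#define MSR_THERM_STATUS_VALID_BIT	0x80000000ULL
#define MSR_THERM_STATUS_TEMP(msr)	((int)(((msr) >> 16) & 0x7f))
#define MSR_TEMPERATURE_TARGET_LOW_BIT	0x40000000ULL

/*
 * Hypothetical fault-tolerant MSR read: returns false instead of
 * trapping when the MSR is not implemented.
 */
typedef bool (*msr_read_fn)(uint32_t msr, uint64_t *val);

/*
 * Stores the temperature in micro-kelvin in *out and returns true when
 * a valid reading exists.  On false the caller would set SENSOR_FINVALID.
 */
bool
sensor_read_ukelvin(msr_read_fn rd, int64_t *out)
{
	uint64_t msr;
	int max = 100;	/* default Tj(Max), as in the quoted code */

	assert(rd != NULL && out != NULL);

	/* A failed read keeps the default instead of taking a fault. */
	if (rd(MSR_TEMPERATURE_TARGET, &msr) &&
	    (msr & MSR_TEMPERATURE_TARGET_LOW_BIT))
		max = 85;

	if (!rd(MSR_THERM_STATUS, &msr) ||
	    !(msr & MSR_THERM_STATUS_VALID_BIT))
		return false;

	*out = (int64_t)(max - MSR_THERM_STATUS_TEMP(msr)) * 1000000 +
	    273150000;
	return true;
}

/* Simulated guest: only MSR_THERM_STATUS exists, reading 40 below Tj(Max). */
bool
fake_rd_no_target(uint32_t msr, uint64_t *val)
{
	if (msr != MSR_THERM_STATUS)
		return false;
	*val = MSR_THERM_STATUS_VALID_BIT | (40ULL << 16);
	return true;
}

/* Simulated guest: no thermal MSRs at all. */
bool
fake_rd_nothing(uint32_t msr, uint64_t *val)
{
	(void)msr;
	(void)val;
	return false;
}
```

With fake_rd_no_target the sensor still yields a reading using the
default Tj(Max); with fake_rd_nothing it is simply marked invalid.
Either way the missing MSR no longer takes the machine down.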

You should pick a different fake CPU if it can't provide
what a real Core family processor implements.  Disabling
the CPU sensor everywhere as you propose is obviously not a solution.

On Thu, Nov 15, 2012 at 03:29:41PM +0200, Mikael wrote:
> (Emailing from this address to help you avoid publishing my actual
> email address by mistake. I should be reachable at this address for
> the next few months.)
> 
> 
> Hi,
> 
> On a QEMU/KVM 1.1.2-r2 machine with the libvirt CPU setting <cpu
> mode='host-model'><model fallback='allow'/></cpu>, running on a host
> "cpu0: Intel Core i7 9xx (Nehalem Class Core i7), 2800.70 Mhz", I get the
> following startup behavior from OpenBSD 5.2 when the newly installed
> system boots for the first time. (Funnily enough, the installation CD's
> kernel boots fine. Perhaps it runs in a compat mode, though, because at
> first glance I find neither the nvram nor the mtrr row there.)
> 
>      nvram: invalid checksum
>      mtrr: Pentium Pro MTRR support
>      kernel: protection fault trap, code=0
>      Stopped at     intelcore_update_sensor+0x17:     rdmsr
>      ddb>
> 
> 
> The crash is triggered at row 132 of
> http://fxr.watson.org/fxr/source/arch/amd64/amd64/identcpu.c?v=OPENBSD .
> 
> The code was introduced in 2007 here:
> http://marc.info/?l=openbsd-tech&m=118063274617707 .
> 
> The patch at the bottom of this message works around the problem, though
> it may disable more than necessary.
> 
> Please note that this bug could more or less have been anticipated, given
> the comment above the offending function:
> 
>        115 /*
>        116  * Temperature read on the CPU is relative to the maximum
>        117  * temperature supported by the CPU, Tj(Max).
>        118  * Poorly documented, refer to:
>        119  * http://softwarecommunity.intel.com/isn/Community/
>        120  * en-US/forums/thread/30228638.aspx
>      ..
>        124  */
> 
> 
> That concludes my bug report. From reading Intel's specification of
> the RDMSR instruction, it is not clear to me whether this should be
> considered an OpenBSD bug (i.e. the fact that OpenBSD uses the
> instruction) or a QEMU/KVM bug (i.e. the fact that KVM raises a
> protection fault when it is invoked).
> 
> As a side note, there seem to be some Linux systems that run into this
> instruction; they report it in the syslog and do not crash.
> 
> Thank you!
> Mikael
> 
> 
> Refs:
> http://libvirt.org/formatdomain.html#elementsCPU
> http://code.metager.de/source/xref/libvirt/src/cpu/cpu_map.xml
> 
> 
>  --
> 
> Patch:
> 
> These are rows 125-147 of /source/arch/amd64/amd64/identcpu.c :
> 
> void
> intelcore_update_sensor(void *args)
> {
>         struct cpu_info *ci = (struct cpu_info *) args;
>         u_int64_t msr;
>         int max = 100;
> 
>         if (rdmsr(MSR_TEMPERATURE_TARGET) & MSR_TEMPERATURE_TARGET_LOW_BIT)
>                 max = 85;
> 
>         msr = rdmsr(MSR_THERM_STATUS);
>         if (msr & MSR_THERM_STATUS_VALID_BIT) {
>                 ci->ci_sensor.value = max - MSR_THERM_STATUS_TEMP(msr);
>                 /* micro degrees */
>                 ci->ci_sensor.value *= 1000000;
>                 /* kelvin */
>                 ci->ci_sensor.value += 273150000;
>                 ci->ci_sensor.flags &= ~SENSOR_FINVALID;
>         } else {
>                 ci->ci_sensor.value = 0;
>                 ci->ci_sensor.flags |= SENSOR_FINVALID;
>         }
> }
> 
> 
> 
> Change them to:
> 
> 
> void
> intelcore_update_sensor(void *args)
> {
>         struct cpu_info *ci = (struct cpu_info *) args;
>         // u_int64_t msr;     - as not to produce unused variable = error
>         // int max = 100;     - "
> 
>         // if (rdmsr(MSR_TEMPERATURE_TARGET) & MSR_TEMPERATURE_TARGET_LOW_BIT)
>         //         max = 85;
> 
>         // msr = rdmsr(MSR_THERM_STATUS);
>         // if (msr & MSR_THERM_STATUS_VALID_BIT) {
>         //         ci->ci_sensor.value = max - MSR_THERM_STATUS_TEMP(msr);
>         //         /* micro degrees */
>         //         ci->ci_sensor.value *= 1000000;
>         //         /* kelvin */
>         //         ci->ci_sensor.value += 273150000;
>         //         ci->ci_sensor.flags &= ~SENSOR_FINVALID;
>         // } else {
>                 ci->ci_sensor.value = 0;
>                 ci->ci_sensor.flags |= SENSOR_FINVALID;
>         // }
> }
> 
> 
> When running with the workaround-patched kernel:
> 
> $ cat /var/run/dmesg.boot
> OpenBSD 5.2-current (GENERIC.MP) #2: Thu Nov 15 13:05:13 CET 2012
>     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 1072680960 (1022MB)
> avail mem = 1021698048 (974MB)
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0x3ffff980 (41 entries)
> bios0: vendor Bochs version "Bochs" date 01/01/2007
> bios0: Bochs Bochs
> acpi0 at bios0: rev 0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP SSDT APIC HPET SSDT
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> acpihpet0 at acpi0: 100000000 Hz
> acpiprt0 at acpi0: bus 0 (PCI0)
> mpbios0 at bios0: Intel MP Specification 1.4
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel Core i7 9xx (Nehalem Class Core i7), 2963.42 MHz
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF
> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: apic clock running at 1479MHz
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel Core i7 9xx (Nehalem Class Core i7), 4566.49 MHz
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF
> cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
> cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> mpbios0: bus 0 is type PCI
> mpbios0: bus 1 is type ISA
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 11, 24 pins
> ioapic0: misconfigured as apic 0, remapped to apid 2
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
> pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
> pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
> wd0 at pciide0 channel 0 drive 0: <QEMU HARDDISK>
> wd0: 16-sector PIO, LBA48, 51200MB, 104857600 sectors
> atapiscsi0 at pciide0 channel 0 drive 1
> scsibus0 at atapiscsi0: 2 targets
> cd0 at scsibus0 targ 0 lun 0: <QEMU, QEMU DVD-ROM, 1.1.> ATAPI 5/cdrom removable
> wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
> cd0(pciide0:0:1): using PIO mode 4, DMA mode 2
> pciide0: channel 1 disabled (no drives)
> uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 2 int 11
> piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic 2 int 9
> iic0 at piixpm0
> iic0: addr 0x4c 48=00 words 00=0000 01=0000 02=0000 03=0000 04=0000 05=0000 06=0000 07=0000
> iic0: addr 0x4e 48=00 words 00=0000 01=0000 02=0000 03=0000 04=0000 05=0000 06=0000 07=0000
> vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> re0 at pci0 dev 3 function 0 "Realtek 8139" rev 0x20: RTL8139C+ (0x7480), apic 2 int 11, address 1c:0f:7e:70:0d:5e
> rlphy0 at re0 phy 0: RTL internal PHY
> virtio0 at pci0 dev 4 function 0 "Qumranet Virtio Memory" rev 0x00: Virtio Memory Balloon Device
> virtio0: no matching child driver; not configured
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pms0 at pckbc0 (aux slot)
> pckbc0: using irq 12 for aux slot
> wsmouse0 at pms0 mux 0
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
> fd0 at fdc0 drive 0: density unknown
> fd1 at fdc0 drive 1: density unknown
> usb0 at uhci0: USB revision 1.0
> uhub0 at usb0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
> nvram: invalid checksum
> mtrr: Pentium Pro MTRR support
> vscsi0 at root
> scsibus1 at vscsi0: 256 targets
> softraid0 at root
> scsibus2 at softraid0: 256 targets
> root on wd0a (1b840e0498b5beb4.a) swap on wd0b dump on wd0b
> clock: unknown CMOS layout
> softraid0: incorrect key or passphrase
> sd0 at scsibus2 targ 1 lun 0: <OPENBSD, SR CRYPTO, 005> SCSI2 0/direct fixed
> sd0: 49920MB, 512 bytes/sector, 102237132 sectors
