[email protected] (Philip Guenther), 2016.02.02 (Tue) 18:00 (CET):
> On Tue, 2 Feb 2016, Marcus MERIGHI wrote:
> > [email protected] (Stefan Kempf), 2016.02.01 (Mon) 19:13 (CET):
> > > Marcus MERIGHI wrote:
> > > > [email protected] (Stefan Kempf), 2016.01.30 (Sat) 10:49 (CET):
> > > > > We need to see how it looks like from within the kernel (and whether
> > > > > the illegal instruction is really raised from within sendsig()). Can 
> > > > > you
> > > > > try the diff below?
> > > > 
> > > > > You should get a kernel panic now instead of an illegal instruction
> > > > > signal if you try running ping or top. We need the output of the panic
> > > > > message and the output of the following commands:
> > > > 
> > > > ping(1), top(1) messed up the screen.
> > > > # ping 192.168.188.189                                                  
> > > > PING 192.168.188.189 (192.168.188.189): 56 data bytes
> > > > 64 bytes from 192.168.188.189: icmp_seq=0 ttl=255 time=166.533 ms
> > > > panic: sendsig 1: fxsave 0xffff800032c8a000, sp 0x7f7fff0d20b1,
> > > > fxave_size 512, savefpu_size 832, fpu_save_len 15773951, tf_rsp
> > > > 0x7f7ffffdd238, userstack 1
> > > 
> > > fpu_save_len is way too large (0xf0b0ff in hex). It should be 832 at
> > > most.  And that causes the kernel to attempt writes outside of the
> > > process stack (and/or to read beyond the saved FPU state).
> > > 
> > > Either the value we get from CPUID is strange (or we handle CPUID
> > > wrongly), or something trashes fpu_save_len.
> > 
> > Now that you mention CPUID...
> > If I switch 'Max CPUID Value Limit' to 'disabled' in the BIOS, the
> > symptom is gone. It re-appears when setting to 'enabled'. 
> 
> "Doctor, it hurts when I do this..."

And Dr. Guenther replies: Then... don't do it! 

> That BIOS option exists to support ancient OSes (Windows NT, etc) and 
> shouldn't be enabled when using OpenBSD.

I apologise for wasting everybody's time!

To find out whether this was a default 'enabled' setting I used BIOS
'load defaults settings' -> 'load optimized default? yes'
It wasn't default: Max CPUID Value Limit [Disabled]

To justify myself setting this to 'enabled'... 

The help on the BIOS option says: 'Disabled for Windows XP'. This in my
brain resolves to: 'This is a setting that must be disabled when running
windows xp'. 
Possibly I did an invalid inversion of the argument: "If it has to be
disabled for windows xp I'd better enable it for anything else."

Sorry, Marcus

> Currently we seem to assume that the presence of certain CPU features like 
> AVX implies that CPUID supports the related leaf; that BIOS option breaks 
> that assumption, resulting in the bogus fpu_save_len sizing you hit.  
> From the dmesg you posted I see it also explains the bogus mwait sizing 
> that has been reported by some others.  Your machine will perform better 
> with that option off; I guess we should add check to the code to catch 
> this sort of setup by checking the cpuid_level variable before using the 
> higher CPUID leafs.
> 
> Can you try applying the diff below, temporarily re-enable that BIOS 
> option, then report the resulting dmesg and verify that ping works 
> properly?
> 
> 
> Philip Guenther
> 
> 
> Index: i386/i386/cpu.c
> ===================================================================
> RCS file: /data/src/openbsd/src/sys/arch/i386/i386/cpu.c,v
> retrieving revision 1.70
> diff -u -p -r1.70 cpu.c
> --- i386/i386/cpu.c   27 Dec 2015 04:31:34 -0000      1.70
> +++ i386/i386/cpu.c   2 Feb 2016 16:54:09 -0000
> @@ -784,7 +784,7 @@ cpu_init_mwait(struct device *dv)
>  {
>       unsigned int smallest, largest, extensions, c_substates;
>  
> -     if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> +     if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
>               return;
>  
>       /* get the monitor granularity */
> Index: amd64/amd64/cpu.c
> ===================================================================
> RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cpu.c,v
> retrieving revision 1.94
> diff -u -p -r1.94 cpu.c
> --- amd64/amd64/cpu.c 27 Dec 2015 04:31:34 -0000      1.94
> +++ amd64/amd64/cpu.c 2 Feb 2016 16:54:30 -0000
> @@ -282,7 +282,7 @@ cpu_init_mwait(struct cpu_softc *sc)
>  {
>       unsigned int smallest, largest, extensions, c_substates;
>  
> -     if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> +     if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
>               return;
>  
>       /* get the monitor granularity */
> @@ -505,7 +505,7 @@ cpu_init(struct cpu_info *ci)
>               cr4 |= CR4_OSXSAVE;
>       lcr4(cr4);
>  
> -     if (cpu_ecxfeature & CPUIDECX_XSAVE) {
> +     if (cpu_ecxfeature & CPUIDECX_XSAVE && cpuid_level >= 0xd) {
>               u_int32_t eax, ebx, ecx, edx;
>  
>               xsave_mask = XCR0_X87 | XCR0_SSE;
> 
> 
> !DSPAM:56b0ecaf140281406920325!
> 

Reply via email to