[email protected] (Philip Guenther), 2016.02.02 (Tue) 18:00 (CET):
> On Tue, 2 Feb 2016, Marcus MERIGHI wrote:
> > [email protected] (Stefan Kempf), 2016.02.01 (Mon) 19:13 (CET):
> > > Marcus MERIGHI wrote:
> > > > [email protected] (Stefan Kempf), 2016.01.30 (Sat) 10:49 (CET):
> > > > > We need to see how it looks like from within the kernel (and whether
> > > > > the illegal instruction is really raised from within sendsig()). Can
> > > > > you
> > > > > try the diff below?
> > > >
> > > > > You should get a kernel panic now instead of an illegal instruction
> > > > > signal if you try running ping or top. We need the output of the panic
> > > > > message and the output of the following commands:
> > > >
> > > > ping(1), top(1) messed up the screen.
> > > > # ping 192.168.188.189
> > > > PING 192.168.188.189 (192.168.188.189): 56 data bytes
> > > > 64 bytes from 192.168.188.189: icmp_seq=0 ttl=255 time=166.533 ms
> > > > panic: sendsig 1: fxsave 0xffff800032c8a000, sp 0x7f7fff0d20b1,
> > > > fxave_size 512, savefpu_size 832, fpu_save_len 15773951, tf_rsp
> > > > 0x7f7ffffdd238, userstack 1
> > >
> > > fpu_save_len is way too large (0xf0b0ff in hex). It should be 832 at
> > > most. And that causes the kernel to attempt writes outside of the
> > > process stack (and/or to read beyond the saved FPU state).
> > >
> > > Either the value we get from CPUID is strange (or we handle CPUID
> > > wrongly), or something trashes fpu_save_len.
> >
> > Now that you mention CPUID...
> > If I switch 'Max CPUID Value Limit' to 'disabled' in the BIOS, the
> > symptom is gone. It re-appears when setting to 'enabled'.
>
> "Doctor, it hurts when I do this..."
And Dr. Guenther replies: Then... don't do it!
> That BIOS option exists to support ancient OSes (Windows NT, etc) and
> shouldn't be enabled when using OpenBSD.
I apologise for wasting everybody's time!
To find out whether this was a default 'enabled' setting I used BIOS
'load defaults settings' -> 'load optimized default? yes'
It wasn't default: Max CPUID Value Limit [Disabled]
To justify myself setting this to 'enabled'...
The help on the BIOS option says: 'Disabled for Windows XP'. This in my
brain resolves to: 'This is a setting that must be disabled when running
windows xp'.
Possibly I did an invalid inversion of the argument: "If it has to be
disabled for windows xp I'd better enable it for anything else."
Sorry, Marcus
> Currently we seem to assume that the presence of certain CPU features like
> AVX implies that CPUID supports the related leaf; that BIOS option breaks
> that assumption, resulting in the bogus fpu_save_len sizing you hit.
> From the dmesg you posted I see it also explains the bogus mwait sizing
> that has been reported by some others. Your machine will perform better
> with that option off; I guess we should add check to the code to catch
> this sort of setup by checking the cpuid_level variable before using the
> higher CPUID leafs.
>
> Can you try applying the diff below, temporarily re-enable that BIOS
> option, then report the resulting dmesg and verify that ping works
> properly?
>
>
> Philip Guenther
>
>
> Index: i386/i386/cpu.c
> ===================================================================
> RCS file: /data/src/openbsd/src/sys/arch/i386/i386/cpu.c,v
> retrieving revision 1.70
> diff -u -p -r1.70 cpu.c
> --- i386/i386/cpu.c 27 Dec 2015 04:31:34 -0000 1.70
> +++ i386/i386/cpu.c 2 Feb 2016 16:54:09 -0000
> @@ -784,7 +784,7 @@ cpu_init_mwait(struct device *dv)
> {
> unsigned int smallest, largest, extensions, c_substates;
>
> - if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> + if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
> return;
>
> /* get the monitor granularity */
> Index: amd64/amd64/cpu.c
> ===================================================================
> RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cpu.c,v
> retrieving revision 1.94
> diff -u -p -r1.94 cpu.c
> --- amd64/amd64/cpu.c 27 Dec 2015 04:31:34 -0000 1.94
> +++ amd64/amd64/cpu.c 2 Feb 2016 16:54:30 -0000
> @@ -282,7 +282,7 @@ cpu_init_mwait(struct cpu_softc *sc)
> {
> unsigned int smallest, largest, extensions, c_substates;
>
> - if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> + if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
> return;
>
> /* get the monitor granularity */
> @@ -505,7 +505,7 @@ cpu_init(struct cpu_info *ci)
> cr4 |= CR4_OSXSAVE;
> lcr4(cr4);
>
> - if (cpu_ecxfeature & CPUIDECX_XSAVE) {
> + if (cpu_ecxfeature & CPUIDECX_XSAVE && cpuid_level >= 0xd) {
> u_int32_t eax, ebx, ecx, edx;
>
> xsave_mask = XCR0_X87 | XCR0_SSE;
>
>
> !DSPAM:56b0ecaf140281406920325!
>