Re: kernel panic caused by virtualbox(?)

2016-08-12 Thread Konstantin Belousov
On Thu, Aug 11, 2016 at 03:22:44PM -0700, Don Lewis wrote:
> On 11 Aug, Konstantin Belousov wrote:
> > On Wed, Aug 10, 2016 at 04:47:15PM -0700, Don Lewis wrote:
> >> On 10 Aug, Jung-uk Kim wrote:
> >> > On 08/09/16 05:12 AM, Konstantin Belousov wrote:
> >> >> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
> >> >>> On  8 Aug, Konstantin Belousov wrote:
> >> >>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
> >> >>>> > On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> >> >>>> >> Reposted to -current to get some more eyes on this ...
> >> >>>> >>
> >> >>>> >> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> >> >>>> >> The host is:
> >> >>>> >>   FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> >> >>>> >> The virtualbox version is:
> >> >>>> >>   virtualbox-ose-5.0.26
> >> >>>> >>   virtualbox-ose-kmod-5.0.26_1
> >> >>>> >>
> >> >>>> >> The panic message is:
> >> >>>> >>
> >> >>>> >> panic: Unregistered use of FPU in kernel
> >> >>>> >> cpuid = 1
> >> >>>> >> KDB: stack backtrace:
> >> >>>> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >> >>>> >> 0xfe085a55d030
> >> >>>> >> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
> >> >>>> >> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
> >> >>>> >> trap() at trap+0x7ae/frame 0xfe085a55d330
> >> >>>> >> calltrap() at calltrap+0x8/frame 0xfe085a55d330
> >> >>>> >> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp =
> >> >>>> >> 0xfe085a55d430 ---
> >> >>>> >> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
> >> >>>> >> g_pLogger() at 0x8274e5c7/frame 0x3
> >> >>>> >> KDB: enter: panic
> >> >>>> >>
> >> >>>> >> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> >> >>>> >> the trigger.
> >> >>>> >>
> >> >>>> >> There are no symbols for the virtualbox kmods, possibly because I
> >> >>>> >> installed them as an upgrade using packages (built with the same source
> >> >>>> >> tree version) instead of by using PORTS_MODULES in make.conf, so ports
> >> >>>> >> kgdb didn't have anything useful to say about what happened before the
> >> >>>> >> trap.
> >> >>>> >>
> >> >>>> >> This panic is very repeatable.  I just got another one when starting the
> >> >>>> >> same VM., but this time the two calls before the trap were
> >> >>>> >> null_bug_bypass().  Hmn, that symbol is in nullfs ...
> >> >>>> >>
> >> >>>> >> I don't see this with a Windows 7 VM.
> >> >>>> >>
> >> >>>> >> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> >> >>>> >> -msoft-float -mno-aes -mno-avx
> >> >>>> Your disassemble listed fxrstor instruction that failing, or did I
> >> >>>> mis-remembered ? This is most likely some context switch code, either
> >> >>>> by virtual machine or erronously executed guest code. It is not a
> >> >>>> spontaneous use of FPU, but more likely something different. Can you
> >> >>>> confirm ?
> >> >>>>
> >> >>>> In either case, I do not remember any KBI changes around PCB layout or
> >> >>>> fpu_enter() KPI recently.
> >> >>>>
> >> >>>> >
> >> >>>> > I suspect head packages are quite likely built against the a "wrong" KBI
> >> >>>> > and are too fragile to use for kmods vs compiling from ports. :-/  I would
> >> >>>> > try a built-from-ports kmod to see if the panics go away.
> >> >>>>
> >> >>>> FWIW, I will commit the following change shortly. Since third-party
> >> >>>> modules break the invariant, either due to bugs (ndis wrappers) or
> >> >>>> possibly due to KBI breakage, it is worth to have the detection enabled
> >> >>>> for production kernels.
> >> >>>
> >> >>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
> >> >>> GENERIC kernel and the guest seemed to operate properly.  Then I 
> >> >>> enabled
> >> >>> INVARIANTS and got the panic.  I suspect that is why nobody has 
> >> >>> stumbled
> >> >>> across this before.
> >> >>>
> >> >> This is yet another reason to promote KASSERT to the full panic.
> >> >> I expect that the vbox source lacks fpu_kern_enter() calls around the
> >> >> FPU state restoration.
> >> > 
> >> > Unfortunately, the code is in MI source as it is unnecessary for
> >> > supported OSes (read: FreeBSD is not supported) and it's not easy to
> >> > inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(
> >> 
> >> It's a headache, but our ports can use patch files for that sort of
> >> thing ...
> > 
> > Note that it is, most likely, completely useless to wrap single
> > FXRSTOR instruction into the fpu_kern_enter() braces.  The purpose of
> > the instruction is to load ('legacy', as they call it, no AVX+) FPU state
> > into the machine context.  If you put fpu_kern_leave() right after
> > the instruction, the context is flushed.
> 
> Since it looks like the code is preparing to re-enter the guest, then
> calling fpu_kern_leave() doesn't make sense.
> 
> > There 

Re: kernel panic caused by virtualbox(?)

2016-08-11 Thread Don Lewis
On 11 Aug, Konstantin Belousov wrote:
> On Wed, Aug 10, 2016 at 04:47:15PM -0700, Don Lewis wrote:
>> On 10 Aug, Jung-uk Kim wrote:
>> > On 08/09/16 05:12 AM, Konstantin Belousov wrote:
>> >> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
>> >>> On  8 Aug, Konstantin Belousov wrote:
>> >>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>> >>>> > On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>> >>>> >> Reposted to -current to get some more eyes on this ...
>> >>>> >>
>> >>>> >> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>> >>>> >> The host is:
>> >>>> >>   FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>> >>>> >> The virtualbox version is:
>> >>>> >>   virtualbox-ose-5.0.26
>> >>>> >>   virtualbox-ose-kmod-5.0.26_1
>> >>>> >>
>> >>>> >> The panic message is:
>> >>>> >>
>> >>>> >> panic: Unregistered use of FPU in kernel
>> >>>> >> cpuid = 1
>> >>>> >> KDB: stack backtrace:
>> >>>> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> >>>> >> 0xfe085a55d030
>> >>>> >> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>> >>>> >> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>> >>>> >> trap() at trap+0x7ae/frame 0xfe085a55d330
>> >>>> >> calltrap() at calltrap+0x8/frame 0xfe085a55d330
>> >>>> >> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp =
>> >>>> >> 0xfe085a55d430 ---
>> >>>> >> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>> >>>> >> g_pLogger() at 0x8274e5c7/frame 0x3
>> >>>> >> KDB: enter: panic
>> >>>> >>
>> >>>> >> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>> >>>> >> the trigger.
>> >>>> >>
>> >>>> >> There are no symbols for the virtualbox kmods, possibly because I
>> >>>> >> installed them as an upgrade using packages (built with the same source
>> >>>> >> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>> >>>> >> kgdb didn't have anything useful to say about what happened before the
>> >>>> >> trap.
>> >>>> >>
>> >>>> >> This panic is very repeatable.  I just got another one when starting the
>> >>>> >> same VM., but this time the two calls before the trap were
>> >>>> >> null_bug_bypass().  Hmn, that symbol is in nullfs ...
>> >>>> >>
>> >>>> >> I don't see this with a Windows 7 VM.
>> >>>> >>
>> >>>> >> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>> >>>> >> -msoft-float -mno-aes -mno-avx
>> >>>> Your disassemble listed fxrstor instruction that failing, or did I
>> >>>> mis-remembered ? This is most likely some context switch code, either
>> >>>> by virtual machine or erronously executed guest code. It is not a
>> >>>> spontaneous use of FPU, but more likely something different. Can you
>> >>>> confirm ?
>> >>>>
>> >>>> In either case, I do not remember any KBI changes around PCB layout or
>> >>>> fpu_enter() KPI recently.
>> >>>>
>> >>>> >
>> >>>> > I suspect head packages are quite likely built against the a "wrong" KBI
>> >>>> > and are too fragile to use for kmods vs compiling from ports. :-/  I would
>> >>>> > try a built-from-ports kmod to see if the panics go away.
>> >>>>
>> >>>> FWIW, I will commit the following change shortly. Since third-party
>> >>>> modules break the invariant, either due to bugs (ndis wrappers) or
>> >>>> possibly due to KBI breakage, it is worth to have the detection enabled
>> >>>> for production kernels.
>> >>>
>> >>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
>> >>> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
>> >>> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
>> >>> across this before.
>> >>>
>> >> This is yet another reason to promote KASSERT to the full panic.
>> >> I expect that the vbox source lacks fpu_kern_enter() calls around the
>> >> FPU state restoration.
>> > 
>> > Unfortunately, the code is in MI source as it is unnecessary for
>> > supported OSes (read: FreeBSD is not supported) and it's not easy to
>> > inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(
>> 
>> It's a headache, but our ports can use patch files for that sort of
>> thing ...
> 
> Note that it is, most likely, completely useless to wrap single
> FXRSTOR instruction into the fpu_kern_enter() braces.  The purpose of
> the instruction is to load ('legacy', as they call it, no AVX+) FPU state
> into the machine context.  If you put fpu_kern_leave() right after
> the instruction, the context is flushed.

Since it looks like the code is preparing to re-enter the guest,
calling fpu_kern_leave() at that point doesn't make sense.

> There must be some larger scope where the braces do make sense.  And since
> some other OSes do require similar precautions around the in-kernel FPU
> access, I suspect that there should be some common place to put our KPI
> calls.

CPUMSetGuestXcr0() is the first stack frame.  It wouldn't seem to make
sense to call 

Re: kernel panic caused by virtualbox(?)

2016-08-11 Thread Konstantin Belousov
On Wed, Aug 10, 2016 at 04:47:15PM -0700, Don Lewis wrote:
> On 10 Aug, Jung-uk Kim wrote:
> > On 08/09/16 05:12 AM, Konstantin Belousov wrote:
> >> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
> >>> On  8 Aug, Konstantin Belousov wrote:
> >>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
> >>>> > On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> >>>> >> Reposted to -current to get some more eyes on this ...
> >>>> >>
> >>>> >> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> >>>> >> The host is:
> >>>> >>   FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> >>>> >> The virtualbox version is:
> >>>> >>   virtualbox-ose-5.0.26
> >>>> >>   virtualbox-ose-kmod-5.0.26_1
> >>>> >>
> >>>> >> The panic message is:
> >>>> >>
> >>>> >> panic: Unregistered use of FPU in kernel
> >>>> >> cpuid = 1
> >>>> >> KDB: stack backtrace:
> >>>> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >>>> >> 0xfe085a55d030
> >>>> >> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
> >>>> >> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
> >>>> >> trap() at trap+0x7ae/frame 0xfe085a55d330
> >>>> >> calltrap() at calltrap+0x8/frame 0xfe085a55d330
> >>>> >> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp =
> >>>> >> 0xfe085a55d430 ---
> >>>> >> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
> >>>> >> g_pLogger() at 0x8274e5c7/frame 0x3
> >>>> >> KDB: enter: panic
> >>>> >>
> >>>> >> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> >>>> >> the trigger.
> >>>> >>
> >>>> >> There are no symbols for the virtualbox kmods, possibly because I
> >>>> >> installed them as an upgrade using packages (built with the same source
> >>>> >> tree version) instead of by using PORTS_MODULES in make.conf, so ports
> >>>> >> kgdb didn't have anything useful to say about what happened before the
> >>>> >> trap.
> >>>> >>
> >>>> >> This panic is very repeatable.  I just got another one when starting the
> >>>> >> same VM., but this time the two calls before the trap were
> >>>> >> null_bug_bypass().  Hmn, that symbol is in nullfs ...
> >>>> >>
> >>>> >> I don't see this with a Windows 7 VM.
> >>>> >>
> >>>> >> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> >>>> >> -msoft-float -mno-aes -mno-avx
> >>>> Your disassemble listed fxrstor instruction that failing, or did I
> >>>> mis-remembered ? This is most likely some context switch code, either
> >>>> by virtual machine or erronously executed guest code. It is not a
> >>>> spontaneous use of FPU, but more likely something different. Can you
> >>>> confirm ?
> >>>>
> >>>> In either case, I do not remember any KBI changes around PCB layout or
> >>>> fpu_enter() KPI recently.
> >>>>
> >>>> >
> >>>> > I suspect head packages are quite likely built against the a "wrong" KBI
> >>>> > and are too fragile to use for kmods vs compiling from ports. :-/  I would
> >>>> > try a built-from-ports kmod to see if the panics go away.
> >>>>
> >>>> FWIW, I will commit the following change shortly. Since third-party
> >>>> modules break the invariant, either due to bugs (ndis wrappers) or
> >>>> possibly due to KBI breakage, it is worth to have the detection enabled
> >>>> for production kernels.
> >>>
> >>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
> >>> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
> >>> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
> >>> across this before.
> >>>
> >> This is yet another reason to promote KASSERT to the full panic.
> >> I expect that the vbox source lacks fpu_kern_enter() calls around the
> >> FPU state restoration.
> > 
> > Unfortunately, the code is in MI source as it is unnecessary for
> > supported OSes (read: FreeBSD is not supported) and it's not easy to
> > inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(
> 
> It's a headache, but our ports can use patch files for that sort of
> thing ...

Note that it is, most likely, completely useless to wrap a single
FXRSTOR instruction in the fpu_kern_enter() braces.  The purpose of
the instruction is to load the FPU state ('legacy', as they call it, no
AVX+) into the machine context.  If you put fpu_kern_leave() right after
the instruction, the context is flushed.

There must be some larger scope where the braces do make sense.  And since
some other OSes do require similar precautions around the in-kernel FPU
access, I suspect that there should be some common place to put our KPI
calls.
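
A rough user-space model of the point above, not kernel or vbox code.
Every name here (model_cpu, model_enter, model_leave, model_fxrstor and
the three scenario functions) is a hypothetical stand-in, and the real
fpu_kern_enter()/fpu_kern_leave() arguments are deliberately left out.
It only assumes what is said in this thread: an FPU load outside a
registered section trips the check, and leaving the section flushes
whatever was loaded inside it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model only.  These are hypothetical stand-ins, not the real
 * fpu_kern_enter()/fpu_kern_leave() KPI.
 */
struct model_cpu {
	bool registered;         /* inside an enter/leave bracket */
	bool guest_state_loaded; /* guest FPU state is live in the registers */
};

static void
model_enter(struct model_cpu *c)
{
	c->registered = true;
}

static void
model_leave(struct model_cpu *c)
{
	/* Leaving the bracket flushes whatever was loaded inside it. */
	c->registered = false;
	c->guest_state_loaded = false;
}

/* Stands in for FXRSTOR; returns false where the kernel check would fire. */
static bool
model_fxrstor(struct model_cpu *c)
{
	if (!c->registered)
		return (false);
	c->guest_state_loaded = true;
	return (true);
}

/* No bracket at all: the case reported in this thread. */
static bool
unbracketed_load_is_accepted(void)
{
	struct model_cpu c = { false, false };

	return (model_fxrstor(&c));
}

/* Bracket around the single instruction: state is gone before guest entry. */
static bool
narrow_bracket_keeps_state(void)
{
	struct model_cpu c = { false, false };

	model_enter(&c);
	(void)model_fxrstor(&c);
	model_leave(&c);
	return (c.guest_state_loaded);
}

/* Bracket around the load and the guest entry together. */
static bool
wide_bracket_keeps_state(void)
{
	struct model_cpu c = { false, false };
	bool seen_by_guest;

	model_enter(&c);
	(void)model_fxrstor(&c);
	seen_by_guest = c.guest_state_loaded;	/* guest entry would happen here */
	model_leave(&c);
	return (seen_by_guest);
}
```

In this model only the wide bracket leaves the guest state in place at
the point where the guest would run, which is why the braces would need
to go at some larger scope rather than around the one instruction.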
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kernel panic caused by virtualbox(?)

2016-08-10 Thread Don Lewis
On 10 Aug, Jung-uk Kim wrote:
> On 08/09/16 05:12 AM, Konstantin Belousov wrote:
>> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
>>> On  8 Aug, Konstantin Belousov wrote:
>>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>>>> > On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>>>> >> Reposted to -current to get some more eyes on this ...
>>>> >>
>>>> >> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>>>> >> The host is:
>>>> >>  FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>>>> >> The virtualbox version is:
>>>> >>  virtualbox-ose-5.0.26
>>>> >>  virtualbox-ose-kmod-5.0.26_1
>>>> >>
>>>> >> The panic message is:
>>>> >>
>>>> >> panic: Unregistered use of FPU in kernel
>>>> >> cpuid = 1
>>>> >> KDB: stack backtrace:
>>>> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>>> >> 0xfe085a55d030
>>>> >> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>>>> >> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>>>> >> trap() at trap+0x7ae/frame 0xfe085a55d330
>>>> >> calltrap() at calltrap+0x8/frame 0xfe085a55d330
>>>> >> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp =
>>>> >> 0xfe085a55d430 ---
>>>> >> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>>>> >> g_pLogger() at 0x8274e5c7/frame 0x3
>>>> >> KDB: enter: panic
>>>> >>
>>>> >> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>>>> >> the trigger.
>>>> >>
>>>> >> There are no symbols for the virtualbox kmods, possibly because I
>>>> >> installed them as an upgrade using packages (built with the same source
>>>> >> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>>>> >> kgdb didn't have anything useful to say about what happened before the
>>>> >> trap.
>>>> >>
>>>> >> This panic is very repeatable.  I just got another one when starting the
>>>> >> same VM., but this time the two calls before the trap were
>>>> >> null_bug_bypass().  Hmn, that symbol is in nullfs ...
>>>> >>
>>>> >> I don't see this with a Windows 7 VM.
>>>> >>
>>>> >> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>>>> >> -msoft-float -mno-aes -mno-avx
>>>> Your disassemble listed fxrstor instruction that failing, or did I
>>>> mis-remembered ? This is most likely some context switch code, either
>>>> by virtual machine or erronously executed guest code. It is not a
>>>> spontaneous use of FPU, but more likely something different. Can you
>>>> confirm ?
>>>>
>>>> In either case, I do not remember any KBI changes around PCB layout or
>>>> fpu_enter() KPI recently.
>>>>
>>>> >
>>>> > I suspect head packages are quite likely built against the a "wrong" KBI
>>>> > and are too fragile to use for kmods vs compiling from ports. :-/  I would
>>>> > try a built-from-ports kmod to see if the panics go away.
>>>>
>>>> FWIW, I will commit the following change shortly. Since third-party
>>>> modules break the invariant, either due to bugs (ndis wrappers) or
>>>> possibly due to KBI breakage, it is worth to have the detection enabled
>>>> for production kernels.
>>>
>>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
>>> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
>>> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
>>> across this before.
>>>
>> This is yet another reason to promote KASSERT to the full panic.
>> I expect that the vbox source lacks fpu_kern_enter() calls around the
>> FPU state restoration.
> 
> Unfortunately, the code is in MI source as it is unnecessary for
> supported OSes (read: FreeBSD is not supported) and it's not easy to
> inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(

It's a headache, but our ports can use patch files for that sort of
thing ...



Re: kernel panic caused by virtualbox(?)

2016-08-10 Thread Jung-uk Kim
On 08/09/16 05:12 AM, Konstantin Belousov wrote:
> On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
>> On  8 Aug, Konstantin Belousov wrote:
>>> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>>>> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>>>>> Reposted to -current to get some more eyes on this ...
>>>>>
>>>>> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>>>>> The host is:
>>>>>   FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>>>>> The virtualbox version is:
>>>>>   virtualbox-ose-5.0.26
>>>>>   virtualbox-ose-kmod-5.0.26_1
>>>>>
>>>>> The panic message is:
>>>>>
>>>>> panic: Unregistered use of FPU in kernel
>>>>> cpuid = 1
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>>>> 0xfe085a55d030
>>>>> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>>>>> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>>>>> trap() at trap+0x7ae/frame 0xfe085a55d330
>>>>> calltrap() at calltrap+0x8/frame 0xfe085a55d330
>>>>> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp =
>>>>> 0xfe085a55d430 ---
>>>>> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>>>>> g_pLogger() at 0x8274e5c7/frame 0x3
>>>>> KDB: enter: panic
>>>>>
>>>>> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>>>>> the trigger.
>>>>>
>>>>> There are no symbols for the virtualbox kmods, possibly because I
>>>>> installed them as an upgrade using packages (built with the same source
>>>>> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>>>>> kgdb didn't have anything useful to say about what happened before the
>>>>> trap.
>>>>>
>>>>> This panic is very repeatable.  I just got another one when starting the
>>>>> same VM., but this time the two calls before the trap were
>>>>> null_bug_bypass().  Hmn, that symbol is in nullfs ...
>>>>>
>>>>> I don't see this with a Windows 7 VM.
>>>>>
>>>>> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>>>>> -msoft-float -mno-aes -mno-avx
>>> Your disassemble listed fxrstor instruction that failing, or did I
>>> mis-remembered ? This is most likely some context switch code, either
>>> by virtual machine or erronously executed guest code. It is not a
>>> spontaneous use of FPU, but more likely something different. Can you
>>> confirm ?
>>>
>>> In either case, I do not remember any KBI changes around PCB layout or
>>> fpu_enter() KPI recently.
>>>
>>>>
>>>> I suspect head packages are quite likely built against the a "wrong" KBI
>>>> and are too fragile to use for kmods vs compiling from ports. :-/  I would
>>>> try a built-from-ports kmod to see if the panics go away.
>>>
>>> FWIW, I will commit the following change shortly. Since third-party
>>> modules break the invariant, either due to bugs (ndis wrappers) or
>>> possibly due to KBI breakage, it is worth to have the detection enabled
>>> for production kernels.
>>
>> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
>> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
>> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
>> across this before.
>>
> This is yet another reason to promote KASSERT to the full panic.
> I expect that the vbox source lacks fpu_kern_enter() calls around the
> FPU state restoration.

Unfortunately, the code is in MI source as it is unnecessary for
supported OSes (read: FreeBSD is not supported) and it's not easy to
inject fpu_kern_enter()/fpu_kern_leave() calls there. :-(

Jung-uk Kim





Re: kernel panic caused by virtualbox(?)

2016-08-09 Thread Konstantin Belousov
On Mon, Aug 08, 2016 at 04:44:20PM -0700, Don Lewis wrote:
> On  8 Aug, Konstantin Belousov wrote:
> > On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
> >> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> >> > Reposted to -current to get some more eyes on this ...
> >> > 
> >> > I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> >> > The host is:
> >> >  FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> >> > The virtualbox version is:
> >> >  virtualbox-ose-5.0.26
> >> >  virtualbox-ose-kmod-5.0.26_1
> >> > 
> >> > The panic message is:
> >> > 
> >> > panic: Unregistered use of FPU in kernel
> >> > cpuid = 1
> >> > KDB: stack backtrace:
> >> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
> >> > 0xfe085a55d030
> >> > vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
> >> > kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
> >> > trap() at trap+0x7ae/frame 0xfe085a55d330
> >> > calltrap() at calltrap+0x8/frame 0xfe085a55d330
> >> > --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 
> >> > 0xfe085a55d430 ---
> >> > g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
> >> > g_pLogger() at 0x8274e5c7/frame 0x3
> >> > KDB: enter: panic
> >> > 
> >> > Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> >> > the trigger.
> >> > 
> >> > There are no symbols for the virtualbox kmods, possibly because I
> >> > installed them as an upgrade using packages (built with the same source
> >> > tree version) instead of by using PORTS_MODULES in make.conf, so ports
> >> > kgdb didn't have anything useful to say about what happened before the
> >> > trap.
> >> > 
> >> > This panic is very repeatable.  I just got another one when starting the
> >> > same VM., but this time the two calls before the trap were
> >> > null_bug_bypass().  Hmn, that symbol is in nullfs ...
> >> > 
> >> > I don't see this with a Windows 7 VM.
> >> > 
> >> > All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> >> > -msoft-float -mno-aes -mno-avx
> > Your disassemble listed fxrstor instruction that failing, or did I
> > mis-remembered ? This is most likely some context switch code, either
> > by virtual machine or erronously executed guest code. It is not a
> > spontaneous use of FPU, but more likely something different. Can you
> > confirm ?
> > 
> > In either case, I do not remember any KBI changes around PCB layout or
> > fpu_enter() KPI recently.
> > 
> >> 
> >> I suspect head packages are quite likely built against the a "wrong" KBI
> >> and are too fragile to use for kmods vs compiling from ports. :-/  I would
> >> try a built-from-ports kmod to see if the panics go away.
> > 
> > FWIW, I will commit the following change shortly. Since third-party
> > modules break the invariant, either due to bugs (ndis wrappers) or
> > possibly due to KBI breakage, it is worth to have the detection enabled
> > for production kernels.
> 
> Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
> GENERIC kernel and the guest seemed to operate properly.  Then I enabled
> INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
> across this before.
> 
This is yet another reason to promote the KASSERT to a full panic.
I expect that the vbox source lacks fpu_kern_enter() calls around the
FPU state restoration.
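
To spell out the difference between the two kinds of check, here is a
toy user-space model, not the actual trap() code.  The names
(model_outcome, model_check_kassert, model_check_always) are made up for
illustration; the only assumption taken from the thread is that the
assertion-style check fires only when INVARIANTS is enabled, while a
promoted check would fire in production kernels too.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: what happens when kernel code touches the FPU. */
enum model_outcome { MODEL_CONTINUES, MODEL_PANICS };

/* Assertion-style check: only active when INVARIANTS is compiled in. */
static enum model_outcome
model_check_kassert(bool invariants, bool fpu_registered)
{
	if (invariants && !fpu_registered)
		return (MODEL_PANICS);
	return (MODEL_CONTINUES);
}

/* Promoted check: active regardless of the kernel configuration. */
static enum model_outcome
model_check_always(bool fpu_registered)
{
	return (fpu_registered ? MODEL_CONTINUES : MODEL_PANICS);
}
```

With invariants set to false and fpu_registered set to false, the first
function reports that execution simply continues, which matches the
GENERIC 10.3-STABLE observation quoted above; the second reports a panic
in the same situation.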


Re: kernel panic caused by virtualbox(?)

2016-08-08 Thread Don Lewis
On  8 Aug, Konstantin Belousov wrote:
> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>> > Reposted to -current to get some more eyes on this ...
>> > 
>> > I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>> > The host is:
>> >FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>> > The virtualbox version is:
>> >virtualbox-ose-5.0.26
>> >virtualbox-ose-kmod-5.0.26_1
>> > 
>> > The panic message is:
>> > 
>> > panic: Unregistered use of FPU in kernel
>> > cpuid = 1
>> > KDB: stack backtrace:
>> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
>> > 0xfe085a55d030
>> > vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>> > kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>> > trap() at trap+0x7ae/frame 0xfe085a55d330
>> > calltrap() at calltrap+0x8/frame 0xfe085a55d330
>> > --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 
>> > 0xfe085a55d430 ---
>> > g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>> > g_pLogger() at 0x8274e5c7/frame 0x3
>> > KDB: enter: panic
>> > 
>> > Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>> > the trigger.
>> > 
>> > There are no symbols for the virtualbox kmods, possibly because I
>> > installed them as an upgrade using packages (built with the same source
>> > tree version) instead of by using PORTS_MODULES in make.conf, so ports
>> > kgdb didn't have anything useful to say about what happened before the
>> > trap.
>> > 
>> > This panic is very repeatable.  I just got another one when starting the
>> > same VM., but this time the two calls before the trap were
>> > null_bug_bypass().  Hmn, that symbol is in nullfs ...
>> > 
>> > I don't see this with a Windows 7 VM.
>> > 
>> > All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>> > -msoft-float -mno-aes -mno-avx
> Your disassemble listed fxrstor instruction that failing, or did I
> mis-remembered ? This is most likely some context switch code, either
> by virtual machine or erronously executed guest code. It is not a
> spontaneous use of FPU, but more likely something different. Can you
> confirm ?
> 
> In either case, I do not remember any KBI changes around PCB layout or
> fpu_enter() KPI recently.
> 
>> 
>> I suspect head packages are quite likely built against the a "wrong" KBI
>> and are too fragile to use for kmods vs compiling from ports. :-/  I would
>> try a built-from-ports kmod to see if the panics go away.
> 
> FWIW, I will commit the following change shortly. Since third-party
> modules break the invariant, either due to bugs (ndis wrappers) or
> possibly due to KBI breakage, it is worth to have the detection enabled
> for production kernels.

Interesting ... I tried running virtualbox on recent 10.3-STABLE with a
GENERIC kernel and the guest seemed to operate properly.  Then I enabled
INVARIANTS and got the panic.  I suspect that is why nobody has stumbled
across this before.




Re: kernel panic caused by virtualbox(?)

2016-08-08 Thread Don Lewis
On  8 Aug, Konstantin Belousov wrote:
> On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
>> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>> > Reposted to -current to get some more eyes on this ...
>> > 
>> > I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>> > The host is:
>> >FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>> > The virtualbox version is:
>> >virtualbox-ose-5.0.26
>> >virtualbox-ose-kmod-5.0.26_1
>> > 
>> > The panic message is:
>> > 
>> > panic: Unregistered use of FPU in kernel
>> > cpuid = 1
>> > KDB: stack backtrace:
>> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
>> > 0xfe085a55d030
>> > vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>> > kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>> > trap() at trap+0x7ae/frame 0xfe085a55d330
>> > calltrap() at calltrap+0x8/frame 0xfe085a55d330
>> > --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 
>> > 0xfe085a55d430 ---
>> > g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>> > g_pLogger() at 0x8274e5c7/frame 0x3
>> > KDB: enter: panic
>> > 
>> > Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>> > the trigger.
>> > 
>> > There are no symbols for the virtualbox kmods, possibly because I
>> > installed them as an upgrade using packages (built with the same source
>> > tree version) instead of by using PORTS_MODULES in make.conf, so ports
>> > kgdb didn't have anything useful to say about what happened before the
>> > trap.
>> > 
>> > This panic is very repeatable.  I just got another one when starting the
>> > same VM., but this time the two calls before the trap were
>> > null_bug_bypass().  Hmn, that symbol is in nullfs ...
>> > 
>> > I don't see this with a Windows 7 VM.
>> > 
>> > All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>> > -msoft-float -mno-aes -mno-avx
> Your disassemble listed fxrstor instruction that failing, or did I
> mis-remembered ? This is most likely some context switch code, either
> by virtual machine or erronously executed guest code. It is not a
> spontaneous use of FPU, but more likely something different. Can you
> confirm ?

That is correct.  My first guess was that it was some guest code
incorrectly getting executed in the host context, but when I
disassembled segments of the code for the two frames leading up to the
trap, I noticed that there is a call from the first to the second, which
would only work if the code was located at these locations in the host
KVA space.  In the guest address space the code locations would likely
be totally different and the call would not work.

My next best guess is that this is some sort of trampoline code that
virtualbox stashed in some allocated memory.

As I mentioned previously, I don't have this problem if I restrict the
guest to one processor.

Another observation is that Linux guests now default to KVM
paravirtualization, whereas the other guests use other methods.  One
experiment that I will try is setting this to "none", but my
12.0-CURRENT box will be busy running poudriere for about the next 12
hours.

I also hope to try this on FreeBSD 10.3-STABLE in the next couple of
hours.



Re: kernel panic caused by virtualbox(?)

2016-08-08 Thread Don Lewis
On  8 Aug, John Baldwin wrote:
> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
>> Reposted to -current to get some more eyes on this ...
>> 
>> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
>> The host is:
>>  FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
>> The virtualbox version is:
>>  virtualbox-ose-5.0.26
>>  virtualbox-ose-kmod-5.0.26_1
>> 
>> The panic message is:
>> 
>> panic: Unregistered use of FPU in kernel
>> cpuid = 1
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a55d030
>> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
>> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
>> trap() at trap+0x7ae/frame 0xfe085a55d330
>> calltrap() at calltrap+0x8/frame 0xfe085a55d330
>> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 0xfe085a55d430 ---
>> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
>> g_pLogger() at 0x8274e5c7/frame 0x3
>> KDB: enter: panic
>> 
>> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
>> the trigger.
>> 
>> There are no symbols for the virtualbox kmods, possibly because I
>> installed them as an upgrade using packages (built with the same source
>> tree version) instead of by using PORTS_MODULES in make.conf, so ports
>> kgdb didn't have anything useful to say about what happened before the
>> trap.
>> 
>> This panic is very repeatable.  I just got another one when starting the
>> same VM, but this time the two calls before the trap were
>> null_bug_bypass().  Hmm, that symbol is in nullfs ...
>> 
>> I don't see this with a Windows 7 VM.
>> 
>> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
>> -msoft-float -mno-aes -mno-avx
> 
> I suspect head packages are quite likely built against a "wrong" KBI
> and are too fragile to use for kmods vs compiling from ports. :-/  I would
> try a built-from-ports kmod to see if the panics go away.

The poudriere jail that I used to build the package was the same source
version as the host.

I updated the host to r303762 and manually rebuilt and installed the
kmod from ports and still see the panic.  What is interesting is that if
I configure the Linux guests to use only one processor, the panic goes
away.

See  for the
latest info.

I just figured out a way to test this on recent 10.3-STABLE without
endangering my desktop machine.  I should have some results in a couple
of hours, after I bring the test machine up to date.




Re: kernel panic caused by virtualbox(?)

2016-08-08 Thread Konstantin Belousov
On Mon, Aug 08, 2016 at 10:22:44AM -0700, John Baldwin wrote:
> On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> > Reposted to -current to get some more eyes on this ...
> > 
> > I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> > The host is:
> > FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> > The virtualbox version is:
> > virtualbox-ose-5.0.26
> > virtualbox-ose-kmod-5.0.26_1
> > 
> > The panic message is:
> > 
> > panic: Unregistered use of FPU in kernel
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a55d030
> > vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
> > kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
> > trap() at trap+0x7ae/frame 0xfe085a55d330
> > calltrap() at calltrap+0x8/frame 0xfe085a55d330
> > --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 0xfe085a55d430 ---
> > g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
> > g_pLogger() at 0x8274e5c7/frame 0x3
> > KDB: enter: panic
> > 
> > Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> > the trigger.
> > 
> > There are no symbols for the virtualbox kmods, possibly because I
> > installed them as an upgrade using packages (built with the same source
> > tree version) instead of by using PORTS_MODULES in make.conf, so ports
> > kgdb didn't have anything useful to say about what happened before the
> > trap.
> > 
> > This panic is very repeatable.  I just got another one when starting the
> > same VM, but this time the two calls before the trap were
> > null_bug_bypass().  Hmm, that symbol is in nullfs ...
> > 
> > I don't see this with a Windows 7 VM.
> > 
> > All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> > -msoft-float -mno-aes -mno-avx
Your disassembly listed a failing fxrstor instruction, or did I
misremember? This is most likely some context switch code, either
from the virtual machine or erroneously executed guest code. It is not a
spontaneous use of the FPU, but more likely something different. Can you
confirm?

In either case, I do not remember any recent KBI changes around the PCB
layout or the fpu_enter() KPI.

> 
> I suspect head packages are quite likely built against a "wrong" KBI
> and are too fragile to use for kmods vs compiling from ports. :-/  I would
> try a built-from-ports kmod to see if the panics go away.

FWIW, I will commit the following change shortly. Since third-party
modules break the invariant, either due to bugs (ndis wrappers) or
possibly due to KBI breakage, it is worth having the detection enabled
in production kernels.

diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index 1b85b32..04c5dcc 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -443,8 +443,8 @@ trap(struct trapframe *frame)
 			goto out;
 
 		case T_DNA:
-			KASSERT(!PCB_USER_FPU(td->td_pcb),
-			    ("Unregistered use of FPU in kernel"));
+			if (PCB_USER_FPU(td->td_pcb))
+				panic("Unregistered use of FPU in kernel");
 			fpudna();
 			goto out;
 
diff --git a/sys/i386/i386/trap.c b/sys/i386/i386/trap.c
index 40f7204..c540a49 100644
--- a/sys/i386/i386/trap.c
+++ b/sys/i386/i386/trap.c
@@ -540,8 +540,8 @@ trap(struct trapframe *frame)
 
 		case T_DNA:
 #ifdef DEV_NPX
-			KASSERT(!PCB_USER_FPU(td->td_pcb),
-			    ("Unregistered use of FPU in kernel"));
+			if (PCB_USER_FPU(td->td_pcb))
+				panic("Unregistered use of FPU in kernel");
 			if (npxdna())
 				goto out;
 #endif
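
The practical effect of the change can be modelled in user space.
KASSERT() is only active in kernels built with the INVARIANTS option, so
the old check disappears from production kernels, while the plain
if/panic() form is always compiled in.  The following is a rough sketch
of that difference, not kernel code: SKETCH_INVARIANTS, SKETCH_KASSERT
and the sketch_* functions are hypothetical stand-ins, and the "panic"
only bumps a counter.

```c
#include <stdbool.h>

/* Counts how many times the model would have panicked. */
static int sketch_panic_count;

static void
sketch_panic(void)
{
	sketch_panic_count++;
}

/*
 * Stand-in for KASSERT(): checked only when SKETCH_INVARIANTS is
 * defined, otherwise the expression is never evaluated.
 */
#ifdef SKETCH_INVARIANTS
#define	SKETCH_KASSERT(exp)	do { if (!(exp)) sketch_panic(); } while (0)
#else
#define	SKETCH_KASSERT(exp)	do { (void)sizeof(exp); } while (0)
#endif

/* Old form: detection exists only in invariants builds. */
static void
sketch_dna_old(bool pcb_user_fpu)
{
	SKETCH_KASSERT(!pcb_user_fpu);
}

/* New form: detection is unconditional. */
static void
sketch_dna_new(bool pcb_user_fpu)
{
	if (pcb_user_fpu)
		sketch_panic();
}
```

Built without -DSKETCH_INVARIANTS, sketch_dna_old(true) passes silently
while sketch_dna_new(true) records a panic, which is the case the patch
is meant to catch.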


Re: kernel panic caused by virtualbox(?)

2016-08-08 Thread John Baldwin
On Thursday, August 04, 2016 05:10:29 PM Don Lewis wrote:
> Reposted to -current to get some more eyes on this ...
> 
> I just got a kernel panic when I started up a CentOS 7 VM in virtualbox.
> The host is:
>   FreeBSD 12.0-CURRENT #17 r302500 GENERIC amd64
> The virtualbox version is:
>   virtualbox-ose-5.0.26
>   virtualbox-ose-kmod-5.0.26_1
> 
> The panic message is:
> 
> panic: Unregistered use of FPU in kernel
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a55d030
> vpanic() at vpanic+0x182/frame 0xfe085a55d0b0
> kassert_panic() at kassert_panic+0x126/frame 0xfe085a55d120
> trap() at trap+0x7ae/frame 0xfe085a55d330
> calltrap() at calltrap+0x8/frame 0xfe085a55d330
> --- trap 0x16, rip = 0x827dd3a9, rsp = 0xfe085a55d408, rbp = 0xfe085a55d430 ---
> g_pLogger() at 0x827dd3a9/frame 0xfe085a55d430
> g_pLogger() at 0x8274e5c7/frame 0x3
> KDB: enter: panic
> 
> Since g_pLogger is a symbol in vboxdrv.ko, it looks like virtualbox is
> the trigger.
> 
> There are no symbols for the virtualbox kmods, possibly because I
> installed them as an upgrade using packages (built with the same source
> tree version) instead of by using PORTS_MODULES in make.conf, so ports
> kgdb didn't have anything useful to say about what happened before the
> trap.
> 
> This panic is very repeatable.  I just got another one when starting the
> same VM, but this time the two calls before the trap were
> null_bug_bypass().  Hmm, that symbol is in nullfs ...
> 
> I don't see this with a Windows 7 VM.
> 
> All of the virtualbox kmod files are compiled with -mno-mmx -mno-sse
> -msoft-float -mno-aes -mno-avx

I suspect head packages are quite likely built against a "wrong" KBI
and are too fragile to use for kmods vs compiling from ports. :-/  I would
try a built-from-ports kmod to see if the panics go away.

-- 
John Baldwin