RE: strange kernel crash

2015-11-11 Thread Andrew Duane
> -Original Message-
> From: owner-freebsd-hack...@freebsd.org 
> [mailto:owner-freebsd-hack...@freebsd.org] On Behalf Of Andriy Gapon
> Sent: Wednesday, November 11, 2015 3:02 AM
> To: John Baldwin 
> Cc: Hans Petter Selasky ; FreeBSD Hackers 
> ; freebsd-current@FreeBSD.org
> Subject: Re: strange kernel crash
> 
> On 10/11/2015 20:42, John Baldwin wrote:
> > On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
> >> On 09/11/2015 22:16, John Baldwin wrote:
> >>> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
> >>>> On 11/06/15 12:20, Andriy Gapon wrote:
> >>>>> Now the strange part:
> >>>>>
> >>>>> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
> >>>>> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
> >>>>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
> >>>>> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
> >>>>> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> >>>>
> >>>> Were these instructions dumped from RAM or from the kernel ELF file?
> >>>
> >>> Probably not from RAM.  You can use 'info files' in gdb to see what
> >>> is handling the address range in question (core vs executable).  x/i
> >>> in ddb would have been the "real" truth.
> >>
> >> Yes, according to the output of files it looks like gdb would read
> >> that data from the text section of the kernel file.
> >>
> >> How about libkvm?  Would kvm_read read data from the core file?
> >
> > kvm_read should only access the vmcore, yes.
> >
> >> I've written the following small program (cut down dmesg.c, actually):
> >> https://people.freebsd.org/~avg/vmcore_read.c
> >>
> >> (kgdb) disassemble /r
> >> => 0x80619a1e <+750>:   48 c7 44 24 18 00 00 00 00  movq   $0x0,0x18(%rsp)
> >>
> >> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0x80619a1e 9
> >> 48 c7 44 24 18 00 00 00 00
> >>
> >> Seems like the code is intact.
> >>
> >> P.S.
> >> 1. To correct something I said earlier, the fault is #UD, not #GP.
> >> 2. The only "suspicious" activity at the time of the crash was the 
> >> execution of a bhyve VM.
> >
> > Was the crash in the guest or the host?  #UD seems even more bizarre.
> 
> It was the host.  This is bizarre indeed.  I can think only of two 
> possibilities:
>   - new CPU erratum
>   - corrupted data somehow getting into the instruction cache, but the 
> correct data being read during the crash dump (i.e. flaky memory)

Or perhaps a missing memory sync operation somewhere.

> 
> --
> Andriy Gapon
> ___
> freebsd-hack...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: strange kernel crash

2015-11-11 Thread Andriy Gapon
On 10/11/2015 20:42, John Baldwin wrote:
> On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
>> On 09/11/2015 22:16, John Baldwin wrote:
>>> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
>>>> On 11/06/15 12:20, Andriy Gapon wrote:
>>>>> Now the strange part:
>>>>>
>>>>> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
>>>>> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
>>>>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>>>>> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>>>>> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
>>>>
>>>> Were these instructions dumped from RAM or from the kernel ELF file?
>>>
>>> Probably not from RAM.  You can use 'info files' in gdb to see what is
>>> handling the address range in question (core vs executable).  x/i in ddb
>>> would have been the "real" truth.
>>
>> Yes, according to the output of files it looks like gdb would read that data
>> from the text section of the kernel file.
>>
>> How about libkvm?  Would kvm_read read data from the core file?
> 
> kvm_read should only access the vmcore, yes.
> 
>> I've written the following small program (cut down dmesg.c, actually):
>> https://people.freebsd.org/~avg/vmcore_read.c
>>
>> (kgdb) disassemble /r
>> => 0x80619a1e <+750>:   48 c7 44 24 18 00 00 00 00  movq   $0x0,0x18(%rsp)
>>
>> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0x80619a1e 9
>> 48 c7 44 24 18 00 00 00 00
>>
>> Seems like the code is intact.
>>
>> P.S.
>> 1. To correct something I said earlier, the fault is #UD, not #GP.
>> 2. The only "suspicious" activity at the time of the crash was the execution of
>> a bhyve VM.
> 
> Was the crash in the guest or the host?  #UD seems even more bizarre.

It was the host.  This is bizarre indeed.  I can think only of two 
possibilities:
- new CPU erratum
- corrupted data somehow getting into the instruction cache, but the correct
data being read during the crash dump (i.e. flaky memory)

-- 
Andriy Gapon


Re: strange kernel crash

2015-11-10 Thread John Baldwin
On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
> On 09/11/2015 22:16, John Baldwin wrote:
> > On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
> >> On 11/06/15 12:20, Andriy Gapon wrote:
> >>> Now the strange part:
> >>>
> >>> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
> >>> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
> >>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
> >>> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
> >>> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> >>
> >> Were these instructions dumped from RAM or from the kernel ELF file?
> > 
> > Probably not from RAM.  You can use 'info files' in gdb to see what is
> > handling the address range in question (core vs executable).  x/i in ddb
> > would have been the "real" truth.
> 
> Yes, according to the output of files it looks like gdb would read that data
> from the text section of the kernel file.
> 
> How about libkvm?  Would kvm_read read data from the core file?

kvm_read should only access the vmcore, yes.

> I've written the following small program (cut down dmesg.c, actually):
> https://people.freebsd.org/~avg/vmcore_read.c
> 
> (kgdb) disassemble /r
> => 0x80619a1e <+750>:   48 c7 44 24 18 00 00 00 00  movq   $0x0,0x18(%rsp)
> 
> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0x80619a1e 9
> 48 c7 44 24 18 00 00 00 00
> 
> Seems like the code is intact.
> 
> P.S.
> 1. To correct something I said earlier, the fault is #UD, not #GP.
> 2. The only "suspicious" activity at the time of the crash was the execution of
> a bhyve VM.

Was the crash in the guest or the host?  #UD seems even more bizarre.

-- 
John Baldwin


Re: strange kernel crash

2015-11-10 Thread Andriy Gapon
On 09/11/2015 22:16, John Baldwin wrote:
> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
>> On 11/06/15 12:20, Andriy Gapon wrote:
>>> Now the strange part:
>>>
>>> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
>>> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
>>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>>> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>>> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
>>
>> Were these instructions dumped from RAM or from the kernel ELF file?
> 
> Probably not from RAM.  You can use 'info files' in gdb to see what is
> handling the address range in question (core vs executable).  x/i in ddb
> would have been the "real" truth.

Yes, according to the output of 'info files', it looks like gdb would read that
data from the text section of the kernel file.

How about libkvm?  Would kvm_read read data from the core file?

I've written the following small program (cut down dmesg.c, actually):
https://people.freebsd.org/~avg/vmcore_read.c

(kgdb) disassemble /r
=> 0x80619a1e <+750>:   48 c7 44 24 18 00 00 00 00  movq   $0x0,0x18(%rsp)

$ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0x80619a1e 9
48 c7 44 24 18 00 00 00 00

Seems like the code is intact.
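
A rough way to double-check that comparison, sketched in Python: the
snippet below compares the nine bytes shown by kgdb (read from the kernel
file) with the nine bytes returned by vmcore_read (read from the vmcore)
and splits the encoding into its fields.  The helper names parse_hex and
decode_movq_imm32 are invented for this illustration, and the decoder only
understands this one instruction form (REX.W, opcode C7 /0, SIB byte,
8-bit displacement); it is not a general disassembler.

```python
# Byte strings copied from the kgdb and vmcore_read output above.
KGDB_BYTES = "48 c7 44 24 18 00 00 00 00"
VMCORE_BYTES = "48 c7 44 24 18 00 00 00 00"


def parse_hex(text):
    """Turn a space-separated hex dump into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def decode_movq_imm32(code):
    """Decode 'movq $imm32, disp8(%rsp)' only (hypothetical helper)."""
    if len(code) != 9:
        raise ValueError("expected a 9-byte instruction")
    rex, opcode, modrm, sib = code[0], code[1], code[2], code[3]
    if rex != 0x48 or opcode != 0xC7:
        raise ValueError("not REX.W + C7")
    # ModRM: mod=01 (disp8 follows), reg=000 (/0), rm=100 (SIB follows)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if (mod, reg, rm) != (1, 0, 4):
        raise ValueError("unsupported ModRM form")
    # SIB: scale=0, index=100 (no index), base=100 (%rsp)
    if (sib >> 6, (sib >> 3) & 7, sib & 7) != (0, 4, 4):
        raise ValueError("unsupported SIB form")
    disp = int.from_bytes(code[4:5], "little", signed=True)
    imm = int.from_bytes(code[5:9], "little", signed=True)
    return {"base": "rsp", "disp": disp, "imm": imm}


core = parse_hex(VMCORE_BYTES)
print("bytes match:", core == parse_hex(KGDB_BYTES))
print(decode_movq_imm32(core))
```

If the two dumps differed, the first line would print False.  Here both
decode to a store of zero at 0x18(%rsp), which is what the disassembly
shows.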

P.S.
1. To correct something I said earlier, the fault is #UD, not #GP.
2. The only "suspicious" activity at the time of the crash was the execution of
a bhyve VM.

-- 
Andriy Gapon


Re: strange kernel crash

2015-11-09 Thread John Baldwin
On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
> On 11/06/15 12:20, Andriy Gapon wrote:
> > Now the strange part:
> >
> > 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
> > 0x80619a1a <+746>:   mov    %rbx,(%rsp)
> > => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
> > 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
> > 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> 
> Were these instructions dumped from RAM or from the kernel ELF file?

Probably not from RAM.  You can use 'info files' in gdb to see what is
handling the address range in question (core vs executable).  x/i in ddb
would have been the "real" truth.

-- 
John Baldwin


Re: strange kernel crash

2015-11-07 Thread Don Lewis
On  7 Nov, To: bdrew...@freebsd.org wrote:
> On  7 Nov, Bryan Drewery wrote:
>> On 11/7/2015 3:08 PM, Andriy Gapon wrote:
>>> On 07/11/2015 21:48, Bryan Drewery wrote:
>>>> On 11/6/15 3:20 AM, Andriy Gapon wrote:
>>>>> Unread portion of the kernel message buffer:
>>>>>
>>>>> Fatal trap 1: privileged instruction fault while in kernel mode
>>>> ...
>>>>
>>>> What SVN revision?
>>> 
>>> r289875 from October 24.
>>> 
>> 
>> I ask because I fixed an uninitialized ptr write in the kernel in
>> r290155 on October 29th. I would only expect it to happen in low memory
>> situations.
> 
> I haven't seen it since I upgraded from r290039 Tue Oct 27 to r290224
> Sat Oct 31.  I can believe that low memory would trigger it since I
> often it uses zfs and I sometimes drive it heavily into swap (using
> tmpfs) when running poudriere with parallel builds.

Wow, that sentence makes no sense at all ... change it to:

I can believe that low memory would trigger it since that machine uses
zfs and I often drive it heavily into swap (using tmpfs) when running
poudriere with parallel builds.

> The timing makes
> sense because the bug was introduced by r289279, and before r290039 I had
> been running r289123 without having any problems.
> 


Re: strange kernel crash

2015-11-07 Thread Don Lewis
On  7 Nov, Bryan Drewery wrote:
> On 11/7/2015 3:08 PM, Andriy Gapon wrote:
>> On 07/11/2015 21:48, Bryan Drewery wrote:
>>> On 11/6/15 3:20 AM, Andriy Gapon wrote:
>>>> Unread portion of the kernel message buffer:
>>>>
>>>> Fatal trap 1: privileged instruction fault while in kernel mode
>>> ...
>>>
>>> What SVN revision?
>> 
>> r289875 from October 24.
>> 
> 
> I ask because I fixed an uninitialized ptr write in the kernel in
> r290155 on October 29th. I would only expect it to happen in low memory
> situations.

I haven't seen it since I upgraded from r290039 Tue Oct 27 to r290224
Sat Oct 31.  I can believe that low memory would trigger it since I
often it uses zfs and I sometimes drive it heavily into swap (using
tmpfs) when running poudriere with parallel builds.  The timing makes
sense because the bug was introduced by r289279, and before r290039 I had
been running r289123 without having any problems.
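
The revision reasoning above can be written down as a range check.  A
minimal Python sketch, assuming revision numbers on this branch increase
monotonically and that r290155 fully fixes what r289279 introduced;
has_bug is an illustrative name, and the check says nothing about whether
this particular bug caused the crash reported at the start of the thread.

```python
BUG_INTRODUCED = 289279  # revision that introduced the bug, per the text above
BUG_FIXED = 290155       # revision carrying Bryan Drewery's fix


def has_bug(rev):
    """True if rev falls in the half-open window [introduced, fixed)."""
    return BUG_INTRODUCED <= rev < BUG_FIXED


# Revisions mentioned in the thread.
for rev in (289123, 289875, 290039, 290224):
    status = "inside" if has_bug(rev) else "outside"
    print(f"r{rev}: {status} the affected window")
```

By this check r289875 and r290039 fall inside the window, while r289123
and r290224 fall outside it.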



Re: strange kernel crash

2015-11-07 Thread Bryan Drewery
On 11/7/2015 3:08 PM, Andriy Gapon wrote:
> On 07/11/2015 21:48, Bryan Drewery wrote:
>> On 11/6/15 3:20 AM, Andriy Gapon wrote:
>>> Unread portion of the kernel message buffer:
>>>
>>> Fatal trap 1: privileged instruction fault while in kernel mode
>> ...
>>
>> What SVN revision?
> 
> r289875 from October 24.
> 

I ask because I fixed an uninitialized ptr write in the kernel in
r290155 on October 29th. I would only expect it to happen in low memory
situations.

-- 
Regards,
Bryan Drewery





Re: strange kernel crash

2015-11-07 Thread Andriy Gapon
On 07/11/2015 21:48, Bryan Drewery wrote:
> On 11/6/15 3:20 AM, Andriy Gapon wrote:
>> Unread portion of the kernel message buffer:
>>
>> Fatal trap 1: privileged instruction fault while in kernel mode
> ...
> 
> What SVN revision?

r289875 from October 24.

-- 
Andriy Gapon


Re: strange kernel crash

2015-11-07 Thread Bryan Drewery
On 11/6/15 3:20 AM, Andriy Gapon wrote:
> Unread portion of the kernel message buffer:
> 
> Fatal trap 1: privileged instruction fault while in kernel mode
...

What SVN revision?

-- 
Regards,
Bryan Drewery


Re: strange kernel crash

2015-11-06 Thread Andriy Gapon
On 06/11/2015 20:02, Hans Petter Selasky wrote:
> On 11/06/15 12:20, Andriy Gapon wrote:
>> Now the strange part:
>>
>> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
>> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> 
> Were these instructions dumped from RAM or from the kernel ELF file?

Whatever minidump and kgdb (libkvm) do for the text section.
Just in case, in addition to 'disassemble' I also did this:

(kgdb) x/i 0x80619a1e
=> 0x80619a1e <__mtx_lock_flags+750>:   movq   $0x0,0x18(%rsp)
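
The backtrace in the original report gives the faulting address as
__mtx_lock_flags+0x2ee, while kgdb prints <+750>.  A small Python sketch,
using only the addresses and offsets quoted in this thread (exactly as they
appear in this archive copy), to confirm that these describe the same
location and that every line of the listing implies the same function
start; function_base is a hypothetical helper.

```python
RIP = 0x80619a1e      # faulting address, as printed above
OFFSET_HEX = 0x2ee    # from the backtrace: __mtx_lock_flags+0x2ee
OFFSET_DEC = 750      # from kgdb: <__mtx_lock_flags+750>

# (address, offset shown by kgdb) pairs from the disassembly listing
LISTING = [
    (0x80619a18, 744),
    (0x80619a1a, 746),
    (0x80619a1e, 750),
    (0x80619a27, 759),
    (0x80619a30, 768),
]
JNE_TARGET = (0x80619a61, 817)  # jne ... <__mtx_lock_flags+817>


def function_base(addr, offset):
    """Start of the function implied by one address/offset pair."""
    return addr - offset


base = function_base(RIP, OFFSET_HEX)
consistent = all(function_base(a, o) == base for a, o in LISTING + [JNE_TARGET])

print("function start:", hex(base))
print("0x2ee == 750:", OFFSET_HEX == OFFSET_DEC)
print("listing consistent:", consistent)
# The faulting instruction spans 0x80619a1e..0x80619a26, i.e. 9 bytes.
print("instruction length:", LISTING[3][0] - LISTING[2][0])
```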


-- 
Andriy Gapon


Re: strange kernel crash

2015-11-06 Thread Hans Petter Selasky

On 11/06/15 12:20, Andriy Gapon wrote:

> Now the strange part:
>
> 0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
> 0x80619a1a <+746>:   mov    %rbx,(%rsp)
> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
> 0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
> 0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)


Were these instructions dumped from RAM or from the kernel ELF file?

--HPS


Re: strange kernel crash

2015-11-06 Thread Don Lewis
On  6 Nov, Konstantin Belousov wrote:
> On Fri, Nov 06, 2015 at 01:20:13PM +0200, Andriy Gapon wrote:
>> Unread portion of the kernel message buffer:
>> 
>> Fatal trap 1: privileged instruction fault while in kernel mode
>> cpuid = 0; apic id = 00
>> instruction pointer     = 0x20:0x80619a1e
>> stack pointer           = 0x28:0xfe04f57856f0
>> frame pointer           = 0x28:0xfe04f57857b0
>> code segment            = base 0x0, limit 0xf, type 0x1b
>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 2658 (firefox)
>> trap number             = 1
>> panic: privileged instruction fault
>> cpuid = 0
>> curthread: 0xf803270b6000
>> stack: 0xfe04f5782000 - 0xfe04f5786000
>> stack pointer: 0xfe04f5785320
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at 0x8041e86b = db_trace_self_wrapper+0x2b/frame 0xfe04f5785250
>> kdb_backtrace() at 0x80669f39 = kdb_backtrace+0x39/frame 0xfe04f5785300
>> vpanic() at 0x8063531c = vpanic+0x14c/frame 0xfe04f5785340
>> panic() at 0x80635063 = panic+0x43/frame 0xfe04f57853a0
>> trap_fatal() at 0x8081fc0f = trap_fatal+0x33f/frame 0xfe04f5785400
>> trap() at 0x8081f872 = trap+0x7d2/frame 0xfe04f5785610
>> trap_check() at 0x8081ff2a = trap_check+0x2a/frame 0xfe04f5785630
>> calltrap() at 0x80807ea0 = calltrap+0x8/frame 0xfe04f5785630
>> --- trap 0x1, rip = 0x80619a1e, rsp = 0xfe04f5785700, rbp = 0xfe04f57857b0 ---
>> __mtx_lock_flags() at 0x80619a1e = __mtx_lock_flags+0x2ee/frame 0xfe04f57857b0
>> uma_dbg_getslab() at 0x807df15c = uma_dbg_getslab+0x3c/frame 0xfe04f57857d0
>> uma_dbg_alloc() at 0x807df08d = uma_dbg_alloc+0x2d/frame 0xfe04f5785800
>> uma_zalloc_arg() at 0x807dacf1 = uma_zalloc_arg+0x4b1/frame 0xfe04f5785890
>> uma_zalloc() at 0x8068b040 = uma_zalloc+0x10/frame 0xfe04f57858a0
>> selfdalloc() at 0x8068aa12 = selfdalloc+0x22/frame 0xfe04f57858c0
>> pollscan() at 0x8068a615 = pollscan+0x95/frame 0xfe04f5785910
>> kern_poll() at 0x8068a4b1 = kern_poll+0x1f1/frame 0xfe04f5785a70
>> sys_poll() at 0x8068a2b9 = sys_poll+0x79/frame 0xfe04f5785a90
>> syscallenter() at 0x80820560 = syscallenter+0x320/frame 0xfe04f5785b00
>> amd64_syscall() at 0x8082012f = amd64_syscall+0x1f/frame 0xfe04f5785bf0
>> Xfast_syscall() at 0x8080818b = Xfast_syscall+0xfb/frame 0xfe04f5785bf0
>> --- syscall (209, FreeBSD ELF64, sys_poll), rip = 0x80146342a, rsp = 0x7fffd8e8, rbp = 0x7fffd920 ---
>> Uptime: 1d12h57m32s
>> 
>> 
>> Now the strange part:
>> 
>>    0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
>>    0x80619a1a <+746>:   mov    %rbx,(%rsp)
>> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>>    0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>>    0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
>> 
>> RSP value seems to be sane and consistent with the stack information above:
>> (kgdb) i reg
>> rax            0x4               4
>> rbx            0xf80126ea54f0    -8791145163536
>> rcx            0x8099a600        -2137414144
>> rdx            0xf803270b6000    -8782553063424
>> rsi            0x4               4
>> rdi            0xf80027f41318    -8795422715112
>> rbp            0xfe04f57857b0    0xfe04f57857b0
>> rsp            0xfe04f5785700    0xfe04f5785700
>> r8             0x809a7727        -2137360601
>> r9             0xf80126ea54f0    -8791145163536
>> r10            0x3e8             1000
>> r11            0xfe04f5785cc0    -2177725080384
>> r12            0x1               1
>> r13            0xf803270b6000    -8782553063424
>> r14            0xf80027f41318    -8795422715112
>> r15            0x0               0
>> rip            0x80619a1e        0x80619a1e <__mtx_lock_flags+750>
>> eflags         0x10246           [ PF ZF IF RF ]
>> cs             0x20              32
>> ss             0x28              40
>> ds 
>> es 
>> fs 
>> gs 
>> 
>> (kgdb) x/a $rsp
>> 0xfe04f5785700: 0xf80126ea54f0
>> (kgdb) x/a $rsp + 0x18
>> 0xfe04f5785718: 0x0
>> 
>> I have no idea what could have caused the #GP.  This is certainly not a stack
>> overflow.
> 
> This is a second report, please take a look at
> https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057975.html
> I have no idea as well.

Whatever the problem is, it appears to be hard to trigger.  I haven't
had a recurrence.


Re: strange kernel crash

2015-11-06 Thread Andriy Gapon
On 06/11/2015 17:35, Konstantin Belousov wrote:
> This is a second report, please take a look at
> https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057975.html
> I have no idea as well.

In my case it does not look like the code was overwritten, though.

-- 
Andriy Gapon


Re: strange kernel crash

2015-11-06 Thread Konstantin Belousov
On Fri, Nov 06, 2015 at 01:20:13PM +0200, Andriy Gapon wrote:
> Unread portion of the kernel message buffer:
> 
> Fatal trap 1: privileged instruction fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x20:0x80619a1e
> stack pointer           = 0x28:0xfe04f57856f0
> frame pointer           = 0x28:0xfe04f57857b0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 2658 (firefox)
> trap number             = 1
> panic: privileged instruction fault
> cpuid = 0
> curthread: 0xf803270b6000
> stack: 0xfe04f5782000 - 0xfe04f5786000
> stack pointer: 0xfe04f5785320
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0x8041e86b = db_trace_self_wrapper+0x2b/frame 0xfe04f5785250
> kdb_backtrace() at 0x80669f39 = kdb_backtrace+0x39/frame 0xfe04f5785300
> vpanic() at 0x8063531c = vpanic+0x14c/frame 0xfe04f5785340
> panic() at 0x80635063 = panic+0x43/frame 0xfe04f57853a0
> trap_fatal() at 0x8081fc0f = trap_fatal+0x33f/frame 0xfe04f5785400
> trap() at 0x8081f872 = trap+0x7d2/frame 0xfe04f5785610
> trap_check() at 0x8081ff2a = trap_check+0x2a/frame 0xfe04f5785630
> calltrap() at 0x80807ea0 = calltrap+0x8/frame 0xfe04f5785630
> --- trap 0x1, rip = 0x80619a1e, rsp = 0xfe04f5785700, rbp = 0xfe04f57857b0 ---
> __mtx_lock_flags() at 0x80619a1e = __mtx_lock_flags+0x2ee/frame 0xfe04f57857b0
> uma_dbg_getslab() at 0x807df15c = uma_dbg_getslab+0x3c/frame 0xfe04f57857d0
> uma_dbg_alloc() at 0x807df08d = uma_dbg_alloc+0x2d/frame 0xfe04f5785800
> uma_zalloc_arg() at 0x807dacf1 = uma_zalloc_arg+0x4b1/frame 0xfe04f5785890
> uma_zalloc() at 0x8068b040 = uma_zalloc+0x10/frame 0xfe04f57858a0
> selfdalloc() at 0x8068aa12 = selfdalloc+0x22/frame 0xfe04f57858c0
> pollscan() at 0x8068a615 = pollscan+0x95/frame 0xfe04f5785910
> kern_poll() at 0x8068a4b1 = kern_poll+0x1f1/frame 0xfe04f5785a70
> sys_poll() at 0x8068a2b9 = sys_poll+0x79/frame 0xfe04f5785a90
> syscallenter() at 0x80820560 = syscallenter+0x320/frame 0xfe04f5785b00
> amd64_syscall() at 0x8082012f = amd64_syscall+0x1f/frame 0xfe04f5785bf0
> Xfast_syscall() at 0x8080818b = Xfast_syscall+0xfb/frame 0xfe04f5785bf0
> --- syscall (209, FreeBSD ELF64, sys_poll), rip = 0x80146342a, rsp = 0x7fffd8e8, rbp = 0x7fffd920 ---
> Uptime: 1d12h57m32s
> 
> 
> Now the strange part:
> 
>    0x80619a18 <+744>:   jne    0x80619a61 <__mtx_lock_flags+817>
>    0x80619a1a <+746>:   mov    %rbx,(%rsp)
> => 0x80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>    0x80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>    0x80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> 
> RSP value seems to be sane and consistent with the stack information above:
> (kgdb) i reg
> rax            0x4               4
> rbx            0xf80126ea54f0    -8791145163536
> rcx            0x8099a600        -2137414144
> rdx            0xf803270b6000    -8782553063424
> rsi            0x4               4
> rdi            0xf80027f41318    -8795422715112
> rbp            0xfe04f57857b0    0xfe04f57857b0
> rsp            0xfe04f5785700    0xfe04f5785700
> r8             0x809a7727        -2137360601
> r9             0xf80126ea54f0    -8791145163536
> r10            0x3e8             1000
> r11            0xfe04f5785cc0    -2177725080384
> r12            0x1               1
> r13            0xf803270b6000    -8782553063424
> r14            0xf80027f41318    -8795422715112
> r15            0x0               0
> rip            0x80619a1e        0x80619a1e <__mtx_lock_flags+750>
> eflags         0x10246           [ PF ZF IF RF ]
> cs             0x20              32
> ss             0x28              40
> ds 
> es 
> fs 
> gs 
> 
> (kgdb) x/a $rsp
> 0xfe04f5785700: 0xf80126ea54f0
> (kgdb) x/a $rsp + 0x18
> 0xfe04f5785718: 0x0
> 
> I have no idea what could have caused the #GP.  This is certainly not a stack
> overflow.
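
The "not a stack overflow" claim in the quoted report can be checked with
simple arithmetic on the numbers it contains.  A minimal Python sketch,
assuming the "stack:" line of the panic message gives the kernel stack as
a half-open range [low, high) and using the addresses exactly as they
appear in this archive copy; in_stack is a made-up helper name.

```python
# Values copied as printed in the panic message and register dump above.
STACK_LOW = 0xfe04f5782000
STACK_HIGH = 0xfe04f5786000
RSP = 0xfe04f5785700
RBP = 0xfe04f57857b0


def in_stack(addr, low=STACK_LOW, high=STACK_HIGH):
    """True if addr lies inside the half-open range [low, high)."""
    return low <= addr < high


# The faulting instruction stores 8 bytes at 0x18(%rsp); check that slot too.
store_start = RSP + 0x18
store_end = store_start + 8

print("stack size:", hex(STACK_HIGH - STACK_LOW))
print("rsp in stack:", in_stack(RSP))
print("rbp in stack:", in_stack(RBP))
print("store in stack:", in_stack(store_start) and in_stack(store_end - 1))
# The stack grows down, so this is the room left before the low end.
print("room below rsp:", hex(RSP - STACK_LOW))
```

With these numbers both registers and the target of the store lie well
inside the thread's stack, with most of the stack still unused below rsp.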

This is a second report, please take a look at
https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057975.html
I have no idea as well.