Re: Fatal trap 18 on boot after OpenZFS import

2020-09-20 Thread Tomoaki AOKI
I forgot to mention this here.

As I already mentioned on Bugzilla, this problem is fixed as of r365894.

Thanks again, Ryan and Matthew!



-- 
Tomoaki AOKI
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Fatal trap 18 on boot after OpenZFS import

2020-09-06 Thread Tomoaki AOKI
Filed PR.
Bug 249147 - [ZFS][Panic]Fatal trap 18 on boot after OpenZFS import

 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249147



-- 
Tomoaki AOKI


Fatal trap 18 on boot after OpenZFS import

2020-09-04 Thread Tomoaki AOKI
Hi.

I'm encountering a boot failure with fatal trap 18, happening
(apparently) just before init() starts, possibly on the root remount
by the kernel or on zpool import by the rc.d script.
The last revision tried is r365316 (r364788 is the last one tried
with a clean rebuild).

The last healthy revision is r364744, just before the actual switch
to OpenZFS. amd64 on ThinkPad P52 (Core i7-8750H) w/discrete nvidia GPU.

r364751 with the diffs of r364777 and r364788 applied (so that it
builds successfully without changes unrelated to OpenZFS) fails.

Any suggestions and fixes are appreciated.


The trap screen is something like the one below (text attached),
typed up from a relatively clear photo, so there could be some typos.

This is shown just after the usual kernel startup output.
boot1.efi (as EFI/bootx64.efi on ESP) starts /boot/loader.efi
properly, and loader.efi seems to boot the kernel properly.

As even the single-user shell selection doesn't appear, loader.efi
is that of r364744. But they work even when I follow this irregular
process:

  1)Update src tree
  2)Clean obj tree
  3)buildworld
  4)etcupdate -p
  5)buildkernel
  6)installkernel
  7)shutdown to single user WITHOUT reboot  <- Irregular!
  8)installworld
  9)etcupdate
 10)rebuild src/sys-dependent ports (kmods, nvidia-driver, ...)
 11)reboot
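
The steps above can be sketched roughly as the following script. It is
an illustration only: the split into phase1/phase2, the run wrapper,
and /usr/src as the source tree location are assumptions, and steps 1,
2 and 10 are left out because they depend on the local checkout and
ports setup. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the update sequence (steps 3-9 and 11). Prints each command
# by default; set DRY_RUN=0 to actually execute them as root.
SRCDIR="${SRCDIR:-/usr/src}"    # assumed source tree location
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command, run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

phase1() {
    run make -C "$SRCDIR" buildworld
    run etcupdate -p
    run make -C "$SRCDIR" buildkernel
    run make -C "$SRCDIR" installkernel
    # Step 7: drop to single-user mode WITHOUT rebooting (the irregular part).
    run shutdown now
}

phase2() {
    # Run from the single-user shell reached at the end of phase1.
    run make -C "$SRCDIR" installworld
    run etcupdate
    run shutdown -r now
}

case "${1:-}" in
    phase1) phase1 ;;
    phase2) phase2 ;;
esac
```

Saved under a name of your choice, it would be invoked once with
phase1 and, after reaching the single-user shell, once with phase2.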

loader.efi appears to do its job, and the panic occurs after kernel
startup ends. Needless to say, rolling back to the r364744 state from
stable/12 on nvd0 fixes the issue.

Regards.

=

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0x82bfa320
stack pointer           = 0x28:0xfe00e20c6900
frame pointer           = 0x28:0xfe00e20c6960
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 27 (vdev_open)
trap number             = 18
panic: integer divide fault
cpuid = 2
time = 16
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e20c6610
vpanic() at vpanic+0x182/frame fe00e20c6660
panic() at panic+0x43/frame fe00e20c66c0
trap_fatal() at trap_fatal+0x387/frame fe00e20c6720
trap() at trap+0x8e/frame fe00e20c6830
calltrap() at calltrap+0x8/frame fe00e20c6830
--- trap 0x12, rip = 0x82bfa320, rsp = 0xfe00e20c6900, rbp = 0xfe00e20c6960 ---
zio_wait() at zio_wait+0x60/frame 0xfe00e20c6960
vdev_open() at vdev_open+0x74d/frame 0xfe00e20c69c0
vdev_open_child() at vdev_open_child+0x1e/frame 0xfe00e20c69e0
taskq_run() at taskq_run+0x1f/frame 0xfe00e20c6a00
taskqueue_run_locked() at taskqueue_run_locked+0x181/frame 0xfe00e20c6a80
taskqueue_thread_loop() at taskqueue_thread_loop+0x118/frame 0xfe00e20c6ab0
fork_exit() at fork_exit+0x7d/frame 0xfe00e20c6af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e20c6af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 27 tid 100570 ]
Stopped at      kdb_enter+0x37: movq    $0,0x1091556(%rip)
db> 

=

Additional info:
 *A clean build with CPUTYPE removed from the command line and
  make.conf (so it should be equivalent to nocona) didn't help.

 *A clean build with the WITH_KERNEL_RETPOLINE line and the
  WITH_RETPOLINE line in src.conf commented out didn't help.

 *The combination of the above two didn't help either (at r364788).

 *There are two root pools on different physical drives:
  stable/12 on nvd0 (primary) and head on ada0 (secondary).

 *GENERIC-NODEBUG based kernel (with options CAM_IOSCHED_DYNAMIC
  added).
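
To double-check which of the build knobs mentioned above are still
active, something like the following helper could be used. It is a
sketch only: the function name is made up for illustration, and
/etc/make.conf and /etc/src.conf are assumed to be the files in use.

```shell
#!/bin/sh
# Hypothetical helper: print uncommented CPUTYPE / RETPOLINE settings
# as "file:line:text". Defaults to the usual FreeBSD config locations.
list_build_knobs() {
    [ "$#" -gt 0 ] || set -- /etc/make.conf /etc/src.conf
    for f in "$@"; do
        [ -r "$f" ] || continue
        # The extra /dev/null operand makes grep prefix the file name.
        grep -nE '^[[:space:]]*(CPUTYPE|WITH_KERNEL_RETPOLINE|WITH_RETPOLINE)' \
            "$f" /dev/null
    done
    return 0
}
```

Calling list_build_knobs with no arguments checks the default files;
empty output means none of these knobs is set there.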

-- 
Tomoaki AOKI


Fatal_trap_18_on_head_after_r364744.log
Description: Binary data