Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
> On Nov 27, 2020, at 1:47 PM, Bakul Shah wrote:
>
>> On Nov 27, 2020, at 9:09 AM, Rebecca Cran wrote:
>>
>> On 11/27/20 4:29 AM, Hans Petter Selasky wrote:
>>>
>>> Is the problem always triggered by hald? If you disable hald in rc.conf,
>>> does the system run for a longer period of time?
>>
>> It turns out that disabling ntpd let the system run for a longer period of
>> time - until I ran "sysctl sys" at which point I got a panic.
>>
>> And this time the panic actually implicates amdgpu.ko, which is an
>> improvement:
>>
>> #9 0x in ?? ()
>> #10 0x82a14c4e in amdgpu_device_get_pcie_replay_count ()
>>    from /boot/modules/amdgpu.ko
>> #11 0x82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
>> #12 0x80c06cc1 in sysctl_root_handler_locked (oid=0xfe02133ff000,
>>    arg1=0xfe016e360980, arg2=-8724518803888, req=0xfe016e360980,
>>    tracker=0xf81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
>> #13 0x80c0610c in sysctl_root (oidp=,
>>    arg1=0xf810aa27e650, arg2=-2100190360, req=0xfe016e360980)
>>    at /usr/src/sys/kern/kern_sysctl.c:2211
>>
>> Since it _is_ a problem in amdgpu, I'll stop this thread and re-post on
>> freebsd-x11.
>
> FWIW, I am using amdgpu on a Ryzen 5 3500U system on a couple days old
> -current (r368025). "sysctl sys" complains about "unknown oid 'sys'".
> I am running hald & ntpd. I had a few amdgpu related panics initially
> but they vanished once I added
>     PORTS_MODULES=graphics/drm-devel-kmod
> to /etc/src.conf to compile it along with the kernel. I am running
> GENERIC-NODEBUG. The machine gets rebooted when I install a new kernel
> (usually once a week).
>
> My guess is some weird interaction rather than something in amdgpu.

To get sysctl sys working I compiled a GENERIC kernel from today's
368108 revision and so far there are no problems.

$ sysctl sys.device.drmn0.pcie_replay_count
sys.device.drmn0.pcie_replay_count: 0

sysctl -a also works.
Last commit log on drm-devel-kmod (the last item may be what you're
running into):

Author: manu
Date: Mon Nov 9 13:37:12 2020 +

    drm-current-kmod/drm-devel-kmod: Update to latest version

    - Use acpi code from base (thanks to wulf@)
    - Add radeon/i386 patches (thanks to tilj@)
    - Translate O_ flags for linuxulator (thanks to Greg V)
    - Lot of linuxkpi cleanup
    - Hack for amdgpu when the IP isn't init properly, this happens on one of
      my laptop with a dGPU. We still don't support it but we don't panic
      when we load amdgpu
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
> On Nov 27, 2020, at 9:09 AM, Rebecca Cran wrote:
>
> On 11/27/20 4:29 AM, Hans Petter Selasky wrote:
>>
>> Is the problem always triggered by hald? If you disable hald in rc.conf,
>> does the system run for a longer period of time?
>
> It turns out that disabling ntpd let the system run for a longer period of
> time - until I ran "sysctl sys" at which point I got a panic.
>
> And this time the panic actually implicates amdgpu.ko, which is an
> improvement:
>
> #9 0x in ?? ()
> #10 0x82a14c4e in amdgpu_device_get_pcie_replay_count ()
>    from /boot/modules/amdgpu.ko
> #11 0x82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
> #12 0x80c06cc1 in sysctl_root_handler_locked (oid=0xfe02133ff000,
>    arg1=0xfe016e360980, arg2=-8724518803888, req=0xfe016e360980,
>    tracker=0xf81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
> #13 0x80c0610c in sysctl_root (oidp=,
>    arg1=0xf810aa27e650, arg2=-2100190360, req=0xfe016e360980)
>    at /usr/src/sys/kern/kern_sysctl.c:2211
>
> Since it _is_ a problem in amdgpu, I'll stop this thread and re-post on
> freebsd-x11.

FWIW, I am using amdgpu on a Ryzen 5 3500U system on a couple days old
-current (r368025). "sysctl sys" complains about "unknown oid 'sys'".
I am running hald & ntpd. I had a few amdgpu related panics initially
but they vanished once I added
    PORTS_MODULES=graphics/drm-devel-kmod
to /etc/src.conf to compile it along with the kernel. I am running
GENERIC-NODEBUG. The machine gets rebooted when I install a new kernel
(usually once a week).

My guess is some weird interaction rather than something in amdgpu.
Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
On 11/27/20 11:10 AM, Konstantin Belousov wrote:

> And what is the instruction at 0x81002dcf ?

I got a much clearer panic by running "sysctl sys" which shows it's more
likely a problem for the amdgpu folks and not an underlying FreeBSD problem.

#7 0x810295cd in trap (frame=0xfe016e360760)
    at /usr/src/sys/amd64/amd64/trap.c:398
#8
#9 0x in ?? ()
#10 0x82a14c4e in amdgpu_device_get_pcie_replay_count ()
    from /boot/modules/amdgpu.ko
#11 0x82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
#12 0x80c06cc1 in sysctl_root_handler_locked (oid=0xfe02133ff000,
    arg1=0xfe016e360980, arg2=-8724518803888, req=0xfe016e360980,
    tracker=0xf81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
#13 0x80c0610c in sysctl_root (oidp=,
    arg1=0xf810aa27e650, arg2=-2100190360, req=0xfe016e360980)
    at /usr/src/sys/kern/kern_sysctl.c:2211
#14 0x80c06783 in userland_sysctl (td=0xfe00f00b6100,
    name=0xfe016e360a40, namelen=4, old=, oldlenp=, inkernel=,
    new=0x0, newlen=0, retval=0xfe016e360aa8, flags=0)
    at /usr/src/sys/kern/kern_sysctl.c:2368
#15 0x80c065cf in sys___sysctl (td=0xfe00f00b6100, uap=0xfe00f00b64e8)
    at /usr/src/sys/kern/kern_sysctl.c:2241
#16 0x8102a81c in syscallenter (td=0xfe00f00b6100)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#17 amd64_syscall (td=0xfe00f00b6100, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1156
#18
#19 0x0008003819ca in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffb618
(kgdb)
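The claim that this backtrace implicates amdgpu.ko rests on which frames kgdb attributes to a loaded module ("from /boot/modules/..."). As a rough illustration only, here is a minimal Python sketch that picks those frames out of kgdb output. The names FRAME_RE, BACKTRACE and frames_in_module are hypothetical helpers, not part of kgdb or any FreeBSD tool, and the sample frames are copied from the message above with argument lists shortened.

```python
import re

# Hypothetical helper: matches one kgdb frame line such as
#   "#10 0x82a14c4e in func () from /boot/modules/amdgpu.ko"
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+"               # frame number
    r"(?:0x[0-9a-f]*\s+in\s+)?"        # optional address (may be truncated)
    r"(?P<func>\S+)\s+\(.*\)"          # function name and argument list
    r"(?:\s+from\s+(?P<module>\S+))?"  # set only for frames inside a module
)

# Frames from the message above; argument lists shortened for brevity.
BACKTRACE = """\
#9 0x in ?? ()
#10 0x82a14c4e in amdgpu_device_get_pcie_replay_count () from /boot/modules/amdgpu.ko
#11 0x82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
#12 0x80c06cc1 in sysctl_root_handler_locked (oid=0xfe02133ff000) at /usr/src/sys/kern/kern_sysctl.c:184
#13 0x80c0610c in sysctl_root (oidp=) at /usr/src/sys/kern/kern_sysctl.c:2211
"""


def frames_in_module(backtrace, module_name):
    """Return (frame number, function) pairs that kgdb attributes to module_name."""
    hits = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group("module") and m.group("module").endswith("/" + module_name):
            hits.append((int(m.group("num")), m.group("func")))
    return hits


if __name__ == "__main__":
    for num, func in frames_in_module(BACKTRACE, "amdgpu.ko"):
        print(f"frame #{num}: {func}")
```

With the frames above, the two frames directly below the faulting frame #9 are the ones reported from amdgpu.ko, which is the basis for moving the discussion to freebsd-x11.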
Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
On Thu, Nov 26, 2020 at 10:09:24PM -0700, Rebecca Cran wrote:
> I have a Threadripper 2990WX system that I recently installed an AMD Radeon
> Pro W5700 into. It runs fine unless I load the amdgpu driver, at which point
> it panics several seconds after boot: I have enough time to login and run a
> few commands, but even if I just leave it it'll panic. I'm running:
>
> FreeBSD photon.int.bluestop.org 13.0-CURRENT FreeBSD 13.0-CURRENT #0
> 6db1a3e8098-c273171(master): Thu Nov 26 01:26:17 MST 2020
> bc...@photon.int.bluestop.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
> amd64
>
> I rebuilt the drm-current-kmod-5.4.62.g20201109_1 port today.
>
> The panic is:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 24; apic id = 18
> instruction pointer     = 0x20:0x81002dcf
> stack pointer           = 0x0:0xfe016e6ffaa0
> frame pointer           = 0x0:0xfe016e6ffaa0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 4372 (hald)
> trap number             = 9
> panic: general protection fault
> cpuid = 24
> time = 1606450595
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe016e6ff7b0
> vpanic() at vpanic+0x181/frame 0xfe016e6ff800
> panic() at panic+0x43/frame 0xfe016e6ff860
> trap_fatal() at trap_fatal+0x387/frame 0xfe016e6ff8c0
> trap() at trap+0x8e/frame 0xfe016e6ff9d0
> calltrap() at calltrap+0x8/frame 0xfe016e6ff9d0
> --- trap 0x9, rip = 0x81002dcf, rsp = 0xfe016e6ffaa0, rbp = 0xfe016e6ffaa0 ---
> fpurestore_xrstor3264() at fpurestore_xrstor3264+0x2f/frame 0xfe016e6ffaa0
> restore_fpu_curthread() at restore_fpu_curthread+0x85/frame 0xfe016e6ffac0
> fpudna() at fpudna+0x3a/frame 0xfe016e6ffae0
> trap() at trap+0x246/frame 0xfe016e6ffbf0
> calltrap() at calltrap+0x8/frame 0xfe016e6ffbf0
> --- trap 0x16, rip = 0x80067137f, rsp = 0x7fffd8b0, rbp = 0x7fffd8f0 ---
> Uptime: 1m4s
> Dumping 4193 out of 130894 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> I've uploaded details (core.txt, dmesg.txt etc.) to
> https://bsdio.com/freebsd/crashes/2020-11-26-amdgpu/ and the vmcore file is
> available on request.

And what is the instruction at 0x81002dcf ?
Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
On 11/27/20 4:29 AM, Hans Petter Selasky wrote:

> Is the problem always triggered by hald? If you disable hald in rc.conf,
> does the system run for a longer period of time?

It turns out that disabling ntpd let the system run for a longer period of
time - until I ran "sysctl sys" at which point I got a panic.

And this time the panic actually implicates amdgpu.ko, which is an
improvement:

#9 0x in ?? ()
#10 0x82a14c4e in amdgpu_device_get_pcie_replay_count ()
    from /boot/modules/amdgpu.ko
#11 0x82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
#12 0x80c06cc1 in sysctl_root_handler_locked (oid=0xfe02133ff000,
    arg1=0xfe016e360980, arg2=-8724518803888, req=0xfe016e360980,
    tracker=0xf81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
#13 0x80c0610c in sysctl_root (oidp=,
    arg1=0xf810aa27e650, arg2=-2100190360, req=0xfe016e360980)
    at /usr/src/sys/kern/kern_sysctl.c:2211

Since it _is_ a problem in amdgpu, I'll stop this thread and re-post on
freebsd-x11.

--
Rebecca Cran
Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
On 11/27/20 6:09 AM, Rebecca Cran wrote:
> I have a Threadripper 2990WX system that I recently installed an AMD Radeon
> Pro W5700 into. It runs fine unless I load the amdgpu driver, at which point
> it panics several seconds after boot: I have enough time to login and run a
> few commands, but even if I just leave it it'll panic. I'm running:
>
> FreeBSD photon.int.bluestop.org 13.0-CURRENT FreeBSD 13.0-CURRENT #0
> 6db1a3e8098-c273171(master): Thu Nov 26 01:26:17 MST 2020
> bc...@photon.int.bluestop.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
> amd64
>
> I rebuilt the drm-current-kmod-5.4.62.g20201109_1 port today.
>
> The panic is:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 24; apic id = 18
> instruction pointer     = 0x20:0x81002dcf
> stack pointer           = 0x0:0xfe016e6ffaa0
> frame pointer           = 0x0:0xfe016e6ffaa0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 4372 (hald)
> trap number             = 9
> panic: general protection fault
> cpuid = 24
> time = 1606450595
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe016e6ff7b0
> vpanic() at vpanic+0x181/frame 0xfe016e6ff800
> panic() at panic+0x43/frame 0xfe016e6ff860
> trap_fatal() at trap_fatal+0x387/frame 0xfe016e6ff8c0
> trap() at trap+0x8e/frame 0xfe016e6ff9d0
> calltrap() at calltrap+0x8/frame 0xfe016e6ff9d0
> --- trap 0x9, rip = 0x81002dcf, rsp = 0xfe016e6ffaa0, rbp = 0xfe016e6ffaa0 ---
> fpurestore_xrstor3264() at fpurestore_xrstor3264+0x2f/frame 0xfe016e6ffaa0
> restore_fpu_curthread() at restore_fpu_curthread+0x85/frame 0xfe016e6ffac0
> fpudna() at fpudna+0x3a/frame 0xfe016e6ffae0
> trap() at trap+0x246/frame 0xfe016e6ffbf0
> calltrap() at calltrap+0x8/frame 0xfe016e6ffbf0
> --- trap 0x16, rip = 0x80067137f, rsp = 0x7fffd8b0, rbp = 0x7fffd8f0 ---
> Uptime: 1m4s
> Dumping 4193 out of 130894 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> I've uploaded details (core.txt, dmesg.txt etc.) to
> https://bsdio.com/freebsd/crashes/2020-11-26-amdgpu/ and the vmcore file is
> available on request.

Hi,

Is the problem always triggered by hald? If you disable hald in rc.conf,
does the system run for a longer period of time?

--HPS
panic shortly after boot when amdgpu.ko is loaded (fpu?)
I have a Threadripper 2990WX system that I recently installed an AMD Radeon
Pro W5700 into. It runs fine unless I load the amdgpu driver, at which point
it panics several seconds after boot: I have enough time to login and run a
few commands, but even if I just leave it it'll panic. I'm running:

FreeBSD photon.int.bluestop.org 13.0-CURRENT FreeBSD 13.0-CURRENT #0
6db1a3e8098-c273171(master): Thu Nov 26 01:26:17 MST 2020
bc...@photon.int.bluestop.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
amd64

I rebuilt the drm-current-kmod-5.4.62.g20201109_1 port today.

The panic is:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 24; apic id = 18
instruction pointer     = 0x20:0x81002dcf
stack pointer           = 0x0:0xfe016e6ffaa0
frame pointer           = 0x0:0xfe016e6ffaa0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 4372 (hald)
trap number             = 9
panic: general protection fault
cpuid = 24
time = 1606450595
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe016e6ff7b0
vpanic() at vpanic+0x181/frame 0xfe016e6ff800
panic() at panic+0x43/frame 0xfe016e6ff860
trap_fatal() at trap_fatal+0x387/frame 0xfe016e6ff8c0
trap() at trap+0x8e/frame 0xfe016e6ff9d0
calltrap() at calltrap+0x8/frame 0xfe016e6ff9d0
--- trap 0x9, rip = 0x81002dcf, rsp = 0xfe016e6ffaa0, rbp = 0xfe016e6ffaa0 ---
fpurestore_xrstor3264() at fpurestore_xrstor3264+0x2f/frame 0xfe016e6ffaa0
restore_fpu_curthread() at restore_fpu_curthread+0x85/frame 0xfe016e6ffac0
fpudna() at fpudna+0x3a/frame 0xfe016e6ffae0
trap() at trap+0x246/frame 0xfe016e6ffbf0
calltrap() at calltrap+0x8/frame 0xfe016e6ffbf0
--- trap 0x16, rip = 0x80067137f, rsp = 0x7fffd8b0, rbp = 0x7fffd8f0 ---
Uptime: 1m4s
Dumping 4193 out of 130894 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

I've uploaded details (core.txt, dmesg.txt etc.) to
https://bsdio.com/freebsd/crashes/2020-11-26-amdgpu/ and the vmcore file is
available on request.

--
Rebecca Cran
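The "(fpu?)" in the subject follows from the two "--- trap ..." markers in the KDB backtrace: the inner trap (0x9) was taken while the kernel was handling the outer one (0x16), by way of fpudna() and restore_fpu_curthread(). A minimal Python sketch of that reading follows. The names TRAP_NAMES, TRAP_RE, PANIC_EXCERPT and decode_traps are hypothetical, and the three trap codes listed are an assumption based on sys/x86/include/trap.h as I recall it; check them against your source tree before relying on them.

```python
import re

# Assumed subset of amd64 trap codes (believed to match sys/x86/include/trap.h;
# verify against the tree you are running).
TRAP_NAMES = {
    9: "T_PROTFLT (general protection fault)",
    12: "T_PAGEFLT (page fault)",
    22: "T_DNA (device not available: FPU state has to be restored)",
}

# Matches the trap markers KDB prints between stack frames.
TRAP_RE = re.compile(r"--- trap (0x[0-9a-f]+),")

# Excerpt of the KDB backtrace from the report above.
PANIC_EXCERPT = """\
--- trap 0x9, rip = 0x81002dcf, rsp = 0xfe016e6ffaa0, rbp = 0xfe016e6ffaa0 ---
fpurestore_xrstor3264() at fpurestore_xrstor3264+0x2f/frame 0xfe016e6ffaa0
restore_fpu_curthread() at restore_fpu_curthread+0x85/frame 0xfe016e6ffac0
fpudna() at fpudna+0x3a/frame 0xfe016e6ffae0
--- trap 0x16, rip = 0x80067137f, rsp = 0x7fffd8b0, rbp = 0x7fffd8f0 ---
"""


def decode_traps(text):
    """Return (code, name) for each trap marker, innermost trap first."""
    codes = [int(code, 16) for code in TRAP_RE.findall(text)]
    return [(code, TRAP_NAMES.get(code, "unknown")) for code in codes]


if __name__ == "__main__":
    for code, name in decode_traps(PANIC_EXCERPT):
        print(f"trap {code:#x} = {code}: {name}")
```

Read this way, hald touched the FPU in userland, the kernel took a device-not-available trap to restore the thread's FPU state, and the restore itself faulted. That describes where the fault surfaced; it does not by itself say what corrupted the saved state.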