Re: Sun Ultra 45: Kernel Panic (corrupted stack end detected inside scheduler) with 5.19
I'm also experiencing this problem with the debian-12.0.0-sparc64-NETINST-1.iso from 2023-05-16 on a SPARC T4-2 (sun4v). Logs are as below; a Debian 11 image from last year works fine (unfortunately with an unknown burn date).

> Loading ...
>
> [ 2.259595] niu 0001:0a:00.0: can't ioremap BAR 0: [mem size 0x0100 64bit]
> [ 2.273835] niu 0001:0a:00.0: Cannot map device registers, aborting
> [ 2.288904] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [ 2.304269] CPU: 0 PID: 92 Comm: (udev-worker) Not tainted 6.1.0-9-sparc64 #1 Debian 6.1.27-1
> [ 2.321462] Call Trace:
> [ 2.326321] [<00caaf50>] dump_stack+0x8/0x18
> [ 2.336221] [<00ca0dd8>] panic+0xec/0x344
> [ 2.345591] [<00cacdc4>] switch_to_pc+0x4ac/0x4c8
> [ 2.356363] [<00cad0f4>] __cond_resched+0x34/0x60
> [ 2.367134] [<006914a8>] __kmem_cache_alloc_node+0x468/0x520
> [ 2.379801] [<00635660>] kmalloc_trace+0x20/0xa0
> [ 2.390401] [<100f48e4>] usb_control_msg+0x24/0x120 [usbcore]
> [ 2.403257] [<100e7304>] hub_power_on+0x64/0x180 [usbcore]
> [ 2.415582] [<100e7e8c>] hub_activate+0x7ac/0x920 [usbcore]
> [ 2.428078] [<100ef500>] hub_probe+0xf60/0xfc0 [usbcore]
> [ 2.440062] [<100f994c>] usb_probe_interface+0x14c/0x340 [usbcore]
> [ 2.453789] [<00994190>] really_probe+0x290/0x440
> [ 2.464542] [<009943cc>] __driver_probe_device+0x8c/0x180
> [ 2.476697] [<009944e8>] driver_probe_device+0x28/0xe0
> [ 2.488340] [<00994c98>] __device_attach_driver+0x98/0x120
> [ 2.500665] [<0099188c>] bus_for_each_drv+0x6c/0xc0
> [ 2.511894] Press Stop-A (L1-A) from sun keyboard or send break
> [ 2.511894] twice on console to return to the boot prom
> [ 2.534001] ---[ end Kernel panic - not syncing: corrupted stack end detected inside scheduler ]---
Re: Sun Ultra 45: Kernel Panic (corrupted stack end detected inside scheduler) with 5.19
(Since I didn't hit reply-all on the previous one; whoops, I'm not so good at mailing lists.)

Thanks for the advice. The 5.16 CD is stable enough to get through the installation (aside from a video text bug); however, on rebooting it blows up. I was able to console install it using the same trick as with the U10 (setting the output and input devices to TTYA). I'm going to post what I did here so anyone else with an UltraSPARC III/T1-based machine can get "something" going.

I initially tried the 4.19 CD; however, after grabbing the new GPG key and doing an upgrade once the OS was installed, it blew up a few minutes in. Namely, the upgrade failed really early on and I was unable to sudo, and if I logged out I was unable to log in as well. It didn't even let me enter a password; it simply told me that it was incorrect.

So what I did was use the 5.16 CD; then, when I was asked the tasksel question, I used Ctrl-A and 2 to move to the shell, chrooted /target, used busybox wget on the kernel you linked, and installed it. This gave me a working system with the 4.19 kernel.

For some reason or another, though, networking did not want to work on 4.19, so I mounted the CD and installed the deb and udebs from there (and the kernel image from https://snapshot.debian.org/archive/debian-ports/20190622T024525Z/pool-sparc64/main/l/linux/), and networking finally worked. I'd install that too, or the other udebs for the one you linked, instead of just using the kernel.

Now it's rock solid and I can do my weird SPARC experiments. Thanks so much.

> On 10/13/2022 4:14 AM EDT Frank Scheiner wrote:
>
> Hi Jake,
>
> On 13.10.22 07:13, j...@pawlicker.com j...@pawlicker.com wrote:
> > I've also been able to confirm that this happens with Kernel 5.16 or at
> > least similar bugs do such as Unable to handle kernel NULL pointer
> > dereference, programs such as postgresql break dramatically, and another
> > time SSH panicked the system with a kernel unaligned access. This
> > happened during apt-get:
> > [...]
>
> Try with kernel 5.9.x, or maybe better already use 4.19.x on UltraSPARC
> IIIi which works OK most of the time AFAIR. You can get those from
> snapshot.debian.org (e.g. [1] or [2]).
>
> [1]:
> http://snapshot.debian.org/archive/debian-ports/20190719T183113Z/pool-sparc64/main/l/linux/linux-image-4.19.0-5-sparc64_4.19.37-6_sparc64.deb
>
> [2]:
> http://snapshot.debian.org/archive/debian-ports/20190719T183113Z/pool-sparc64/main/l/linux/linux-image-4.19.0-5-sparc64-smp_4.19.37-6_sparc64.deb
>
> ...but unsure if your system will run stable enough to successfully
> finish the installation. Alternatively try to reinstall with an older
> ISO and work from there:
>
> * with 5.9.0-4:
> https://cdimage.debian.org/cdimage/ports/snapshots/2020-12-03/debian-10.0.0-sparc64-NETINST-1.iso
>
> * with 4.19.0-5:
> https://cdimage.debian.org/cdimage/ports/snapshots/2019-06-26/debian-10.0-sparc64-NETINST-1.iso
>
> There seems to be a problem with UltraSPARC T1s and I strongly believe
> this or another problem also affects UltraSPARC III(i)s. I have tested a
> variety of processors here:
>
> https://lists.debian.org/debian-sparc/2021/12/msg4.html
>
> For more details on this/these issue(s) see:
>
> https://lists.debian.org/debian-sparc/2021/03/msg00045.html
>
> ...and:
>
> https://lists.debian.org/debian-sparc/2022/02/msg0.html
>
> Cheers,
> Frank
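
For anyone following along, the kernel swap described in this message could look roughly like the sketch below. It is not the exact command history. The choice of the SMP image from link [2], the /tmp download location, and the DRY_RUN switch with its run helper are all assumptions made for illustration. It is meant for the installer's shell once the base system has been unpacked into /target; with DRY_RUN left at its default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: swap in the 4.19 kernel from the installer shell.
# Assumption: the SMP image from link [2]; link [1] is the non-SMP variant.
SNAPSHOT=http://snapshot.debian.org/archive/debian-ports/20190719T183113Z
DEB=linux-image-4.19.0-5-sparc64-smp_4.19.37-6_sparc64.deb
URL="$SNAPSHOT/pool-sparc64/main/l/linux/$DEB"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Download into the target filesystem using the installer's busybox wget.
run busybox wget -O "/target/tmp/$DEB" "$URL"

# Install the package inside the freshly installed system.
run chroot /target dpkg -i "/tmp/$DEB"
```

Setting DRY_RUN=0 runs the commands for real. The sketch downloads from the installer side into /target/tmp, whereas the message apparently did the download after chrooting; either order should end with the same package in place, assuming the kernel package's dependencies are already installed in /target.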
Re: Sun Ultra 45: Kernel Panic (corrupted stack end detected inside scheduler) with 5.19
Hi Jake,

On 13.10.22 07:13, j...@pawlicker.com j...@pawlicker.com wrote:
> I've also been able to confirm that this happens with Kernel 5.16 or at
> least similar bugs do such as Unable to handle kernel NULL pointer
> dereference, programs such as postgresql break dramatically, and another
> time SSH panicked the system with a kernel unaligned access. This
> happened during apt-get:
> [...]

Try with kernel 5.9.x, or maybe better already use 4.19.x on UltraSPARC IIIi which works OK most of the time AFAIR. You can get those from snapshot.debian.org (e.g. [1] or [2]).

[1]: http://snapshot.debian.org/archive/debian-ports/20190719T183113Z/pool-sparc64/main/l/linux/linux-image-4.19.0-5-sparc64_4.19.37-6_sparc64.deb

[2]: http://snapshot.debian.org/archive/debian-ports/20190719T183113Z/pool-sparc64/main/l/linux/linux-image-4.19.0-5-sparc64-smp_4.19.37-6_sparc64.deb

...but unsure if your system will run stable enough to successfully finish the installation. Alternatively try to reinstall with an older ISO and work from there:

* with 5.9.0-4:
  https://cdimage.debian.org/cdimage/ports/snapshots/2020-12-03/debian-10.0.0-sparc64-NETINST-1.iso

* with 4.19.0-5:
  https://cdimage.debian.org/cdimage/ports/snapshots/2019-06-26/debian-10.0-sparc64-NETINST-1.iso

There seems to be a problem with UltraSPARC T1s and I strongly believe this or another problem also affects UltraSPARC III(i)s. I have tested a variety of processors here:

https://lists.debian.org/debian-sparc/2021/12/msg4.html

For more details on this/these issue(s) see:

https://lists.debian.org/debian-sparc/2021/03/msg00045.html

...and:

https://lists.debian.org/debian-sparc/2022/02/msg0.html

Cheers,
Frank
Re: Sun Ultra 45: Kernel Panic (corrupted stack end detected inside scheduler) with 5.19
I've also been able to confirm that this happens with kernel 5.16, or at least that similar bugs do, such as "Unable to handle kernel NULL pointer dereference". Programs such as postgresql break dramatically, and another time SSH panicked the system with a kernel unaligned access. This happened during apt-get:

[ 1735.463205] Unable to handle kernel NULL pointer dereference
[ 1735.543500] tsk->{mm,active_mm}->context = 0096
[ 1735.622697] tsk->{mm,active_mm}->pgd = fff207dfc000
[ 1735.697892] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009
[ 1735.808123] Press Stop-A (L1-A) from sun keyboard or send break
[ 1735.808123] twice on console to return to the boot prom
[ 1735.808158] kernel BUG at kernel/cpu.c:1086!

> On 10/12/2022 11:33 PM EDT j...@pawlicker.com j...@pawlicker.com wrote:
>
> On my Sun Ultra 45 with two CPUs, Debian does not boot a finished
> installation using Kernel 5.19. On the 5.16 kernel included on the CD the OS
> boots just fine if this is selected using GRUB. This also seems to be
> intermittent, as first booting into 5.19 was stable after trying to use quiet
> to get an error log after a 5.16 boot, but then rebooting afterwards gave me
> a more verbose error:
>
> Booting `Debian GNU/Linux'
> Loading Linux 5.19.0-2-sparc64-smp ...
> Loading initial ramdisk ...
> [ 0.684502] pci :05:1d.0: unsupported PM cap regs version (4)
> [ 4.937387] BAD IRQ ack 0
> [ 5.436972] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [ 5.531706] CPU: 0 PID: 107 Comm: systemd-udevd Not tainted 5.19.0-2-sparc64-smp #1 Debian 5.19.11-1
> [ 5.643306] Call Trace:
> [ 5.672826] [<00cbe4e8>] dump_stack+0x8/0x18
> [ 5.732816] [<00cb7518>] panic+0xf0/0x360
> [ 10.067911] ---[ end Kernel panic - not syncing: corrupted stack end detected inside scheduler ]---
>
> Second boot:
>
> Loading Linux 5.19.0-2-sparc64-smp ...
> Loading initial ramdisk ...
> [ 0.681139] pci :05:1d.0: unsupported PM cap regs version (4)
> [ 5.014440] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [ 5.016901] tg3 :07:04.1 eth1: Tigon3 [partno(BCM95715) rev 9001] (PCIX:133MHz:64-bit) MAC address 00:14:4f:0f:db:ed
> [ 5.016925] tg3 :07:04.1 eth1: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> [ 5.016933] tg3 :07:04.1 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> [ 5.016941] tg3 :07:04.1 eth1: dma_rwctrl[76148000] dma_mask[40-bit]
> [ 5.546143] CPU: 0 PID: 117 Comm: systemd-udevd Not tainted 5.19.0-2-sparc64-smp #1 Debian 5.19.11-1
> [ 5.675485] Call Trace:
> [ 5.710701] [<00cbe4e8>] dump_stack+0x8/0x18
> [ 5.782158] [<00cb7518>] panic+0xf0/0x360
> [ 5.850408] [<00cc5698>] switch_to_pc+0x834/0x85c
> [ 5.927055] [<00cc58e0>] __cond_resched+0x40/0x60
> [ 6.003715] [<006bc990>] kmem_cache_alloc_trace+0x430/0x580
> [ 6.090777] [<1001ab9c>] usb_control_msg+0x1c/0x120 [usbcore]
> [ 6.180114] [<1000d480>] hub_power_on+0x60/0x180 [usbcore]
> [ 6.266426] [<1000e0c8>] hub_activate+0x868/0xa00 [usbcore]
> [ 6.354029] [<10015638>] hub_probe+0xeb8/0xf20 [usbcore]
> [ 6.438467] [<1001fda8>] usb_probe_interface+0xe8/0x300 [usbcore]
> [ 6.532373] [<009e8c48>] really_probe+0xc8/0x480
> [ 6.608355] [<009e9124>] __driver_probe_device+0x124/0x180
> [ 6.694905] [<009e91a8>] driver_probe_device+0x28/0xe0
> [ 6.777256] [<009e995c>] __device_attach_driver+0x9c/0x140
> [ 6.863759] [<009e6568>] bus_for_each_drv+0x68/0xc0
> [ 6.942884] [<009e94c0>] __device_attach+0xa0/0x200
> [ 7.022122] Press Stop-A (L1-A) from sun keyboard or send break
> [ 7.022122] twice on console to return to the boot prom
> [ 7.022158] kernel BUG at kernel/cpu.c:1092!
> [ 7.022174] \|/ \|/
> [ 7.022174] "@'/ .. \`@"
> [ 7.022174] /_| \__/ |_\
> [ 7.022174] \__U_/
> [ 7.022178] swapper/1(0): Kernel bad sw trap 5 [#1]
> [ 7.022185] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-2-sparc64-smp #1 Debian 5.19.11-1
> [ 7.022195] TSTATE: 004411e01604 TPC: 00470074 TNPC: 00470078 Y: 000a Not tainted
> [ 7.022201] TPC:
> [ 7.00] g0: 00ccc140 g1: 00ff2908 g2: 00ff2908 g3: 02f6
> [ 7.05] g4: fff200261600 g5: fff37e8c4000 g6: fff2002a g7: 000e
> [ 7.09] o0: 00e2d220 o1: 0444 o2: 4000 o3: 0001
> [ 7.022234] o4: 00018701a800 o5: 000e sp: fff2002a3481 ret_pc: 0047006c
> [ 7.022238] RPC:
> [ 7.022244] l0: 1000 l1: 004411001603 l2: 0092979c l3: 0400
> [ 7.022249] l4: l5: l6: l7: 0008
> [ 7.022252] i0: 000e i1: fff2002a0008 i2: 4000 i3: