RE: Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss
Hi Adrian,

> This looks more like an issue with your bootloader. I haven't used SILO for a
> long time, so I don't have a track what currently works and what not.

Thanks for pointing that out! I was searching in the wrong direction...

> Can you try booting the current ISO snapshot image?

Yes, that image boots successfully on this machine, with a newer kernel, using GRUB instead of SILO.

So the actual issue is that Debian kernels > 5.14.3-1~exp1 fail to boot with SILO 1.4.14+git20141019-5. I've also built the latest SILO from git [1] but it has the same issue: it boots Debian kernel 5.14.3-1~exp1 successfully, but later Debian kernels fail with ERROR: Last Trap: Fast Data Access MMU Miss.

I had previously given up on migrating to GRUB, as I couldn't get it to work with RAID1 boot and root partitions on disks with a Sun disk label. I still can't get that working but I'll start a separate thread for that.

As this SPARC T4-1 is able to boot from disks with GPT/EFI disk label, I've converted the disks to use a GPT/EFI label with a dedicated BIOS boot partition and installed GRUB using the instructions at [2]. It is working fine with RAID1 boot and root partitions and with newer Debian kernels.

It would still be nice to know why newer Debian kernels fail to boot with SILO, but I don't know how to debug ERROR: Last Trap: Fast Data Access MMU Miss.

This Debian bug report can be closed as resolved by migrating from SILO to GRUB.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/silo.git
[2] https://github.com/esnowberg/grub2-sparc/wiki
Re: Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss
Hi Tom!

On 1/31/22 11:47, Tom Turelinckx wrote:
> I had previously given up on migrating to GRUB, as I couldn't get it to work
> with RAID1 boot and root partitions on disks with a Sun disk label. I still
> can't get that working but I'll start a separate thread for that.

I suggest asking on the GRUB mailing list upstream. You will find more GRUB experts there. You could also CC one of the authors of the SPARC code in GRUB, Eric Snowberg.

> As this SPARC T4-1 is able to boot from disks with GPT/EFI disk label, I've
> converted the disks to use a GPT/EFI label with a dedicated BIOS boot
> partition and installed GRUB using the instructions at [2].
> It is working fine with RAID1 boot and root partitions and with newer Debian
> kernels.

debian-installer defaults to GPT partition tables on T4 and newer automatically.

> It would still be nice to know why newer Debian kernels fail to boot with
> SILO, but I don't know how to debug ERROR: Last Trap: Fast Data Access MMU
> Miss.

You would have to ask the SILO developers upstream. I think you can post on the sparclinux LKML for that.

> This Debian bug report can be closed as resolved by migrating from SILO to
> GRUB.

Done.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss
At least on my old Netra T1, SILO has never believed in booting vmlinuz, only vmlinux, and faults similarly if you try.

So if it just recently started faulting that way for you, perhaps any glue that knew to unpack vmlinuz into vmlinux isn't working?

- Rich

On Sun, Jan 23, 2022 at 1:30 PM John Paul Adrian Glaubitz <glaub...@physik.fu-berlin.de> wrote:

> Hello Tom!
>
> On 1/23/22 17:39, Tom Turelinckx wrote:
> > Boot device: disk0 File and args:
> > SILO Version 1.4.14
> > boot:
> > Allocated 64 Megs of memory at 0x4000 for kernel
> > Uncompressing image...
> > Loaded kernel version 5.14.6
> > Loading initial ramdisk (25723814 bytes at 0x2480 phys, 0x40C0 virt)...
> > ERROR: Last Trap: Fast Data Access MMU Miss
>
> This looks more like an issue with your bootloader. I haven't used SILO for a
> long time, so I don't have a track what currently works and what not.
>
> Can you try booting the current ISO snapshot image? [1]
>
> Adrian
>
> > [1] https://cdimage.debian.org/cdimage/ports/snapshots/2021-10-20/debian-11.0.0-sparc64-NETINST-1.iso
>
> --
>  .''`.  John Paul Adrian Glaubitz
> : :' :  Debian Developer - glaub...@debian.org
> `. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
>   `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
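
Rich's vmlinuz-versus-vmlinux point can be checked without booting anything, by looking at the first bytes of the image that SILO is asked to load. The following is only a minimal Python sketch, assuming the image is either a plain gzip stream or an ELF file; the function names are made up for illustration, and the path in the usage note is a placeholder, not a file taken from this report.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
ELF_MAGIC = b"\x7fELF"     # first four bytes of any ELF file


def classify_kernel_image(data: bytes) -> str:
    """Return 'gzip', 'elf' or 'unknown' based on the leading magic bytes."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ELF_MAGIC):
        return "elf"
    return "unknown"


def classify_payload(data: bytes) -> str:
    """Like classify_kernel_image, but look inside a gzip wrapper if present."""
    kind = classify_kernel_image(data)
    if kind == "gzip":
        # Decompresses the whole image in memory; fine for a kernel-sized file.
        return classify_kernel_image(gzip.decompress(data))
    return kind


def classify_kernel_file(path: str) -> tuple:
    """Return (outer, inner) classification for the file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    return classify_kernel_image(data), classify_payload(data)
```

Calling `classify_kernel_file("/boot/vmlinuz-<version>")` on a booting and a non-booting image would show whether both reach the bootloader in the same form. It says nothing about why the trap occurs.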
Re: Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss
Hello Tom!

On 1/23/22 17:39, Tom Turelinckx wrote:
> Boot device: disk0 File and args:
> SILO Version 1.4.14
> boot:
> Allocated 64 Megs of memory at 0x4000 for kernel
> Uncompressing image...
> Loaded kernel version 5.14.6
> Loading initial ramdisk (25723814 bytes at 0x2480 phys, 0x40C0 virt)...
> ERROR: Last Trap: Fast Data Access MMU Miss

This looks more like an issue with your bootloader. I haven't used SILO for a long time, so I don't have a track what currently works and what not.

Can you try booting the current ISO snapshot image? [1]

Adrian

> [1] https://cdimage.debian.org/cdimage/ports/snapshots/2021-10-20/debian-11.0.0-sparc64-NETINST-1.iso

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss
Package: src:linux
Version: 5.14.6-2
Severity: important
X-Debbugs-Cc: debian-sparc@lists.debian.org

Dear Maintainer,

Debian kernels > 5.14.3-1~exp1 consistently fail to boot on SPARC T4-1:

SPARC T4-1, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.36.2, 31.5000 GB memory available, Serial #108045182.
Ethernet address 0:10:e0:70:a3:7e, Host ID: 8670a37e.

Boot device: disk0 File and args:
SILO Version 1.4.14
boot:
Allocated 64 Megs of memory at 0x4000 for kernel
Uncompressing image...
Loaded kernel version 5.14.6
Loading initial ramdisk (25723814 bytes at 0x2480 phys, 0x40C0 virt)...
ERROR: Last Trap: Fast Data Access MMU Miss

Debian kernels 5.14.3-1~exp1 and earlier boot and run successfully on this system.

I have tried the sparc64-smp packages built by buildd landau for these versions:

5.14.6-2, 5.14.6-3, 5.14.9-2, 5.15.5-2, 5.15.15-1, 5.16~rc8-1~exp1

They all consistently fail to boot with the same error.

I have built the Debian src pkg version 5.14.6-1 using pbuilder with a sid basetgz. It consistently fails to boot with the same error.

I've then tried to bisect using the DebianKernel/GitBisect instructions on the Debian wiki, but it turns out that kernels built from git (tag v5.14.3, tag v5.14.6, and ~9 bisects in between) using make bindeb-pkg all do boot successfully on this system.

I've tried checking out tag v5.14.6 from git, then applying all the patches from debian/patches in the 5.14.6-1 src pkg and building using make bindeb-pkg. The resulting kernel boots successfully.

I've tried extracting the 5.14.6-1 src pkg using dpkg-source -x, then building using make bindeb-pkg and the resulting kernel boots successfully. But if I build using dpkg-buildpackage like pbuilder does, then the resulting -sparc64-smp package fails to boot with the above error.

When building, I have used each time a clean sid changeroot.
When using make bindeb-pkg I have copied the config installed in /boot by the (non-booting) 5.14.6-1 Debian package then done make olddefconfig.

When using make bindeb-pkg I had to manually disable stringop-overread warnings in Makefile to avoid build failure on arch/sparc/kernel/mdesc.c with v5.14.6 (fixed in later versions by [1]).

When building using bindeb-pkg the resulting kernel is compressed; when using dpkg-buildpackage the resulting kernel is uncompressed. I have tried both uncompressing the compressed kernel and compressing the uncompressed kernel, as silo supports both. It doesn't affect the results.

Uncompressed, the Debian kernel is ~17MB while the standard kernel is ~13MB. I'm not sure why this difference is there.

On Debian salsa's kernel-team/linux I have combed through all the commits between tags debian/5.14.3-1_exp1 and debian/5.14.6-1, but none of them seem relevant to this issue.

I have checked the upstream changelog between v5.14.3 and v5.14.6, but nothing sparc-specific has changed.

According to the buildd logs, landau is running kernel 5.15.5-2. But I think this is a SPARC-T5 so not a T4, and I think it's running inside an LDOM which is not the case on my T4, so it may not be comparable.

I've also tried to get more information about the failure, but I don't know how to do that. I've tried to get into the initramfs environment by using break=premount/modules/top, but the failure happens before those stages. Measuring the elapsed time after Loading initial ramdisk it would seem the error message ERROR: Last Trap: Fast Data Access MMU Miss appears when normally the first kernel output would appear.

I've tried to look into what the Debian src pkg's debian/* scripts do, exactly, but this is rather complicated and I have limited experience with it.

Any suggestions what else I could try?
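
Since make olddefconfig may silently adjust options when it processes a copied config, one possible next step is to compare the config shipped with the non-booting Debian package against the .config actually used for a booting bindeb-pkg build. The sketch below is only an illustration in Python, assuming both files use the usual `CONFIG_FOO=value` and `# CONFIG_FOO is not set` lines; the function names are hypothetical and nothing here was part of the original report.

```python
from typing import Dict, Optional, Tuple

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text: str) -> Dict[str, str]:
    """Map option names to values; '# CONFIG_X is not set' is recorded as 'n'."""
    options: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kconfig(a: str, b: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    None means the option does not appear in that file at all.
    """
    opts_a, opts_b = parse_kconfig(a), parse_kconfig(b)
    changed = {}
    for name in sorted(opts_a.keys() | opts_b.keys()):
        va, vb = opts_a.get(name), opts_b.get(name)
        if va != vb:
            changed[name] = (va, vb)
    return changed
```

Feeding it the text of the two config files yields only the options that differ, which is easier to read than a raw diff when the files are ordered differently. An empty result would point away from the config and towards the packaging scripts.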
[1]: https://github.com/gregkh/linux/commit/fc7c028dcdbfe981bca75d2a7b95f363eb691ef3

-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
cpu : UltraSparc T4 (Niagara4)
fpu : UltraSparc T4 integrated FPU
pmu : niagara4
prom: OBP 4.36.2 2014/10/24 08:13
type: sun4v

** Network interface configuration:
*** /etc/network/interfaces:
source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

auto br0
iface br0 inet static
  bridge_ports enp15s0f0
  bridge_fd 0
  address x.x.x.x
  netmask x.x.x.x
  gateway x.x.x.x

iface enp15s0f0 inet manual

** PCI devices:
00:01.0 PCI bridge [0604]: Oracle/SUN Device [108e:8186] (rev 01) (prog-if 00 [Normal decode])
	Device tree node: /sys/firmware/devicetree/base/pci@400/pci@1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities:

00:02.0 PCI bridge [0604]: Oracle/SUN Device