Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
On 3/9/21 11:20 PM, John Paul Adrian Glaubitz wrote:
>> Which kernel version will have this bug (which one?) fixed, 5.11.x? I
>> can also check with one of my UltraSPARC IIIi powered systems, too, next
>> week.
>
> I have not uploaded that kernel yet, I have it built locally, PR here [1].

The patch is now in Linus' tree so it will be part of 5.12 [1].

Adrian

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5e8b80d352ec999d2bba3ea584f541c83f4ca3f

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
On 3/9/21 10:18 PM, Frank Scheiner wrote:
>> The oldest buildd we are running is a T5120 and that's a T2.
>
> And these don't show the problems Riccardo's T1 powered T2000 has?

No, the machine runs stable.

>> We have an older UltraSPARC IIIi that has issues with newer kernels, but
>> usually only after longer operation and the issue might be related to the
>> bug that was just fixed recently by Rob Gardner.
>
> Which kernel version will have this bug (which one?) fixed, 5.11.x? I
> can also check with one of my UltraSPARC IIIi powered systems, too, next
> week.

I have not uploaded that kernel yet, I have it built locally, PR here [1].

Adrian

> [1] https://salsa.debian.org/kernel-team/linux/-/merge_requests/339

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
On 09.03.21 22:09, John Paul Adrian Glaubitz wrote:
> On 3/9/21 9:38 PM, Frank Scheiner wrote:
>> I have a T1000 with which I could try to reproduce Riccardo's issues.
>> Hardware wise they should be pretty similar. As the T1000 doesn't have a
>> CDROM, I'll try to netboot a few newer kernels and report my findings.
>> Will take me until next week though, as the machine is in (cold) storage
>> now.
>>
>> @Adrian:
>> Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
>> with the latest kernels?
>
> The oldest buildd we are running is a T5120 and that's a T2.

And these don't show the problems Riccardo's T1 powered T2000 has?

> We have an older UltraSPARC IIIi that has issues with newer kernels, but
> usually only after longer operation and the issue might be related to the
> bug that was just fixed recently by Rob Gardner.

Which kernel version will have this bug (which one?) fixed, 5.11.x? I
can also check with one of my UltraSPARC IIIi powered systems, too, next
week.

Cheers,
Frank
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
On 3/9/21 9:38 PM, Frank Scheiner wrote:
> I have a T1000 with which I could try to reproduce Riccardo's issues.
> Hardware wise they should be pretty similar. As the T1000 doesn't have a
> CDROM, I'll try to netboot a few newer kernels and report my findings.
> Will take me until next week though, as the machine is in (cold) storage
> now.
>
> @Adrian:
> Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
> with the latest kernels?

The oldest buildd we are running is a T5120 and that's a T2.

We have an older UltraSPARC IIIi that has issues with newer kernels, but
usually only after longer operation and the issue might be related to the
bug that was just fixed recently by Rob Gardner.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
Hi guys,

On 09.03.21 18:31, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 3/9/21 6:26 PM, Riccardo Mottola wrote:
>> John Paul Adrian Glaubitz wrote:
>>>> while I was able to "install" correctly using a slightly older ISO, I get
>>>> not a bootable system. The kernel appears to crash very early during boot.
>>> I think this is more likely a hardware issue. We haven't seen any machines
>>> crashing that early. Please make sure the RAM modules in this machine are
>>> working properly.
>>
>> I don't think so... I think it is a Kernel issue, since with kernel
>> 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux
>>
>> the machine is performing fine with network, disk and compiler usage on all
>> 32 CPUs.
>
> Then you need to bisect the kernel as I don't have any means to reproduce
> the issue.

I have a T1000 with which I could try to reproduce Riccardo's issues.
Hardware wise they should be pretty similar. As the T1000 doesn't have a
CDROM, I'll try to netboot a few newer kernels and report my findings.
Will take me until next week though, as the machine is in (cold) storage
now.

@Adrian:
Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
with the latest kernels?

Cheers,
Frank
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
Hi!

On 3/9/21 6:26 PM, Riccardo Mottola wrote:
> John Paul Adrian Glaubitz wrote:
>>> while I was able to "install" correctly using a slightly older ISO, I get
>>> not a bootable system. The kernel appears to crash very early during boot.
>> I think this is more likely a hardware issue. We haven't seen any machines
>> crashing that early. Please make sure the RAM modules in this machine are
>> working properly.
>
> I don't think so... I think it is a Kernel issue, since with kernel
> 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux
>
> the machine is performing fine with network, disk and compiler usage on all
> 32 CPUs.

Then you need to bisect the kernel as I don't have any means to reproduce
the issue.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
Hi,

John Paul Adrian Glaubitz wrote:
>> while I was able to "install" correctly using a slightly older ISO, I get
>> not a bootable system. The kernel appears to crash very early during boot.
> I think this is more likely a hardware issue. We haven't seen any machines
> crashing that early. Please make sure the RAM modules in this machine are
> working properly.

I don't think so... I think it is a Kernel issue, since with kernel
5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux

the machine is performing fine with network, disk and compiler usage on all
32 CPUs.

I tried heavy load of parallel compilations, using git on large repositories
as well as using remote X applications at the same time, a combination I know
tends to show issues on systems, without problems! Not a single error in
syslog. Machine power-up and self-tests are fine too.

If I remember, there is a repository of various pre-compiled kernel versions:
maybe there are some releases between the two kernels I can try and do some
easy rough bisecting.

So I'd say RAM, CPUs, Disk and Ethernet are working quite fine.

Riccardo
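The "rough bisecting" over pre-built kernels mentioned above amounts to a binary search between the known-good and known-bad packages. A minimal Python sketch, assuming the candidate versions are listed oldest to newest, that the first one boots and the last one crashes, and that everything after the first crashing version also crashes. The function name `first_bad` and the `boots_ok` callback are illustrative, not an existing tool; in practice `boots_ok` would be a manual "install, reboot, report" step.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], boots_ok: Callable[[str], bool]) -> str:
    """Return the earliest failing version.

    Assumes versions[0] is known good, versions[-1] is known bad,
    and the good/bad boundary is crossed exactly once.
    """
    lo, hi = 0, len(versions) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_ok(versions[mid]):
            lo = mid  # still good, the regression is later
        else:
            hi = mid  # already bad, the regression is here or earlier
    return versions[hi]
```

With the two endpoints from this thread (5.9.6-1 good, 5.10.19-1 bad) and whatever intermediate builds are available, this needs roughly log2(N) test boots instead of N.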
Re: getting a working install ISOs on a T2000
Hi Adrian

the world is small between SPARC and PPC :)

John Paul Adrian Glaubitz wrote:
>> 2020-11-16 -> this one worked! (but system is unbootable due to crash, of
>> that in a second mail)
> This sounds like a hardware problem. The newer images should all work on
> sparc64 with a few images that don't. Can you make sure the memory is ok,
> i.e. by installing Solaris?

The system had a previously working Debian install, just old.

I also luckily found the older kernel, probably from the CD:

Linux narya 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux

and with this one the system appears to run stable in SMP (32 CPUs!). I did
some massive compilation on all 32 CPUs, stressed the system a bit and it
appears to be working.

I'd say the system is quite stable, no memory errors.

Riccardo
Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
Hello Riccardo!

On 3/9/21 1:23 PM, Riccardo Mottola wrote:
> while I was able to "install" correctly using a slightly older ISO, I get
> not a bootable system. The kernel appears to crash very early during boot.

I think this is more likely a hardware issue. We haven't seen any machines
crashing that early. Please make sure the RAM modules in this machine are
working properly.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000
Hi all,

while I was able to "install" correctly using a slightly older ISO, I get not
a bootable system. The kernel appears to crash very early during boot.

Anybody else has this issue?

Booting `Debian GNU/Linux'

Loading Linux 5.10.0-4-sparc64-smp ...
Loading initial ramdisk ...
[ 26.900156] sd 2:1:0:0: [sda] No Caching mode page found
[ 26.900336] sd 2:1:0:0: [sda] Assuming drive cache: write through
/dev/sda2: clean, 31420/4276224 files, 659826/17089844 blocks
[ 30.362550] Unable to handle kernel NULL pointer dereference
[ 30.362722] tsk->{mm,active_mm}->context = 00ab
[ 30.362818] tsk->{mm,active_mm}->pgd = 8f258000
[ 30.363585] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 30.363740] OOPS: Bogus kernel PC [07c0] in fault handler
[ 30.363747] OOPS: RPC [0042c614]
[ 30.363766] OOPS: RPC
[ 30.363773] OOPS: Fault was to vaddr[7c0]
[ 30.363787] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1
[ 30.363792] Call Trace:
[ 30.363808] [<00c5394c>] do_sparc64_fault+0xa4c/0xa80
[ 30.363829] [<00407714>] sparc64_realfault_common+0x10/0x20
[ 30.363839] [<07c0>] 0x7c0
[ 30.363852] [<00c519a8>] default_idle_call+0x48/0x140
[ 30.363865] [<004a7b40>] do_idle+0xe0/0x1a0
[ 30.363878] [<004a7e5c>] cpu_startup_entry+0x1c/0x80
[ 30.363899] [<00c4b278>] rest_init+0xb8/0xc8
[ 30.363915] [<00fe26a4>] arch_call_rest_init+0xc/0x1c
[ 30.363930] [<00fe2d40>] start_kernel+0x628/0x640
[ 30.363946] [<00fe532c>] start_early_boot+0x2a0/0x2b0
[ 30.363962] [<00c4b1a0>] tlb_fixup_done+0x4c/0x6c
[ 30.363972] [<0016a60c>] 0x16a60c
[ 30.363978] Unable to handle kernel NULL pointer dereference
[ 30.363984] tsk->{mm,active_mm}->context = 00b5
[ 30.363990] tsk->{mm,active_mm}->pgd = 800014594000
[ 30.363997] \|/ \|/
[ 30.363997] "@'/ .. \`@"
[ 30.363997] /_| \__/ |_\
[ 30.363997] \__U_/
[ 30.364004] swapper/0(0): Oops [#2]
[ 30.364017] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1
[ 30.364027] TSTATE: 004480001600 TPC: 07c0 TNPC: 07c4 Y: Tainted: G D
[ 30.364036] TPC: <0x7c0>
[ 30.364044] g0: 40004059 g1: 0016 g2: f020 g3: fff78000
[ 30.364053] g4: 5a20 g5: 8003fd79c000 g6: 00e8 g7: 43ba
[ 30.364061] o0: 07c0 o1: o2: o3:
[ 30.364070] o4: o5: sp: 00e831a1 ret_pc: 0042c614
[ 30.364084] RPC:
[ 30.364093] l0: 00f8b7d8 l1: 4000407c l2: 40004059 l3: 0040
[ 30.364102] l4: f027e7f8 l5: 40004128 l6: 000ed000 l7: f025cfd8
[ 30.364110] i0: 000e i1: 00e80008 i2: 4000 i3: 07c0
[ 30.364118] i4: fef42ff8 i5: fef41800 i6: 00e83251 i7: 00c519a8
[ 30.364131] I7:
[ 30.364137] Call Trace:
[ 30.364150] [<00c519a8>] default_idle_call+0x48/0x140
[ 30.364162] [<004a7b40>] do_idle+0xe0/0x1a0
[ 30.364175] [<004a7e5c>] cpu_startup_entry+0x1c/0x80
[ 30.364191] [<00c4b278>] rest_init+0xb8/0xc8
[ 30.364207] [<00fe26a4>] arch_call_rest_init+0xc/0x1c
[ 30.364221] [<00fe2d40>] start_kernel+0x628/0x640
[ 30.364236] [<00fe532c>] start_early_boot+0x2a0/0x2b0
[ 30.364252] [<00c4b1a0>] tlb_fixup_done+0x4c/0x6c
[ 30.364262] [<0016a60c>] 0x16a60c
[ 30.364276] Caller[00c519a8]: default_idle_call+0x48/0x140
[ 30.364288] Caller[004a7b40]: do_idle+0xe0/0x1a0
[ 30.364300] Caller[004a7e5c]: cpu_startup_entry+0x1c/0x80
[ 30.364315] Caller[00c4b278]: rest_init+0xb8/0xc8
[ 30.364330] Caller[00fe26a4]: arch_call_rest_init+0xc/0x1c
[ 30.364343] Caller[00fe2d40]: start_kernel+0x628/0x640
[ 30.364358] Caller[00fe532c]: start_early_boot+0x2a0/0x2b0
[ 30.364373] Caller[00c4b1a0]: tlb_fixup_done+0x4c/0x6c
[ 30.364383] Caller[0016a60c]: 0x16a60c
[ 30.364387] Instruction DUMP:
[ 30.364397] Unable to handle kernel NULL pointer dereference
[ 30.364404] tsk->{mm,active_mm}->context = 00b5
[ 30.364409] tsk->{mm,active_mm}->pgd = 800014594000
[ 30.364416] \|/ \|/
[ 30.364416] "@'/ .. \`@"
[ 30.364416] /_| \__/ |_\
[ 30.364416] \__U_/
[ 30.364422] swapper/0(0): Oops [#3]
[ 30.364436] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1
[ 30.364447] TSTATE:
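When comparing the repeated oopses in a console log like the one above, it can help to pull out just the symbolised call-trace entries. A small Python sketch, assuming the frames appear in the form shown in this log (an address in `[<...>]` followed by `symbol+0xoffset/0xsize`); the name `call_trace_frames` is illustrative, and frames without a symbol (such as the bare `0x7c0` entry) are deliberately skipped.

```python
import re

# Matches entries such as "[<00c519a8>] default_idle_call+0x48/0x140".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def call_trace_frames(log_text):
    """Return (address, symbol, offset, size) tuples in order of appearance.

    Offsets and sizes are converted from hex to int; addresses are kept
    as the strings printed in the log.
    """
    return [
        (addr, sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(log_text)
    ]
```

Because it searches the whole text rather than line by line, it also works on a log whose line breaks were lost when it was pasted.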
Re: getting a working install ISOs on a T2000
Hello!

On 3/9/21 12:28 PM, Riccardo Mottola wrote:
> I tried hard installing Debian/sparc64, it was not easy at all and haven't
> concluded.
>
> The T2000 I started from had Linux already installed, with an older 4.x
> series kernel, I'd guess not updated since 3 years. It was working and was
> configured with SILO. I tried updating but the boot partition was too small
> to fit old and new kernels, also the (partially?) installed 5.x kernel on
> reboot entered in an endless loop of crashes I could not stop nor log.
>
> Unfortunately the fresh install I did with one of the working ISOs suffers
> from the same crashes!
>
> I then went on with snapshots, going back from the latest I found...
>
> 2020-11-16 -> this one worked! (but system is unbootable due to crash, of
> that in a second mail)

This sounds like a hardware problem. The newer images should all work on
sparc64 with a few images that don't. Can you make sure the memory is ok,
i.e. by installing Solaris?

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
getting a working install ISOs on a T2000
Hi,

I tried hard installing Debian/sparc64, it was not easy at all and haven't
concluded.

The T2000 I started from had Linux already installed, with an older 4.x
series kernel, I'd guess not updated since 3 years. It was working and was
configured with SILO. I tried updating but the boot partition was too small
to fit old and new kernels, also the (partially?) installed 5.x kernel on
reboot entered in an endless loop of crashes I could not stop nor log.

Unfortunately the fresh install I did with one of the working ISOs suffers
from the same crashes!

I then went on with snapshots, going back from the latest I found...

2020-11-16 -> this one worked! (but system is unbootable due to crash, of
that in a second mail)

Later, all these die in the same manner:
-- with 2020-12-03
-- with 2021-01-03
-- with 2021-02-02

Mar 8 23:43:22 main-menu[272]: WARNING **: Menu item 'localechooser' failed.
Mar 8 23:43:22 main-menu[279]: /var/lib/dpkg/status: No such file or directory
Mar 8 23:43:22 main-menu[279]: WARNING **: Configuring 'libdebian-installer4-u
Mar 8 23:43:22 main-menu[279]: WARNING **: Menu item 'localechooser' failed.
Mar 8 23:43:22 kernel: [ 33.630772] random: crng init done
Mar 8 23:43:41 main-menu[279]: INFO: Modifying debconf priority limit from 'hi
Mar 8 23:43:41 debconf: Setting debconf/priority to medium
Mar 8 23:43:41 kernel: [ 52.813059] main-menu[279]: segfault at 8 ip 010
Mar 8 23:50:59 init: process '/sbin/debian-installer' (pid 238) exited. Schedu
Mar 8 23:50:59 init: starting pid 293, tty '/dev/ttyHV0': '/sbin/debian-instal
<>
Mar 8 23:51:00 debconf: Setting debconf/language to en
Mar 8 23:51:00 kernel: [ 491.248772] main-menu[312]: segfault at 8 ip 010]