Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread John Paul Adrian Glaubitz
On 3/9/21 11:20 PM, John Paul Adrian Glaubitz wrote:
>> Which kernel version will have this bug (which one?) fixed, 5.11.x? I
>> can also check with one of my UltraSPARC IIIi powered systems, too, next
>> week.
> 
> I have not uploaded that kernel yet, I have it built locally, PR here [1].

The patch is now in Linus' tree so it will be part of 5.12 [1].

Adrian

> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5e8b80d352ec999d2bba3ea584f541c83f4ca3f

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread John Paul Adrian Glaubitz
On 3/9/21 10:18 PM, Frank Scheiner wrote:
>> The oldest buildd we are running is a T5120 and that's a T2.
> 
> And these don't show the problems Riccardo's T1 powered T2000 has?

No, the machine runs stable.

>> We have an older UltraSPARC IIIi that has issues with newer kernels, but
>> usually only after longer operation and the issue might be related to the
>> bug that was just fixed recently by Rob Gardner.
> 
> Which kernel version will have this bug (which one?) fixed, 5.11.x? I
> can also check with one of my UltraSPARC IIIi powered systems, too, next
> week.

I have not uploaded that kernel yet, I have it built locally, PR here [1].

Adrian

> [1] https://salsa.debian.org/kernel-team/linux/-/merge_requests/339




Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread Frank Scheiner

On 09.03.21 22:09, John Paul Adrian Glaubitz wrote:
> On 3/9/21 9:38 PM, Frank Scheiner wrote:
>> I have a T1000 with which I could try to reproduce Riccardo's issues.
>> Hardware wise they should be pretty similar. As the T1000 doesn't have a
>> CDROM, I'll try to netboot a few newer kernels and report my findings.
>> Will take me until next week though, as the machine is in (cold) storage
>> now.
>>
>> @Adrian:
>> Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
>> with the latest kernels?
>
> The oldest buildd we are running is a T5120 and that's a T2.

And these don't show the problems Riccardo's T1 powered T2000 has?

> We have an older UltraSPARC IIIi that has issues with newer kernels, but
> usually only after longer operation and the issue might be related to the
> bug that was just fixed recently by Rob Gardner.

Which kernel version will have this bug (which one?) fixed, 5.11.x? I
can also check with one of my UltraSPARC IIIi powered systems, too, next
week.

Cheers,
Frank



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread John Paul Adrian Glaubitz
On 3/9/21 9:38 PM, Frank Scheiner wrote:
> I have a T1000 with which I could try to reproduce Riccardo's issues.
> Hardware wise they should be pretty similar. As the T1000 doesn't have a
> CDROM, I'll try to netboot a few newer kernels and report my findings.
> Will take me until next week though, as the machine is in (cold) storage
> now.
> 
> @Adrian:
> Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
> with the latest kernels?

The oldest buildd we are running is a T5120 and that's a T2.

We have an older UltraSPARC IIIi that has issues with newer kernels, but
usually only after longer operation and the issue might be related to the
bug that was just fixed recently by Rob Gardner.

Adrian




Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread Frank Scheiner

Hi guys,

On 09.03.21 18:31, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 3/9/21 6:26 PM, Riccardo Mottola wrote:
>> John Paul Adrian Glaubitz wrote:
>>>> while I was able to "install" correctly using a slightly older ISO, I get
>>>> not a bootable system. The kernel appears to crash very early during boot.
>>> I think this is more likely a hardware issue. We haven't seen any machines
>>> crashing that early. Please make sure the RAM modules in this machine are
>>> working properly.
>>
>> I don't think so... I think it is a Kernel issue, since with kernel
>> 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux
>>
>> the machine is performing fine with network, disk and compiler usage on all
>> 32 CPUs.
>
> Then you need to bisect the kernel as I don't have any means to reproduce the
> issue.


I have a T1000 with which I could try to reproduce Riccardo's issues.
Hardware wise they should be pretty similar. As the T1000 doesn't have a
CDROM, I'll try to netboot a few newer kernels and report my findings.
Will take me until next week though, as the machine is in (cold) storage
now.

@Adrian:
Aren't there some build servers using UltraSPARC T2 or T2+? Do they run
with the latest kernels?

Cheers,
Frank



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread John Paul Adrian Glaubitz
Hi!

On 3/9/21 6:26 PM, Riccardo Mottola wrote:
> John Paul Adrian Glaubitz wrote:
>>> while I was able to "install" correctly using a slightly older ISO, I get
>>> not a bootable system. The kernel appears to crash very early during boot.
>> I think this is more likely a hardware issue. We haven't seen any machines
>> crashing that early. Please make sure the RAM modules in this machine are
>> working properly.
> 
> I don't think so... I think it is a Kernel issue, since with kernel
> 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux
> 
> the machine is performing fine with network, disk and compiler usage on all
> 32 CPUs.

Then you need to bisect the kernel as I don't have any means to reproduce the
issue.

Adrian




Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread Riccardo Mottola

Hi,

John Paul Adrian Glaubitz wrote:
>> while I was able to "install" correctly using a slightly older ISO, I get
>> not a bootable system. The kernel appears to crash very early during boot.
> I think this is more likely a hardware issue. We haven't seen any machines
> crashing that early. Please make sure the RAM modules in this machine are
> working properly.


I don't think so... I think it is a Kernel issue, since with kernel
5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) sparc64 GNU/Linux

the machine is performing fine with network, disk and compiler usage on 
all 32 CPUs. I tried a heavy load of parallel compilations, using git on 
large repositories as well as using remote X applications at the same 
time, a combination I know tends to show issues on systems, without 
problems! Not a single error in syslog.

Machine power-up and self-tests are fine too.

If I remember, there is a repository of various pre-compiled kernel 
versions: maybe there are some releases between the two kernels I can 
try and do some easy rough bisecting.
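
A rough sketch of the "easy rough bisecting" idea over pre-compiled kernels, written as a binary search. The only versions taken from this thread are the known-good 5.9.6-1 and the crashing 5.10.19-1; the entries in between are placeholders, and `boots_ok` is a hypothetical stand-in for booting a kernel by hand and noting whether it crashes. It also assumes that once a version crashes, every later one does too.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], boots_ok: Callable[[str], bool]) -> str:
    """Return the earliest failing version.

    Assumes versions are ordered oldest to newest, versions[0] is known
    good, versions[-1] is known bad, and failures are monotonic.
    """
    lo, hi = 0, len(versions) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_ok(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Endpoints come from the thread; the middle entries are placeholders only.
candidates = [
    "5.9.6-1",
    "placeholder-1",
    "placeholder-2",
    "placeholder-3",
    "placeholder-4",
    "5.10.19-1",
]


def simulated_boots_ok(version: str) -> bool:
    # Stand-in for a manual boot test: pretend the regression starts at index 3.
    return candidates.index(version) < 3


culprit = first_bad(candidates, simulated_boots_ok)
print("first failing package version:", culprit)
```

With six candidate packages this needs only two boot tests instead of four.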


So I'd say RAM, CPUs, disk and Ethernet are working quite fine.

Riccardo



Re: getting a working install ISOs on a T2000

2021-03-09 Thread Riccardo Mottola

Hi Adrian

the world is small between SPARC and PPC :)

John Paul Adrian Glaubitz wrote:
>> 2020-11-16 -> this one worked! (but system is unbootable due to crash, of
>> that in a second mail)
> This sounds like a hardware problem. The newer images should all work on
> sparc64 with a few images that don't.
>
> Can you make sure the memory is ok, i.e. by installing Solaris?


The system had a previously working Debian install, just old.

I also luckily found the older kernel, probably from the CD:


Linux narya 5.9.0-2-sparc64-smp #1 SMP Debian 5.9.6-1 (2020-11-08) 
sparc64 GNU/Linux


and with this one the system appears to run stably in SMP (32 CPUs!). I 
did some massive compilation on all 32 CPUs, stressed the system a bit 
and it appears to be working.


I'd say the system is quite stable, no memory errors.


Riccardo



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread John Paul Adrian Glaubitz
Hello Riccardo!

On 3/9/21 1:23 PM, Riccardo Mottola wrote:
> while I was able to "install" correctly using a slightly older ISO, I get
> not a bootable system. The kernel appears to crash very early during boot.

I think this is more likely a hardware issue. We haven't seen any machines
crashing that early. Please make sure the RAM modules in this machine are
working properly.

Adrian




5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-09 Thread Riccardo Mottola

Hi all,

while I was able to "install" correctly using a slightly older ISO, I 
do not get a bootable system. The kernel appears to crash very early during 
boot.


Does anybody else have this issue?

  Booting `Debian GNU/Linux'

Loading Linux 5.10.0-4-sparc64-smp ...
Loading initial ramdisk ...

[   26.900156] sd 2:1:0:0: [sda] No Caching mode page found
[   26.900336] sd 2:1:0:0: [sda] Assuming drive cache: write through
/dev/sda2: clean, 31420/4276224 files, 659826/17089844 blocks
[   30.362550] Unable to handle kernel NULL pointer dereference
[   30.362722] tsk->{mm,active_mm}->context = 00ab
[   30.362818] tsk->{mm,active_mm}->pgd = 8f258000
[   30.363585] Kernel panic - not syncing: Aiee, killing interrupt handler!
[   30.363740] OOPS: Bogus kernel PC [07c0] in fault handler
[   30.363747] OOPS: RPC [0042c614]
[   30.363766] OOPS: RPC 
[   30.363773] OOPS: Fault was to vaddr[7c0]
[   30.363787] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G  D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1

[   30.363792] Call Trace:
[   30.363808] [<00c5394c>] do_sparc64_fault+0xa4c/0xa80
[   30.363829] [<00407714>] sparc64_realfault_common+0x10/0x20
[   30.363839] [<07c0>] 0x7c0
[   30.363852] [<00c519a8>] default_idle_call+0x48/0x140
[   30.363865] [<004a7b40>] do_idle+0xe0/0x1a0
[   30.363878] [<004a7e5c>] cpu_startup_entry+0x1c/0x80
[   30.363899] [<00c4b278>] rest_init+0xb8/0xc8
[   30.363915] [<00fe26a4>] arch_call_rest_init+0xc/0x1c
[   30.363930] [<00fe2d40>] start_kernel+0x628/0x640
[   30.363946] [<00fe532c>] start_early_boot+0x2a0/0x2b0
[   30.363962] [<00c4b1a0>] tlb_fixup_done+0x4c/0x6c
[   30.363972] [<0016a60c>] 0x16a60c
[   30.363978] Unable to handle kernel NULL pointer dereference
[   30.363984] tsk->{mm,active_mm}->context = 00b5
[   30.363990] tsk->{mm,active_mm}->pgd = 800014594000
[   30.363997]   \|/  \|/
[   30.363997]   "@'/ .. \`@"
[   30.363997]   /_| \__/ |_\
[   30.363997]  \__U_/
[   30.364004] swapper/0(0): Oops [#2]
[   30.364017] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G  D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1
[   30.364027] TSTATE: 004480001600 TPC: 07c0 TNPC: 07c4 Y: Tainted: G  D

[   30.364036] TPC: <0x7c0>
[   30.364044] g0: 40004059 g1: 0016 g2: f020 g3: fff78000
[   30.364053] g4: 5a20 g5: 8003fd79c000 g6: 00e8 g7: 43ba
[   30.364061] o0: 07c0 o1:  o2:  o3: 
[   30.364070] o4:  o5:  sp: 00e831a1 ret_pc: 0042c614

[   30.364084] RPC: 
[   30.364093] l0: 00f8b7d8 l1: 4000407c l2: 40004059 l3: 0040
[   30.364102] l4: f027e7f8 l5: 40004128 l6: 000ed000 l7: f025cfd8
[   30.364110] i0: 000e i1: 00e80008 i2: 4000 i3: 07c0
[   30.364118] i4: fef42ff8 i5: fef41800 i6: 00e83251 i7: 00c519a8

[   30.364131] I7: 
[   30.364137] Call Trace:
[   30.364150] [<00c519a8>] default_idle_call+0x48/0x140
[   30.364162] [<004a7b40>] do_idle+0xe0/0x1a0
[   30.364175] [<004a7e5c>] cpu_startup_entry+0x1c/0x80
[   30.364191] [<00c4b278>] rest_init+0xb8/0xc8
[   30.364207] [<00fe26a4>] arch_call_rest_init+0xc/0x1c
[   30.364221] [<00fe2d40>] start_kernel+0x628/0x640
[   30.364236] [<00fe532c>] start_early_boot+0x2a0/0x2b0
[   30.364252] [<00c4b1a0>] tlb_fixup_done+0x4c/0x6c
[   30.364262] [<0016a60c>] 0x16a60c
[   30.364276] Caller[00c519a8]: default_idle_call+0x48/0x140
[   30.364288] Caller[004a7b40]: do_idle+0xe0/0x1a0
[   30.364300] Caller[004a7e5c]: cpu_startup_entry+0x1c/0x80
[   30.364315] Caller[00c4b278]: rest_init+0xb8/0xc8
[   30.364330] Caller[00fe26a4]: arch_call_rest_init+0xc/0x1c
[   30.364343] Caller[00fe2d40]: start_kernel+0x628/0x640
[   30.364358] Caller[00fe532c]: start_early_boot+0x2a0/0x2b0
[   30.364373] Caller[00c4b1a0]: tlb_fixup_done+0x4c/0x6c
[   30.364383] Caller[0016a60c]: 0x16a60c
[   30.364387] Instruction DUMP:
[   30.364397] Unable to handle kernel NULL pointer dereference
[   30.364404] tsk->{mm,active_mm}->context = 00b5
[   30.364409] tsk->{mm,active_mm}->pgd = 800014594000
[   30.364416]   \|/  \|/
[   30.364416]   "@'/ .. \`@"
[   30.364416]   /_| \__/ |_\
[   30.364416]  \__U_/
[   30.364422] swapper/0(0): Oops [#3]
[   30.364436] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G  D E 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1
[   30.364447] TSTATE: 
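
For comparing traces like the one above across kernel versions, a small sketch that pulls the symbol, offset and function size out of each `[<addr>] symbol+0xoff/0xsize` line. The regular expression and the `parse_trace` helper are illustrative assumptions, not part of any kernel tooling; the sample lines are copied from the log in this message.

```python
import re

# Matches "[<00c5394c>] do_sparc64_fault+0xa4c/0xa80" style call-trace frames.
# Frames without a symbol (e.g. "[<07c0>] 0x7c0") are deliberately skipped.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+?)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_trace(lines):
    """Return (symbol, offset, size) tuples for every symbolised frame."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            _addr, symbol, offset, size = match.groups()
            frames.append((symbol, int(offset, 16), int(size, 16)))
    return frames


sample = [
    "[   30.363808] [<00c5394c>] do_sparc64_fault+0xa4c/0xa80",
    "[   30.363829] [<00407714>] sparc64_realfault_common+0x10/0x20",
    "[   30.363839] [<07c0>] 0x7c0",
    "[   30.363852] [<00c519a8>] default_idle_call+0x48/0x140",
]

frames = parse_trace(sample)
for symbol, offset, size in frames:
    print(f"{symbol} +{offset:#x} of {size:#x}")
```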

Re: getting a working install ISOs on a T2000

2021-03-09 Thread John Paul Adrian Glaubitz
Hello!

On 3/9/21 12:28 PM, Riccardo Mottola wrote:
> I tried hard installing Debian/sparc64, it was not easy at all and haven't
> concluded.
>
> The T2000 I started from had Linux already installed, with an older 4.x
> series kernel, I'd guess not updated since 3 years. It was working and was
> configured with SILO. I tried updating but the boot partition was too small
> to fit old and new kernels, also the (partially?) installed 5.x kernel on
> reboot entered in an endless loop of crashes I could not stop nor log.
>
> Unfortunately the fresh install I did with one of the working ISOs suffers
> from the same crashes!
>
>
> I then went on with snapshots, going back from the latest I found...
>
> 2020-11-16 -> this one worked! (but system is unbootable due to crash, of
> that in a second mail)

This sounds like a hardware problem. The newer images should all work on
sparc64 with a few images that don't.

Can you make sure the memory is ok, i.e. by installing Solaris?

Adrian




getting a working install ISOs on a T2000

2021-03-09 Thread Riccardo Mottola

Hi,

I tried hard to install Debian/sparc64; it was not easy at all and I 
haven't finished.


The T2000 I started from had Linux already installed, with an older 4.x 
series kernel, I'd guess not updated for 3 years. It was working and 
was configured with SILO. I tried updating but the boot partition was 
too small to fit old and new kernels. Also, the (partially?) installed 
5.x kernel entered an endless loop of crashes on reboot that I could 
neither stop nor log.


Unfortunately the fresh install I did with one of the working ISOs 
suffers from the same crashes!



I then went on with snapshots, going back from the latest I found...

2020-11-16 -> this one worked! (but system is unbootable due to crash, 
of that in a second mail)


The later snapshots all die in the same manner:
-- with 2020-12-03
-- with 2021-01-03
-- with 2021-02-02

Mar  8 23:43:22 main-menu[272]: WARNING **: Menu item 'localechooser' failed.
Mar  8 23:43:22 main-menu[279]: /var/lib/dpkg/status: No such file or directory
Mar  8 23:43:22 main-menu[279]: WARNING **: Configuring 'libdebian-installer4-u
Mar  8 23:43:22 main-menu[279]: WARNING **: Menu item 'localechooser' failed.

Mar  8 23:43:22 kernel: [   33.630772] random: crng init done
Mar  8 23:43:41 main-menu[279]: INFO: Modifying debconf priority limit from 'hi

Mar  8 23:43:41 debconf: Setting debconf/priority to medium
Mar  8 23:43:41 kernel: [   52.813059] main-menu[279]: segfault at 8 ip 010
Mar  8 23:50:59 init: process '/sbin/debian-installer' (pid 238) exited. Schedu
Mar  8 23:50:59 init: starting pid 293, tty '/dev/ttyHV0': '/sbin/debian-instal


<>

Mar  8 23:51:00 debconf: Setting debconf/language to en
Mar  8 23:51:00 kernel: [  491.248772] main-menu[312]: segfault at 8 ip 010]