Hi guys,
On 11.12.21 18:59, John Paul Adrian Glaubitz wrote:
On 12/11/21 18:40, Riccardo Mottola wrote:
I remember you bisected about the breaking commits. Has there been any progress?
A better place where to report this issue other than this mailing list?
The proper place is to send an email to the author of the breaking commit and
CC the sparclinux Linux kernel mailing list. Most kernel developers don't read
the debian-sparc mailing list.
We actually did discuss this in late March 2021 starting here:
https://lists.debian.org/debian-sparc/2021/03/msg00045.html
...with Christoph Hellwig and CCed to sparcli...@vger.kernel.org and
this list, but no solution back then.
****
Back in October I did some testing on various UltraSPARC machines to
sort out which processor (generation)s are affected, but didn't find the
time to make something out of it apart from notes and a conclusion.
I couldn't get my Ultra 80 to netboot, so no result for UltraSPARC II.
My Ultra 10 with US IIi worked though with kernel 5.14.0-3.
My 280r with US III worked with kernel 5.9.0-5 and with 5.14.0-3 gives:
```
Begin: Retrying nfs mount ... mount: Invalid argument
done.
```
...when trying to mount the root FS.
My V480 crashes with 5.14.0-3, but it has crashed with every kernel
version I have tried since I got it, so that is perfectly normal for this
machine. I don't know what the issue is, because hardware-wise the 280r
(which works with 5.9.0-5) seems to be very similar, though with only 2
processors instead of 4 for the V480.
My T5220 with T2 crashed once with 5.14.0-3 but worked with 5.14.0-4. It
later also worked with 5.14.0-3. And the crash happened way before a
mount of the root FS was tried, so possibly unrelated.
My T1000 with T1 panics with 5.14.0-3 because it can't mount the root
FS. Using `break=premount` in the kernel command line and issuing the
mount command manually gives:
```
(initramfs) nfsmount -o nolock "172.16.0.2:/srv/nfs/t1000/root" "$rootmnt"
[ 641.272949] Unable to handle kernel paging request at virtual address 0000612000000000
[ 641.273138] tsk->{mm,active_mm}->context = 000000000000038f
[ 641.273248] tsk->{mm,active_mm}->pgd = ffff800016c1c000
[ 641.273310] \|/ ____ \|/
[ 641.273310] "@'/ .. \`@"
[ 641.273310] /_| \__/ |_\
[ 641.273310] \__U_/
[ 641.273444] nfsmount(750): Oops [#182]
[ 641.273497] CPU: 12 PID: 750 Comm: nfsmount Tainted: G D E 5.14.0-3-sparc64-smp #1 Debian 5.14.12-1
[ 641.273603] TSTATE: 0000000011001607 TPC: 000000000069ce48 TNPC: 000000000069ce4c Y: 00000000 Tainted: G D E
[ 641.273705] TPC: <kfree+0x48/0x400>
[ 641.273775] g0: 0000000000000006 g1: 0000000400000000 g2: 0000600000000000 g3: ffff8001fda18000
[ 641.273858] g4: ffff800013b13340 g5: ffff8001fda18000 g6: ffff800016bd0000 g7: ffff800016bd3c30
[ 641.273942] o0: fffffffffffffffe o1: 00000000006f4c94 o2: 0000000000002000 o3: ffff8000146d3aa8
[ 641.274024] o4: 0000000000000008 o5: 0000000000000cc0 sp: ffff800016bd34a1 ret_pc: 00000000006f4c54
[ 641.274107] RPC: <sys_mount+0x74/0x1a0>
[ 641.274165] l0: 0000000000f1a000 l1: 000000000111f000 l2: 0000000000422db4 l3: 0000000000201db0
[ 641.274292] l4: 000000000000029c l5: ffff80010000c1a0 l6: ffff800016bd0000 l7: 00000000006f4be0
[ 641.274377] i0: 0000000000000cc0 i1: 0000000000201fe0 i2: 0000000000000001 i3: ffff800016bd3dd0
[ 641.274460] i4: 0000000000000000 i5: 0000612000000000 i6: ffff800016bd3561 i7: 00000000006f4c94
[ 641.274542] I7: <sys_mount+0xb4/0x1a0>
[ 641.274599] Call Trace:
[ 641.274640] [<00000000006f4c94>] sys_mount+0xb4/0x1a0
[ 641.274712] [<00000000006f4c54>] sys_mount+0x74/0x1a0
[ 641.274783] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
[ 641.274866] Caller[00000000006f4c94]: sys_mount+0xb4/0x1a0
[ 641.274939] Caller[00000000006f4c54]: sys_mount+0x74/0x1a0
[ 641.275011] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
[ 641.275090] Caller[0000000000100aa8]: 0x100aa8
[ 641.275143] Instruction DUMP:
[ 641.275150] ba074001
[ 641.275192] bb2f7003
[ 641.275233] ba074002
[ 641.275274] <c25f6008>
[ 641.275314] 84086001
[ 641.275355] 82007fff
[ 641.275395] 8378841d
[ 641.275436] ba100001
[ 641.275525] c2586008
[ 641.275614]
Killed
```
Doing the same on a V210 with US IIIi gives:
```
(initramfs) nfsmount -o nolock "172.16.0.2:/srv/nfs/v210/root" "$rootmnt"
mount: Invalid argument
(initramfs) echo $?
1
```
...so similar to 280r with US III.
From all that, I assume UltraSPARC IIi driven machines (and most likely
also older ones with US II) are not affected by this, and neither are
UltraSPARC T2 driven ones and possibly machines with newer processors
(I didn't have time to try one of my T5240s with T2+).
UltraSPARC III, IIIi and T1 driven machines are affected and to me it
now looks more like some of the klibc programs from the initramfs are at
fault.
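To keep the results in one place, here is a small Python sketch that only tabulates what I reported above and groups it by outcome. The tuple layout and the True/False/None encoding are my own choice for illustration. The V480's CPU is left unset because I didn't name it above, and it gave no usable result anyway.

```python
# Test results as reported in this mail (kernel 5.14.0-3 unless noted).
# mount_ok: True = root FS mount worked, False = it failed, None = no result.
RESULTS = [
    ("Ultra 80", "UltraSPARC II", None, "could not netboot"),
    ("Ultra 10", "UltraSPARC IIi", True, "worked"),
    ("280r", "UltraSPARC III", False, "mount: Invalid argument"),
    ("V480", None, None, "crashes with every kernel tried"),
    ("T5220", "UltraSPARC T2", True, "one crash before the mount, worked later"),
    ("T1000", "UltraSPARC T1", False, "Oops in kfree during sys_mount"),
    ("V210", "UltraSPARC IIIi", False, "mount: Invalid argument"),
]


def cpus_by_outcome(results):
    """Group CPU names (or the machine name if the CPU is unset) by outcome."""
    groups = {"affected": [], "not affected": [], "no result": []}
    for machine, cpu, mount_ok, _note in results:
        if mount_ok is None:
            key = "no result"
        elif mount_ok:
            key = "not affected"
        else:
            key = "affected"
        groups[key].append(cpu or machine)
    return groups


if __name__ == "__main__":
    for outcome, names in cpus_by_outcome(RESULTS).items():
        print(f"{outcome}: {', '.join(names)}")
```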
I also tested my V210 with an on-disk root FS, and although mounting
seemed to work that way with 5.14.0-3, I ran into multiple problems
later on that crashed the machine.
My next step would have been to test mounting the root FS with
non-klibc programs, but I'm unsure how to get these into an initramfs.
With dracut, maybe?
Cheers,
Frank