Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-11 Thread Gregor Riepl
> How should I proceed? Which kernel sources?
> 
> https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-common-official
> 
> 
> Is section 4.3 correct for me? Or 4.6?

You should clone the upstream Git repo, otherwise bisecting will be much
more difficult.

I think these instructions are still valid:
https://wiki.debian.org/DebianKernel/GitBisect

You can also skip the Debian-specific stuff and simply do
make -j8 && make modules_install && make install

It's better to use at least a compatible kernel config, though.
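
A rough sketch of how such a bisect could look is below. It is only an
illustration under several assumptions that do not come from this thread:
that the regression sits between the upstream tags v5.9 and v5.10, that the
config of the running Debian kernel is a usable starting point, and that the
helper names (run, bisect_setup, bisect_build) are made up for this example.
By default it only prints the commands; set RUN=1 to really execute them.

```shell
#!/bin/sh
# Sketch of a kernel bisect between two upstream tags.
# Assumptions (not from the thread): v5.9 is good, v5.10 is bad.
set -eu

GOOD=${GOOD:-v5.9}
BAD=${BAD:-v5.10}
REPO=${REPO:-https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git}

# Print each command; execute it only when RUN=1 is set.
run() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    fi
}

# One-time setup: clone upstream and tell git the known good/bad points.
bisect_setup() {
    run git clone "$REPO" linux
    run cd linux
    run git bisect start
    run git bisect bad "$BAD"
    run git bisect good "$GOOD"
}

# Repeat for every bisect step: build and install the checked-out commit.
# modules_install and install need root privileges.
bisect_build() {
    run cp "/boot/config-$(uname -r)" .config   # reuse the running config
    run make olddefconfig                       # accept defaults for new options
    run make -j8
    run make modules_install
    run make install
}

bisect_setup
bisect_build
echo "# boot the new kernel, then run: git bisect good   (or: git bisect bad)"
```

After each test boot, "git bisect good" or "git bisect bad" checks out the
next commit to try, and only the build step needs to be repeated. Whether
"make install" also generates the initramfs depends on the installkernel
hooks on the system, so that is worth checking after the first build.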



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-11 Thread Frank Scheiner

Hi Riccardo,

On 11.03.21 23:03, Riccardo Mottola wrote:

Hi Frank!

I suppose the Niagara CPU gives the kernel issue


From [1] I assume T2 CPUs are not affected, but yeah, the issue could
be so selective that it only affects the very first generation.

[1]: https://lists.debian.org/debian-sparc/2021/03/msg00010.html



Frank Scheiner wrote:

If I remember there was a repository with many snapshots of different
versions, already as package, which one can test quickly. That way we
can restrict breakage range without git bisect.

Do you have a link?


I assume you mean "http://snapshot.debian.org".


Exactly. With this I did some more tests.

Still Works:
5.9.0-4-sparc64-smp #1 SMP Debian 5.9.11-1 (2020-11-27)
5.9.0-5-sparc64-smp #1 SMP Debian 5.9.15-1 (2020-12-17)

Broken:

linux-image-5.10.0-trunk-sparc64-smp_5.10.2-1~exp1_sparc64.deb

So later 5.9 series kernels continue to work, while even very early 5.10 kernels do not.

Do you know if I can reset the system via the serial console?


Reset from the serial console might work via the kernel with the [magic
system request] functionality.

[magic system request]:
https://www.kernel.org/doc/html/v4.11/admin-guide/sysrq.html
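
If a root shell is still reachable (for example over ssh), the same SysRq
actions can also be triggered by writing command keys to /proc/sysrq-trigger.
The following is only a minimal sketch: the function names and the TRIGGER
and DELAY variables are made up for this example, and it has not been tried
on a T2000. On a serial console the documented way is to send a BREAK
followed by the command key within a few seconds; whether that gets through
on this machine's console is not confirmed in this thread.

```shell
#!/bin/sh
# Sketch: sync, remount read-only, then reboot via magic SysRq.
# Assumption: a root shell is still reachable on the affected machine.
set -eu

# TRIGGER and DELAY can be overridden, e.g. to try the sequence on a scratch file.
TRIGGER=${TRIGGER:-/proc/sysrq-trigger}
DELAY=${DELAY:-2}

# Write one SysRq command key to the trigger file.
sysrq() {
    printf '%s' "$1" > "$TRIGGER"
}

# Emergency reboot: s = sync, u = remount read-only, b = reboot immediately.
sysrq_reboot() {
    sysrq s
    sleep "$DELAY"
    sysrq u
    sleep "$DELAY"
    sysrq b
}
```

The block only defines the functions. Calling sysrq_reboot as root reboots
the machine right away, without a clean shutdown.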

But you can always reset the system using the SC. The T1000 (and the
T2000, too) has both a serial port (on the T2000 to the right of the DB-9
ttya port; it should work with a blue Cisco serial cable) and a network
port (on the T2000 above the two USB ports). The serial port of the SC
automatically switches to the system console after some (configurable)
time, and you need to escape to the SC login prompt with a configurable
key sequence (`#.` by default, see [2]).

[2]:
https://docs.oracle.com/cd/E19076-01/t2k.srvr/819-2549-12/ontario-consoleConfig.html#28277


I tried sending a break on the serial console, but the errors just keep
running.
The break is received, since I see it as an SC Alert, but I am not put into
the console. Maybe there is some further trick on these newer machines?


So you already got access to the SC. Then you can reset the machine from
there, too.


I am used to old SparcStations and UltraSparc Netras, where that was
sufficient.
It is inconvenient to power-cycle at every hang, since at every power-on
it runs a self-test which lasts minutes :)


I think, depending on the SC configuration, these machines also run a
self-test every X resets, but this should be configurable.

Hope that helps
Cheers,
Frank



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-11 Thread Gregor Riepl


> Do you know if I can reset the system via the serial console?
> I tried sending a break on the serial console, but the errors just keep
> running.
> The break is received, since I see it as an SC Alert, but I am not put into
> the console. Maybe there is some further trick on these newer machines? I am
> used to old SparcStations and UltraSparc Netras, where that was sufficient.
> It is inconvenient to power-cycle at every hang, since at every power-on
> it runs a self-test which lasts minutes :)

According to this, you should be able to reach the system console
through the SER MGT port:
https://unixed.com/index.php/2013/06/16/accessing-the-sparc-system-console/
NET MGT is probably easier, but you'll have to set it up first.

Perhaps you can also attach a USB keyboard and press the break key to
get into the system console, then type "reset" to boot the machine? Not
sure if this works without a monitor though. And you might need to enter
the system password first, if it's set.



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-11 Thread Riccardo Mottola

Hi Adrian

John Paul Adrian Glaubitz wrote:

Well, that doesn't really help you though. You want to find the commit in
question, just the range isn't enough to solve the issue.


Well, it helped a little bit: it is something early in the 5.10 series.
Also, I now have an apparently working kernel from the 5.9 series (who
knows how stable under load?).



If you have a fast second machine available, bisecting the problem
shouldn't take too long.


Well, this machine has plenty of RAM, disk space and a good connection.
How fast the CPU is at compiling a kernel I don't know, but we can try.
Power consumption is not so much worse than a PC, but it is darn loud!
Like a vacuum cleaner... I need to stay out of the room, but I found an
acceptable setup. I use a workstation with a serial console connected to
it, then connect through ssh to the workstation and through that into the
management.


Although I have been compiling kernels on Gentoo Linux for 15 years, I
never did so on Debian. Here we have init images.



How should I proceed? Which kernel sources?

https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-common-official

Is section 4.3 correct for me? Or 4.6?

Please guide me

Riccardo



Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-03-11 Thread Riccardo Mottola

Hi Frank!

I suppose the Niagara CPU gives the kernel issue

Frank Scheiner wrote:

If I remember there was a repository with many snapshots of different
versions, already as package, which one can test quickly. That way we
can restrict breakage range without git bisect.

Do you have a link?


I assume you mean "http://snapshot.debian.org".


Exactly. With this I did some more tests.

Still Works:
5.9.0-4-sparc64-smp #1 SMP Debian 5.9.11-1 (2020-11-27)
5.9.0-5-sparc64-smp #1 SMP Debian 5.9.15-1 (2020-12-17)

Broken:

linux-image-5.10.0-trunk-sparc64-smp_5.10.2-1~exp1_sparc64.deb

So later 5.9 series kernels continue to work, while even very early 5.10 kernels do not.

Do you know if I can reset the system via the serial console?
I tried sending a break on the serial console, but the errors just keep
running.
The break is received, since I see it as an SC Alert, but I am not put into
the console. Maybe there is some further trick on these newer machines? I am
used to old SparcStations and UltraSparc Netras, where that was sufficient.
It is inconvenient to power-cycle at every hang, since at every power-on
it runs a self-test which lasts minutes :)


Riccardo