Re: linux-image-amd64: failed to bring up processor 192

2023-10-14 Thread Andy Smith
Hi,

On Sat, Oct 14, 2023 at 05:15:04PM -0400, Jeffrey Mark Siskind wrote:
> When first installed, it ran kernel 6.1.0.12. That kernel found
> all 384 "CPUs". All were reported in /proc/cpuinfo. I subsequently
> did an apt upgrade which upgraded to 6.1.0.13. Upon boot, dmesg
> -lerr reports:
> 
>    root@poto:~# dmesg -lerr
>    [   11.080833] smpboot: do_boot_cpu failed(-1) to wakeup CPU#192
> 
> Further, /proc/cpuinfo is missing processor 192.
> 
> I subsequently installed 6.4.0-0-deb12.2-amd64 from bookworm
> backports. The issue still occurs. If I boot 6.1.0.12 I do not get the issue.

I think you should use "reportbug" to report the bug to the kernel
team.

You may like to try building a Debian kernel package from upstream
source:


https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-kernel-org-package

If that results in a package that works for you, that's something to
report in the bug.

If it doesn't, you can go back through the 6.1 kernel releases until
you find one that does work. Then you can report that.

Once you have one upstream source version that works and a later one
that doesn't, you can use "git bisect" between the two to narrow
things down until you find the exact commit that causes the issue
for you.
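
As a rough sketch of how each bisect step could be judged: the helper
below (bisect_verdict is a hypothetical name, not part of git or of
Debian) counts the "processor" entries in a cpuinfo-style file and
prints "good" or "bad". It assumes the only symptom you care about is
a CPU missing from /proc/cpuinfo. The tags in the commented workflow
are placeholders that you would fill in from your own testing.

```shell
# Hypothetical helper: print the verdict to hand to "git bisect"
# after booting a test kernel.
# Usage: bisect_verdict EXPECTED_COUNT [FILE]  (FILE defaults to /proc/cpuinfo)
bisect_verdict() {
    expected=$1
    file=${2:-/proc/cpuinfo}
    # grep -c exits non-zero when the count is 0, so ignore its status.
    found=$(grep -c '^processor' "$file" || true)
    if [ "$found" -eq "$expected" ]; then
        echo good
    else
        echo bad
    fi
}

# Manual workflow, run inside a clone of the upstream kernel tree.
# <bad-tag> and <good-tag> are placeholders, not real tag names:
#   git bisect start
#   git bisect bad  <bad-tag>     # a release that fails to bring up a CPU
#   git bisect good <good-tag>    # a release that brings up every CPU
#   ... build, install and reboot into the commit git checked out ...
#   git bisect "$(bisect_verdict 384)"
#   ... repeat until git names the first bad commit, then:
#   git bisect reset
```

Each step needs a kernel build and a reboot, so the verdict is given
by hand after every boot rather than through "git bisect run".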

The Debian kernel team may then be able to backport a fix to the
bookworm stable kernel, and submit it upstream if necessary.

Alternatively, the kernel team may have a way for you to step
through individual changes in between Debian packages 6.1.0.12 and
6.1.0.13.

Good hunting!

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



linux-image-amd64: failed to bring up processor 192

2023-10-14 Thread Jeffrey Mark Siskind
I purchased a new server based on the Supermicro AS-8125GS-TNHR. It
has two EPYC 9654 processors. Each processor has 96 cores and 192 hyperthreads.
Thus the system as a whole has 192 cores and 384 hyperthreads. I run
Debian stable bookworm. When first installed, it ran kernel
6.1.0.12. That kernel found all 384 "CPUs". All were reported in /proc/cpuinfo.
I subsequently did an apt upgrade which upgraded to 6.1.0.13. Upon
boot, dmesg -lerr reports:

   root@poto:~# dmesg -lerr
   [   11.080833] smpboot: do_boot_cpu failed(-1) to wakeup CPU#192

Further, /proc/cpuinfo is missing processor 192.
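
A check along the following lines can list which processor indexes
are absent. This is only a sketch: missing_cpus is a hypothetical
helper name, and it assumes the usual "processor : N" lines found in
/proc/cpuinfo on x86.

```shell
# Print every processor index in 0..EXPECTED_COUNT-1 that has no
# "processor : N" line in a cpuinfo-style file.
# Usage: missing_cpus EXPECTED_COUNT [FILE]  (FILE defaults to /proc/cpuinfo)
missing_cpus() {
    expected=$1
    file=${2:-/proc/cpuinfo}
    awk -v n="$expected" '
        /^processor[ \t]*:/ { seen[$NF] = 1 }   # remember each index present
        END {
            for (i = 0; i < n; i++)
                if (!(i in seen)) print i       # report the gaps
        }
    ' "$file"
}
```

On this machine it would be run as "missing_cpus 384".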

I subsequently installed 6.4.0-0-deb12.2-amd64 from bookworm
backports. The issue still occurs. If I boot 6.1.0.12 I do not get the issue.

There are other reasons I cannot run 6.1.0.12 that I will describe in
a different bug report.

Jeff (http://engineering.purdue.edu/~qobi)