Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
> From: kiltz
> Date: Tue, 14 Jun 2022 13:27:03 +0200
>
> Dear Mark,
> first of all, thanks again for your efforts!
> After testing the new snapshot, in total 64 cores are recognized (see
> attached screenshot) before the kernel panics again.
> Since the CPU Ampere Altra Q80-33 processor has 80 cores, I suspect
> there are only two possibilities - either after initializing 64 cores
> the kernel hit a brick wall of sorts or somehow the limit was set
> to 64 cores?
> Best wishes,

Hi Stefan,

I don't understand what's happening here.  To help us out can you:

1. Boot the machine with a single-processor kernel by typing "bsd.sp"
   at the boot> prompt?
2. Send me the output of the "eeprom -p" command.
3. Send me the files in the /var/db/acpi directory.
4. Send the dmesg output.

Feel free to send me (kette...@openbsd.org) and Patrick
(patr...@openbsd.org) that output in private if you have concerns
sending it to a public mailing list.

Thanks,

Mark
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
Hello all,
thanks once more for the support, you people are truly great!
I will have access to the machine in the second part of the day and will
for sure give it a spin!
And of course, if all works out, I will provide you with a dmesg of the
successful start.
I will definitely keep you posted!
Best wishes,
Stefan

On Mon, June 13, 2022 11:08 pm, Patrick Wildt wrote:
> On Mon, Jun 13, 2022 at 07:44:24PM +0200, Mark Kettenis wrote:
>
>>> From: kiltz
>>> Date: Mon, 13 Jun 2022 18:12:27 +0200
>>>
>>> Dear Mark,
>>> first of all, thank you very much for your explanations, the diff
>>> and, indeed, the ultra swift reply!
>>> That helps us a lot already.
>>> A snapshot with a higher value of max CPUs out of the box, of course,
>>> would be the proverbial icing on the cake.
>>> Probably a strange question but I hazard it anyways - should we
>>> monitor the /pub/OpenBSD/snapshots directory or is
>>> there a quicker way to find out what your fellow developers think?
>>> Again, many thanks for your help and best wishes,
>>
>> Hi Stefan,
>>
>> Theo put that diff in snapshots.  I suspect that tomorrow's snapshot
>> will have it.  You can easily tell, since all 80 CPUs should attach
>> with that diff.
>>
>> Cheers,
>>
>> Mark
>
> And it's nice to hear that SP install already worked.  I remember
> booting it up on an Oracle machine with an Ampere Altra which led to
> messages like
>
>   agintcmsi0 at agintc0: unsupported type 0x001700026f31
>
> See http://ix.io/3GEX
>
> While I had a diff somewhere to 'fix that', I never got the timer
> interrupt to fire.
>
> That you already had an SP install means all that should be fine.
> If this change/new snap works, I'd be interested to read a full
> dmesg!
>
> Cheers,
> Patrick
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
On Mon, Jun 13, 2022 at 07:44:24PM +0200, Mark Kettenis wrote:
> > From: kiltz
> > Date: Mon, 13 Jun 2022 18:12:27 +0200
> >
> > Dear Mark,
> > first of all, thank you very much for your explanations, the diff
> > and, indeed, the ultra swift reply!
> > That helps us a lot already.
> > A snapshot with a higher value of max CPUs out of the box, of course,
> > would be the proverbial icing on the cake.
> > Probably a strange question but I hazard it anyways - should we
> > monitor the /pub/OpenBSD/snapshots directory or is
> > there a quicker way to find out what your fellow developers think?
> > Again, many thanks for your help and best wishes,
>
> Hi Stefan,
>
> Theo put that diff in snapshots.  I suspect that tomorrow's snapshot
> will have it.  You can easily tell, since all 80 CPUs should attach
> with that diff.
>
> Cheers,
>
> Mark

And it's nice to hear that SP install already worked.  I remember
booting it up on an Oracle machine with an Ampere Altra which led to
messages like

  agintcmsi0 at agintc0: unsupported type 0x001700026f31

See http://ix.io/3GEX

While I had a diff somewhere to 'fix that', I never got the timer
interrupt to fire.

That you already had an SP install means all that should be fine.
If this change/new snap works, I'd be interested to read a full
dmesg!

Cheers,
Patrick
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
> From: kiltz
> Date: Mon, 13 Jun 2022 18:12:27 +0200
>
> Dear Mark,
> first of all, thank you very much for your explanations, the diff
> and, indeed, the ultra swift reply!
> That helps us a lot already.
> A snapshot with a higher value of max CPUs out of the box, of course,
> would be the proverbial icing on the cake.
> Probably a strange question but I hazard it anyways - should we
> monitor the /pub/OpenBSD/snapshots directory or is
> there a quicker way to find out what your fellow developers think?
> Again, many thanks for your help and best wishes,

Hi Stefan,

Theo put that diff in snapshots.  I suspect that tomorrow's snapshot
will have it.  You can easily tell, since all 80 CPUs should attach
with that diff.

Cheers,

Mark

> --
> Dr.-Ing. Stefan Kiltz
>
> Otto-von-Guericke University of Magdeburg
> ITI Research Group on
> Multimedia and Security
> Universitaetsplatz 2
> 39106 Magdeburg
> Germany
>
> Tel: +49-391-67-52838
> Fax: +49-391-67-18110
>
> eMail: ki...@iti.cs.uni-magdeburg.de
>
> On 13 Jun 2022, at 17:20, Mark Kettenis wrote:
>
> >> From: kiltz
> >> Date: Mon, 13 Jun 2022 14:46:39 +0200
> >
> > Hi Stefan,
> >
> >> Dear kind people at OpenBSD.org,
> >> we want to run OpenBSD as a firewall system on a Gigabyte R152_P30
> >> with the following specifications:
> >>
> >>    Ampere Altra Q80-33 processor (80 Cores, 3.3 GHz)
> >>    512 GB RAM (3200 MHz ECC-reg.)
> >>    2 x 480 GB SSD SATA 6 Gb/s 2.5''
> >>    Dual-Port 1 GbE (RJ-45)
> >>    IPMI 2.0 Baseboard Management Controller (BMC)
> >>    1 x PCIe4.0 x16 (FHHL)
> >>    1 x PCIe3.0 x16 OCP2.0 (occupied)
> >>    1 x USB 3.0 (front), 3 x USB 3.0 (rear), 1 x VGA (rear)
> >>
> >> We tried both:
> >> - official stable 7.1 (/pub/OpenBSD/7.1/arm64) and
> >> - snapshot from 6th of June 2022 (/pub/OpenBSD/snapshots/arm64)
> >>
> >> The repeatable result is a working install in single CPU/Core
> >> installation mode, cpu panic after first reboot with mp kernel.  We
> >> use the serial to LAN console provided by the IPMI/BMC card.
> >> Attached you will find screenshots from:
> >>
> >> - the last 49 columns of the reboot into mp kernel
> >>   (Screenshot_boot_after_install_Gigabyte_R152_P30 at 2022-06-13
> >>   13-51-00.png),
> >> - the ddb trace output (Screenshot ddb_trace_2022-06-13
> >>   14-02-11.png),
> >> - the ddb ps output (Screenshot ddb_ps_at 2022-06-13 14-03-25.png),
> >> - the ddb show panic output (Screenshot ddb_show_panic_at 2022-06-13
> >>   14-04-28.png)
> >> - the ddb show registers output (Screenshot ddb_show_registers_at
> >>   2022-06-13 14-06-34.png)
> >>
> >> Due to the nature of the early boot panic, the kernel output is not
> >> accessible to us.
> >>
> >> Interestingly, FreeBSD only supports them in their current release,
> >> the stable fails with a similar panic.  They seem to have found a fix
> >> of sorts.  But we very much prefer OpenBSD for the firewalling role of
> >> aforementioned system.
> >>
> >> Of course we support your effort so if you need more info from us
> >> regarding the circumstances, we will happily try and supply the
> >> required information.
> >
> > The immediate problem is that OpenBSD currently supports a maximum of
> > 32 CPUs.  That limit is a bit arbitrary, so the diff below bumps it to
> > 128.  You could try building a GENERIC.MP kernel with this diff after
> > booting the GENERIC (bsd.sp) single-processor kernel.  I'll see what
> > my fellow developers think about bumping MAXCPUS.  Depending on the
> > outcome of that a snapshot with this change may be available in a few
> > days.
> >
> > I'm not sure how well OpenBSD/arm64 scales to 80 CPUs.  Probably not
> > very well but I guess there is only one way to find out...
> >
> > Cheers,
> >
> > Mark
> >
> >
> > Index: arch/arm64/include/cpu.h
> > ===================================================================
> > RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
> > retrieving revision 1.25
> > diff -u -p -r1.25 cpu.h
> > --- arch/arm64/include/cpu.h	23 Mar 2022 23:36:35 -0000	1.25
> > +++ arch/arm64/include/cpu.h	13 Jun 2022 15:09:32 -0000
> > @@ -184,7 +184,7 @@ extern struct cpu_info *cpu_info_list;
> >  #define CPU_INFO_FOREACH(cii, ci)	for (cii = 0, ci = cpu_info_list; \
> >  					ci != NULL; ci = ci->ci_next)
> >  #define CPU_INFO_UNIT(ci)	((ci)->ci_dev ? (ci)->ci_dev->dv_unit : 0)
> > -#define MAXCPUS	32
> > +#define MAXCPUS	128
> >
> >  extern struct cpu_info *cpu_info[MAXCPUS];
> >
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>
> iEYEARECAAYFAmKnYesACgkQuLKZPfaiT0iDDgCfXC6QIWGHzkMyWxPKHCaTkYwR
> AXUAnjLiJX1RyuqrMejk4AT2s5X99fmi
> =pRhT
> -----END PGP SIGNATURE-----
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dear Mark,
first of all, thank you very much for your explanations, the diff
and, indeed, the ultra swift reply!
That helps us a lot already.
A snapshot with a higher value of max CPUs out of the box, of course,
would be the proverbial icing on the cake.
Probably a strange question but I hazard it anyways - should we
monitor the /pub/OpenBSD/snapshots directory or is
there a quicker way to find out what your fellow developers think?
Again, many thanks for your help and best wishes,
Stefan

--
Dr.-Ing. Stefan Kiltz

Otto-von-Guericke University of Magdeburg
ITI Research Group on
Multimedia and Security
Universitaetsplatz 2
39106 Magdeburg
Germany

Tel: +49-391-67-52838
Fax: +49-391-67-18110

eMail: ki...@iti.cs.uni-magdeburg.de

On 13 Jun 2022, at 17:20, Mark Kettenis wrote:

>> From: kiltz
>> Date: Mon, 13 Jun 2022 14:46:39 +0200
>
> Hi Stefan,
>
>> Dear kind people at OpenBSD.org,
>> we want to run OpenBSD as a firewall system on a Gigabyte R152_P30
>> with the following specifications:
>>
>>    Ampere Altra Q80-33 processor (80 Cores, 3.3 GHz)
>>    512 GB RAM (3200 MHz ECC-reg.)
>>    2 x 480 GB SSD SATA 6 Gb/s 2.5''
>>    Dual-Port 1 GbE (RJ-45)
>>    IPMI 2.0 Baseboard Management Controller (BMC)
>>    1 x PCIe4.0 x16 (FHHL)
>>    1 x PCIe3.0 x16 OCP2.0 (occupied)
>>    1 x USB 3.0 (front), 3 x USB 3.0 (rear), 1 x VGA (rear)
>>
>> We tried both:
>> - official stable 7.1 (/pub/OpenBSD/7.1/arm64) and
>> - snapshot from 6th of June 2022 (/pub/OpenBSD/snapshots/arm64)
>>
>> The repeatable result is a working install in single CPU/Core
>> installation mode, cpu panic after first reboot with mp kernel.  We
>> use the serial to LAN console provided by the IPMI/BMC card.
>> Attached you will find screenshots from:
>>
>> - the last 49 columns of the reboot into mp kernel
>>   (Screenshot_boot_after_install_Gigabyte_R152_P30 at 2022-06-13
>>   13-51-00.png),
>> - the ddb trace output (Screenshot ddb_trace_2022-06-13
>>   14-02-11.png),
>> - the ddb ps output (Screenshot ddb_ps_at 2022-06-13 14-03-25.png),
>> - the ddb show panic output (Screenshot ddb_show_panic_at 2022-06-13
>>   14-04-28.png)
>> - the ddb show registers output (Screenshot ddb_show_registers_at
>>   2022-06-13 14-06-34.png)
>>
>> Due to the nature of the early boot panic, the kernel output is not
>> accessible to us.
>>
>> Interestingly, FreeBSD only supports them in their current release,
>> the stable fails with a similar panic.  They seem to have found a fix
>> of sorts.  But we very much prefer OpenBSD for the firewalling role of
>> aforementioned system.
>>
>> Of course we support your effort so if you need more info from us
>> regarding the circumstances, we will happily try and supply the
>> required information.
>
> The immediate problem is that OpenBSD currently supports a maximum of
> 32 CPUs.  That limit is a bit arbitrary, so the diff below bumps it to
> 128.  You could try building a GENERIC.MP kernel with this diff after
> booting the GENERIC (bsd.sp) single-processor kernel.  I'll see what
> my fellow developers think about bumping MAXCPUS.  Depending on the
> outcome of that a snapshot with this change may be available in a few
> days.
>
> I'm not sure how well OpenBSD/arm64 scales to 80 CPUs.  Probably not
> very well but I guess there is only one way to find out...
>
> Cheers,
>
> Mark
>
>
> Index: arch/arm64/include/cpu.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
> retrieving revision 1.25
> diff -u -p -r1.25 cpu.h
> --- arch/arm64/include/cpu.h	23 Mar 2022 23:36:35 -0000	1.25
> +++ arch/arm64/include/cpu.h	13 Jun 2022 15:09:32 -0000
> @@ -184,7 +184,7 @@ extern struct cpu_info *cpu_info_list;
>  #define CPU_INFO_FOREACH(cii, ci)	for (cii = 0, ci = cpu_info_list; \
>  					ci != NULL; ci = ci->ci_next)
>  #define CPU_INFO_UNIT(ci)	((ci)->ci_dev ? (ci)->ci_dev->dv_unit : 0)
> -#define MAXCPUS	32
> +#define MAXCPUS	128
>
>  extern struct cpu_info *cpu_info[MAXCPUS];

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAmKnYesACgkQuLKZPfaiT0iDDgCfXC6QIWGHzkMyWxPKHCaTkYwR
AXUAnjLiJX1RyuqrMejk4AT2s5X99fmi
=pRhT
-----END PGP SIGNATURE-----
Re: Bug report for Gigabyte_R152_P30 - Multiprocessor boot fails with kernel panic
> From: kiltz
> Date: Mon, 13 Jun 2022 14:46:39 +0200

Hi Stefan,

> Dear kind people at OpenBSD.org,
> we want to run OpenBSD as a firewall system on a Gigabyte R152_P30
> with the following specifications:
>
>    Ampere Altra Q80-33 processor (80 Cores, 3.3 GHz)
>    512 GB RAM (3200 MHz ECC-reg.)
>    2 x 480 GB SSD SATA 6 Gb/s 2.5''
>    Dual-Port 1 GbE (RJ-45)
>    IPMI 2.0 Baseboard Management Controller (BMC)
>    1 x PCIe4.0 x16 (FHHL)
>    1 x PCIe3.0 x16 OCP2.0 (occupied)
>    1 x USB 3.0 (front), 3 x USB 3.0 (rear), 1 x VGA (rear)
>
> We tried both:
> - official stable 7.1 (/pub/OpenBSD/7.1/arm64) and
> - snapshot from 6th of June 2022 (/pub/OpenBSD/snapshots/arm64)
>
> The repeatable result is a working install in single CPU/Core
> installation mode, cpu panic after first reboot with mp kernel.  We use
> the serial to LAN console provided by the IPMI/BMC card.
> Attached you will find screenshots from:
>
> - the last 49 columns of the reboot into mp kernel
>   (Screenshot_boot_after_install_Gigabyte_R152_P30 at 2022-06-13
>   13-51-00.png),
> - the ddb trace output (Screenshot ddb_trace_2022-06-13 14-02-11.png),
> - the ddb ps output (Screenshot ddb_ps_at 2022-06-13 14-03-25.png),
> - the ddb show panic output (Screenshot ddb_show_panic_at 2022-06-13
>   14-04-28.png)
> - the ddb show registers output (Screenshot ddb_show_registers_at
>   2022-06-13 14-06-34.png)
>
> Due to the nature of the early boot panic, the kernel output is not
> accessible to us.
>
> Interestingly, FreeBSD only supports them in their current release,
> the stable fails with a similar panic.  They seem to have found a fix
> of sorts.  But we very much prefer OpenBSD for the firewalling role of
> aforementioned system.
>
> Of course we support your effort so if you need more info from us
> regarding the circumstances, we will happily try and supply the
> required information.

The immediate problem is that OpenBSD currently supports a maximum of
32 CPUs.  That limit is a bit arbitrary, so the diff below bumps it to
128.  You could try building a GENERIC.MP kernel with this diff after
booting the GENERIC (bsd.sp) single-processor kernel.  I'll see what
my fellow developers think about bumping MAXCPUS.  Depending on the
outcome of that a snapshot with this change may be available in a few
days.

I'm not sure how well OpenBSD/arm64 scales to 80 CPUs.  Probably not
very well but I guess there is only one way to find out...

Cheers,

Mark


Index: arch/arm64/include/cpu.h
===================================================================
RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
retrieving revision 1.25
diff -u -p -r1.25 cpu.h
--- arch/arm64/include/cpu.h	23 Mar 2022 23:36:35 -0000	1.25
+++ arch/arm64/include/cpu.h	13 Jun 2022 15:09:32 -0000
@@ -184,7 +184,7 @@ extern struct cpu_info *cpu_info_list;
 #define CPU_INFO_FOREACH(cii, ci)	for (cii = 0, ci = cpu_info_list; \
 					ci != NULL; ci = ci->ci_next)
 #define CPU_INFO_UNIT(ci)	((ci)->ci_dev ? (ci)->ci_dev->dv_unit : 0)
-#define MAXCPUS	32
+#define MAXCPUS	128

 extern struct cpu_info *cpu_info[MAXCPUS];
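
Editorial note: the diff above changes a compile-time constant that sizes
the cpu_info[] table.  The following is a minimal userland sketch, not
OpenBSD source, of why such a constant caps how many CPUs can be
registered.  The helper names cpu_attach_sketch() and attach_all_sketch(),
the cpu_store[] array and the ci_cpuid field are hypothetical and exist
only for illustration.  The sketch simply refuses CPUs that do not fit;
it does not model what the real kernel does when the limit is exceeded
(the report above shows a panic).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins, NOT OpenBSD source.  MAXCPUS can be overridden
 * at compile time (for example cc -DMAXCPUS=128) to mimic the diff.
 */
#ifndef MAXCPUS
#define MAXCPUS	32
#endif

struct cpu_info {
	int ci_cpuid;				/* illustrative field only */
};

static struct cpu_info cpu_store[MAXCPUS];	/* backing storage */
struct cpu_info *cpu_info[MAXCPUS];		/* fixed-size table */
static int ncpus;				/* slots in use */

/* Register one CPU; returns 0 on success, -1 when the table is full. */
int
cpu_attach_sketch(int cpuid)
{
	if (ncpus >= MAXCPUS)
		return -1;
	cpu_store[ncpus].ci_cpuid = cpuid;
	cpu_info[ncpus] = &cpu_store[ncpus];
	ncpus++;
	assert(ncpus <= MAXCPUS);		/* table never overflows */
	return 0;
}

/* Offer `present` CPUs to the table; returns how many got a slot. */
int
attach_all_sketch(int present)
{
	int cpuid, attached = 0;

	for (cpuid = 0; cpuid < present; cpuid++) {
		if (cpu_attach_sketch(cpuid) == 0)
			attached++;
	}
	return attached;
}
```

Called once from a small main() as attach_all_sketch(80), the sketch
returns 32 with the default limit and 80 when built with -DMAXCPUS=128,
which mirrors the expectation in this thread that all 80 CPUs attach only
once the limit has been raised.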