I was able to fix the boot error on t3a (AMD EPYC-based) instances (kernel: protection fault trap at lapic_set_lvt:rdmsr) with this patch (tested against 6.9):

Index: arch/amd64/amd64/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
retrieving revision 1.57
diff -u -p -r1.57 lapic.c
--- arch/amd64/amd64/lapic.c    6 Sep 2020 20:50:00 -0000 1.57
+++ arch/amd64/amd64/lapic.c    16 May 2021 15:25:55 -0000
@@ -300,7 +300,8 @@ lapic_set_lvt(void)
                 *   #32559 revision 3.00
                 */
                if ((cpu_id & 0x00000f00) == 0x00000f00 &&
-                   (cpu_id & 0x0fff0000) >= 0x00040000) {
+                   (cpu_id & 0x0fff0000) >= 0x00040000 &&
+                   (cpu_id & 0x0fff0000) < 0x00800000) {
                        uint64_t msr;

                        msr = rdmsr(MSR_INT_PEN_MSG);

It seems EPYC CPUs no longer need the workaround that is being applied here.
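
To illustrate what the extra bound changes, here is a small standalone sketch of the two conditions. It assumes cpu_id holds the CPUID leaf 1 EAX signature (base family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20). The function names are made up for this example and do not exist in lapic.c, and 0x00800f12 is only an assumed example signature for a first-generation EPYC (family 17h).

```c
#include <stdint.h>

/* Condition as in lapic.c r1.57: base family 0xf, and the combined
 * extended family/model field at or above extended model 4. */
int
workaround_old(uint32_t cpu_id)
{
	return (cpu_id & 0x00000f00) == 0x00000f00 &&
	    (cpu_id & 0x0fff0000) >= 0x00040000;
}

/* Condition with the patch: additionally require extended family < 8,
 * which excludes display family 17h and later. */
int
workaround_new(uint32_t cpu_id)
{
	return (cpu_id & 0x00000f00) == 0x00000f00 &&
	    (cpu_id & 0x0fff0000) >= 0x00040000 &&
	    (cpu_id & 0x0fff0000) < 0x00800000;
}

/* Display family: base family plus extended family when base is 0xf. */
unsigned int
display_family(uint32_t cpu_id)
{
	unsigned int base = (cpu_id >> 8) & 0xf;
	unsigned int ext = (cpu_id >> 20) & 0xff;

	return base == 0xf ? base + ext : base;
}
```

With the assumed signature 0x00800f12, workaround_old() is true and workaround_new() is false, so the rdmsr is skipped. For a family 0Fh part such as 0x00040f12, both conditions remain true.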

Of course, the OS still wasn't able to boot completely: the NVMe driver doesn't work ("unable to create io q"), and there is no NIC support.

On 10/7/2020 7:01 PM, Kirill Peskov wrote:
OK, looks like ENA (Elastic Network Adapter) is the main show stopper here.

There is a glimmer of hope, though; a FreeBSD port of the ENA driver is
already out there:

https://github.com/amzn/amzn-drivers/tree/master/kernel/fbsd/ena

I'm trying to capture the AMD-specific crash logs from t3a-type instances
so I can post them here.
