https://bugzilla.kernel.org/show_bug.cgi?id=214469

            Bug ID: 214469
           Summary: apei_smca_report_x86_error BUG: using
                    smp_processor_id() in preemptible [00000000] code:
                    swapper/0/1
           Product: ACPI
           Version: 2.5
    Kernel Version: 5.14.2
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: acpi_ot...@kernel-bugs.osdl.org
          Reporter: nicholas.a.fr...@gmail.com
        Regression: No

The system reboots and, on the next boot, prints the BERT entry for an
unexpected / firmware-initiated reboot. This is great, except that after
"MSR Address" there should be a "Register Array" entry, which contains the
information needed to decode the error. Instead, we see "BUG: using
smp_processor_id() in preemptible [00000000] code: swapper/0/1" followed by a
stack trace.
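
For reference, here is a rough sketch of how the "MSR Address" and "Register
Array Size" values in the log below can be read. It assumes the AMD Scalable
MCA layout as I understand it (one block of 16 MSRs per bank starting at
0xC0002000) and 64-bit register array entries; the helper names are
hypothetical and this is not the kernel's code.

```python
# Assumption: Scalable MCA banks start at 0xC0002000, 16 MSRs per bank.
SMCA_BASE = 0xC0002000

# Assumption: per-bank register offsets in the Scalable MCA layout.
SMCA_REGS = {
    0: "MCA_CTL",
    1: "MCA_STATUS",
    2: "MCA_ADDR",
    3: "MCA_MISC0",
    4: "MCA_CONFIG",
    5: "MCA_IPID",
    6: "MCA_SYND",
}


def decode_smca_msr(msr_addr):
    """Return (bank, register name) for a Scalable MCA MSR address."""
    offset = msr_addr - SMCA_BASE
    if offset < 0:
        raise ValueError("not in the assumed Scalable MCA MSR range")
    bank = offset >> 4  # one 16-MSR block per bank
    reg = SMCA_REGS.get(offset & 0xF, "unknown")
    return bank, reg


def register_count(array_size_bytes):
    """Number of 64-bit entries in a register array of the given byte size."""
    return array_size_bytes // 8


print(decode_smca_msr(0xC0002001))  # MSR Address from the log
print(register_count(0x0050))       # Register Array Size from the log
```

Under these assumptions the context starts at the status register of bank 0,
which agrees with the "Bank 0" in the mce lines at the end of the log.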

I've filed this under ACPI, since mce_setup is being called by
apei_smca_report_x86_error, and I believe APEI is part of ACPI, but
please move/reclassify it as needed.

The system is an AMD Ryzen Threadripper PRO 3975WX (32 cores) with 128GB of
ECC memory, running "Linux 5.14.2-gentoo-x86_64 #2 SMP PREEMPT".

Please let me know if you need more information and I will provide it.

--

[  +0.000234] BERT: Error records from previous boot:
[  +0.000001] [Hardware Error]: event severity: fatal
[  +0.000001] [Hardware Error]:  Error 0, type: fatal
[  +0.000000] [Hardware Error]:  fru_text: ProcessorError
[  +0.000001] [Hardware Error]:   section_type: IA32/X64 processor error
[  +0.000001] [Hardware Error]:   Local APIC_ID: 0x3f
[  +0.000001] [Hardware Error]:   CPUID Info:
[  +0.000001] [Hardware Error]:   00000000: 00830f10 00000000 3f400800 00000000
[  +0.000001] [Hardware Error]:   00000010: 76d8320b 00000000 178bfbff 00000000
[  +0.000001] [Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
[  +0.000001] [Hardware Error]:   Error Information Structure 0:
[  +0.000000] [Hardware Error]:    Error Structure Type: cache error
[  +0.000001] [Hardware Error]:    Check Information: 0x00000000064d001f
[  +0.000001] [Hardware Error]:     Transaction Type: 1, Data Access
[  +0.000000] [Hardware Error]:     Operation: 3, data read
[  +0.000001] [Hardware Error]:     Level: 1
[  +0.000000] [Hardware Error]:     Processor Context Corrupt: true
[  +0.000001] [Hardware Error]:     Uncorrected: true
[  +0.000000] [Hardware Error]:   Context Information Structure 0:
[  +0.000001] [Hardware Error]:    Register Context Type: MSR Registers (Machine Check and other MSRs)
[  +0.000000] [Hardware Error]:    Register Array Size: 0x0050
[  +0.000001] [Hardware Error]:    MSR Address: 0xc0002001
[  +0.000001] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[  +0.000000] caller is mce_setup+0x32/0x100
[  +0.000004] CPU: 32 PID: 1 Comm: swapper/0 Not tainted 5.14.2-gentoo-x86_64 #2 ce26423c1c94c39ff48a0733dda399ae462242aa
[  +0.000002] Hardware name: ASUS System Product Name/Pro WS WRX80E-SAGE SE WIFI, BIOS 0602 07/13/2021
[  +0.000001] Call Trace:
[  +0.000002]  dump_stack_lvl+0x34/0x44
[  +0.000003]  check_preemption_disabled+0xd8/0xe0
[  +0.000002]  mce_setup+0x32/0x100
[  +0.000002]  apei_smca_report_x86_error+0x68/0x140
[  +0.000003]  cper_print_proc_ia.cold+0x3e4/0x5f8
[  +0.000002]  ? vprintk_emit+0xf7/0x1a0
[  +0.000002]  ? em_dev_register_perf_domain.cold+0x11/0xa3
[  +0.000003]  cper_estatus_print_section+0x813/0x9ea
[  +0.000002]  ? snprintf+0x49/0x60
[  +0.000003]  cper_estatus_print+0xad/0xe8
[  +0.000001]  bert_init+0x1b2/0x214
[  +0.000004]  ? setup_bert_disable+0x12/0x12
[  +0.000001]  do_one_initcall+0x41/0x1f0
[  +0.000002]  kernel_init_freeable+0x1fe/0x265
[  +0.000003]  ? rest_init+0xd0/0xd0
[  +0.000001]  kernel_init+0x16/0x110
[  +0.000001]  ret_from_fork+0x1f/0x30
[  +0.000007] [Hardware Error]:  Error 1, type: recoverable
[  +0.000001] [Hardware Error]:  fru_text: PcieError
[  +0.000001] [Hardware Error]:   section_type: PCIe error
[  +0.000000] [Hardware Error]:   port_type: 4, root port
[  +0.000001] [Hardware Error]:   version: 0.2
[  +0.000000] [Hardware Error]:   command: 0x0003, status: 0x0010
[  +0.000001] [Hardware Error]:   device_id: 0000:00:03.1
[  +0.000001] [Hardware Error]:   slot: 0
[  +0.000000] [Hardware Error]:   secondary_bus: 0x01
[  +0.000001] [Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
[  +0.000000] [Hardware Error]:   class_code: 060400
[  +0.000001] [Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0010
[  +0.000018] mce: [Hardware Error]: Machine check events logged
[  +0.000001] mce: [Hardware Error]: CPU 63: Machine Check: 0 Bank 0: be802800000c0135
[  +0.000018] mce: [Hardware Error]: TSC 0 ADDR 100fffdfc001080 MISC d01c0ff500000000 IPID b000000000
[  +0.000018] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1632132503 SOCKET 0 APIC 3f microcode 830104d
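
As a cross-check of the values above, the following is a minimal sketch that
decodes the "Check Information" word and the Bank 0 status word by hand. The
bit positions are my reading of the UEFI CPER cache check layout and of the
architectural flag bits in the upper part of MCi_STATUS; the function names
are hypothetical, vendor-specific status bits are ignored, and the validity
bits in the low part of the check word are not consulted.

```python
# Assumption: architectural flag bits in the top byte of MCi_STATUS.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # error overflow
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MISC register valid
    58: "ADDRV",  # ADDR register valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Return the names of the flag bits set in a status value, high bit first."""
    return [name
            for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
            if (status >> bit) & 1]


def decode_cache_check(check):
    """Split a CPER cache 'Check Information' word into its fields.

    Assumption: bits 16-17 transaction type, 18-21 operation, 22-24 level,
    bit 25 processor context corrupt, bit 26 uncorrected.
    """
    return {
        "transaction_type": (check >> 16) & 0x3,
        "operation": (check >> 18) & 0xF,
        "level": (check >> 22) & 0x7,
        "processor_context_corrupt": bool((check >> 25) & 1),
        "uncorrected": bool((check >> 26) & 1),
    }


print(decode_cache_check(0x00000000064D001F))  # Check Information from the log
print(decode_mci_status(0xBE802800000C0135))   # Bank 0 status from the log
```

With these assumptions the check word gives transaction type 1, operation 3,
level 1, with both "Processor Context Corrupt" and "Uncorrected" set, matching
the printed fields, and the status word has UC and PCC set as well.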

