Test results on QDF2400 showing a recoverable DDR error, a correctable
vendor-specific error, a correctable ARM cache error (logged with
severity "info"), and a fatal vendor-specific error. All functionality
appears to be working properly.
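
The four records can be picked out of a dmesg capture by their
"event severity:" lines. A minimal sketch, assuming the capture is
available as text with one kernel message per line; the `severities`
helper and the regular expression are illustrative and not part of
any kernel tooling:

```python
import re

# Matches e.g. "{1}[Hardware Error]: event severity: recoverable".
# The record number in braces is the GHES sequence number in the log.
SEVERITY_RE = re.compile(
    r"\{(\d+)\}\[Hardware Error\]: event severity: (\w+)"
)


def severities(dmesg_text):
    """Map each hardware error record number to its reported severity."""
    return {int(num): sev for num, sev in SEVERITY_RE.findall(dmesg_text)}


# Severity lines copied from the log in this report.
sample = """\
[ 224.677330] {1}[Hardware Error]: event severity: recoverable
[ 251.685336] {2}[Hardware Error]: event severity: corrected
[ 357.701496] {3}[Hardware Error]: event severity: info
[ 403.866103] {4}[Hardware Error]: event severity: fatal
"""

print(severities(sample))
```

On a live system the same function could be fed the output of `dmesg`
instead of the sample string.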
ubuntu@null-8cfdf006a3ef:~$ uname -a
Linux null-8cfdf006a3ef 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu SMP Tue Jul 25 19:12:22 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@null-8cfdf006a3ef:~$ dmesg | grep -i -E 'hest|ghes|edac|hardware'
[ 0.000000] ACPI: HEST 0x0000000008A60000 000288 (v01 QCOM QDF2400 00000001 INTL 20150515)
[ 0.538984] HEST: Table parsing has been initialized.
[ 3.854385] EDAC MC: Ver: 3.0.0
[ 5.537078] ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.
[ 5.545952] ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.
[ 5.554123] ghes_edac: So, the end result of using this driver varies from vendor to vendor.
[ 5.562555] ghes_edac: If you find incorrect reports, please contact your hardware vendor
[ 5.570727] ghes_edac: to correct its BIOS.
[ 5.574905] ghes_edac: This system has 6 DIMM sockets.
[ 5.580205] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.589763] EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.599319] EDAC MC2: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.608867] EDAC MC3: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.618416] EDAC MC4: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.628018] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[ 6.573372] qcom-emac QCOM8070:00 eth0: hardware id 64.1, hardware version 1.3.0
[ 224.669058] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 224.677330] {1}[Hardware Error]: event severity: recoverable
[ 224.682992] {1}[Hardware Error]: precise tstamp: 2017-07-26 15:58:19
[ 224.689437] {1}[Hardware Error]: Error 0, type: recoverable
[ 224.695097] {1}[Hardware Error]: section_type: memory error
[ 224.700846] {1}[Hardware Error]: error_status: 0x00000000000c0400
[ 224.707113] {1}[Hardware Error]: physical_address: 0x0000000000204e10
[ 224.713726] {1}[Hardware Error]: physical_address_mask: 0x00000fffffffffff
[ 224.720776] {1}[Hardware Error]: node: 0 card: 1 module: 0 rank: 0 bank: 0 device: 0 row: 4 column: 306
[ 224.730427] {1}[Hardware Error]: error_type: 3, multi-bit ECC
[ 224.736356] EDAC MC0: 1 UE Multi-bit ECC on unknown label (node:0 card:1 module:0 rank:0 bank:0 row:4 col:306 page:0x204 offset:0xe10 grain:-4096 - status(0x00000000000c0400): Storage error in DRAM memory)
[ 224.736358] [Firmware Warn]: GHES: Invalid address in generic error data: 0x204e10
[ 251.685322] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[ 251.685324] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[ 251.685336] {2}[Hardware Error]: event severity: corrected
[ 251.685341] {2}[Hardware Error]: precise tstamp: 2017-07-26 15:58:30
[ 251.685342] {2}[Hardware Error]: Error 0, type: corrected
[ 251.685348] {2}[Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[ 251.685349] {2}[Hardware Error]: section length: 0x238
[ 251.685355] {2}[Hardware Error]: 00000000: 4d415201 4d492031 453a4d45 435f4343  .RAM1 IMEM:ECC_C
[ 251.685358] {2}[Hardware Error]: 00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
[ 251.685361] {2}[Hardware Error]: 00000020: 00000000 00000000 00000000 00000000  ................
[ 251.685364] {2}[Hardware Error]: 00000030: 00000000 00000000 01010000 01010000  ................
[ 251.685367] {2}[Hardware Error]: 00000040: 00000000 00000000 00000005 00000000  ................
[ 251.685369] {2}[Hardware Error]: 00000050: 01010000 00000000 00000001 00010100  ................
[ 251.685372] {2}[Hardware Error]: 00000060: 00000000 00000000 00000000 00000000  ................
[ 251.685375] {2}[Hardware Error]: 00000070: 00000000 00000000 00000000 00000000  ................
[ 251.685378] {2}[Hardware Error]: 00000080: 00000000 00000000 00000000 00000000  ................
[ 251.685381] {2}[Hardware Error]: 00000090: 00000000 00000000 00000000 00000000  ................
[ 251.685384] {2}[Hardware Error]: 000000a0: 00000000 00000000 00000000 00000000  ................
[ 251.685387] {2}[Hardware Error]: 000000b0: 00000000 00000000 00000000 00000000  ................
[ 251.685389] {2}[Hardware Error]: 000000c0: 00000000 00000000 00000000 00000000  ................
[ 251.685392] {2}[Hardware Error]: 000000d0: 00000000 00000000 00000000 00000000  ................
[ 251.685395] {2}[Hardware Error]: 000000e0: 00000000 00000000 00000000 00000000  ................
[ 251.685398] {2}[Hardware Error]: 000000f0: 00000000 00000000 00000000 00000000  ................
[ 251.685402] {2}[Hardware Error]: 00000100: 00000000 00000000 00000000 00000000  ................
[ 251.685405] {2}[Hardware Error]: 00000110: 00000000 00000000 00000000 00000000  ................
[ 251.685408] {2}[Hardware Error]: 00000120: 00000000 00000000 00000000 00000000  ................
[ 251.685410] {2}[Hardware Error]: 00000130: 00000000 00000000 00000000 00000000  ................
[ 251.685413] {2}[Hardware Error]: 00000140: 00000000 00000000 00000000 00000000  ................
[ 251.685416] {2}[Hardware Error]: 00000150: 00000000 00000000 00000000 00000000  ................
[ 251.685419] {2}[Hardware Error]: 00000160: 00000000 00000000 00000000 00000000  ................
[ 251.685423] {2}[Hardware Error]: 00000170: 00000000 00000000 00000000 00000000  ................
[ 251.685426] {2}[Hardware Error]: 00000180: 00000000 00000000 00000000 00000000  ................
[ 251.685429] {2}[Hardware Error]: 00000190: 00000000 00000000 00000000 00000000  ................
[ 251.685432] {2}[Hardware Error]: 000001a0: 00000000 00000000 00000000 00000000  ................
[ 251.685434] {2}[Hardware Error]: 000001b0: 00000000 00000000 00000000 00000000  ................
[ 251.685437] {2}[Hardware Error]: 000001c0: 00000000 00000000 00000000 00000000  ................
[ 251.685440] {2}[Hardware Error]: 000001d0: 00000000 00000000 00000000 00000000  ................
[ 251.685443] {2}[Hardware Error]: 000001e0: 00000000 00000000 00000000 00000000  ................
[ 251.685446] {2}[Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000  ................
[ 251.685449] {2}[Hardware Error]: 00000200: 00000000 00000000 00000000 00000000  ................
[ 251.685451] {2}[Hardware Error]: 00000210: 00000000 00000000 00000000 00000000  ................
[ 251.685454] {2}[Hardware Error]: 00000220: 00000000 00000000 00000000 00000000  ................
[ 251.685457] {2}[Hardware Error]: 00000230: 00000000 00000000  ........
[ 357.701494] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[ 357.701496] {3}[Hardware Error]: event severity: info
[ 357.701508] {3}[Hardware Error]: precise tstamp: 2017-07-26 16:00:12
[ 357.701510] {3}[Hardware Error]: Error 0, type: info
[ 357.701513] {3}[Hardware Error]: section_type: ARM processor error
[ 357.701515] {3}[Hardware Error]: MIDR: 0x00000000510f8000
[ 357.701518] {3}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
[ 357.701520] {3}[Hardware Error]: error affinity level: 2
[ 357.701522] {3}[Hardware Error]: running state: 0x1
[ 357.701524] {3}[Hardware Error]: Power State Coordination Interface state: 0
[ 357.701527] {3}[Hardware Error]: Error info structure 0:
[ 357.701529] {3}[Hardware Error]: num errors: 1
[ 357.701531] {3}[Hardware Error]: first error captured
[ 357.701533] {3}[Hardware Error]: last error captured
[ 357.701535] {3}[Hardware Error]: error_type: 0, cache error
[ 357.701538] {3}[Hardware Error]: error_info: 0x0000000000c20058
ubuntu@null-8cfdf006a3ef:~$
ubuntu@null-8cfdf006a3ef:~$ [ 403.857832] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 403.866103] {4}[Hardware Error]: event severity: fatal
[ 403.871244] {4}[Hardware Error]: precise tstamp: 2017-07-26 16:01:18
[ 403.877690] {4}[Hardware Error]: Error 0, type: fatal
[ 403.882831] {4}[Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[ 403.891445] {4}[Hardware Error]: section length: 0x238
[ 403.896762] {4}[Hardware Error]: 00000000: 4d415201 4d492031 453a4d45 555f4343  .RAM1 IMEM:ECC_U
[ 403.905721] {4}[Hardware Error]: 00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
[ 403.914682] {4}[Hardware Error]: 00000020: 00000000 00000000 00000000 00000000  ................
[ 403.923644] {4}[Hardware Error]: 00000030: 00000000 00000000 01010000 01010000  ................
[ 403.932605] {4}[Hardware Error]: 00000040: 00000000 00000000 00000005 00000000  ................
[ 403.941566] {4}[Hardware Error]: 00000050: 02020000 00000000 00000001 00c6c600  ................
[ 403.950531] {4}[Hardware Error]: 00000060: 00000000 00000000 00000000 00000000  ................
[ 403.959489] {4}[Hardware Error]: 00000070: 00000000 00000000 00000000 00000000  ................
[ 403.968450] {4}[Hardware Error]: 00000080: 00000000 00000000 00000000 00000000  ................
[ 403.977413] {4}[Hardware Error]: 00000090: 00000000 00000000 00000000 00000000  ................
[ 403.986374] {4}[Hardware Error]: 000000a0: 00000000 00000000 00000000 00000000  ................
[ 403.995339] {4}[Hardware Error]: 000000b0: 00000000 00000000 00000000 00000000  ................
[ 404.004302] {4}[Hardware Error]: 000000c0: 00000000 00000000 00000000 00000000  ................
[ 404.013263] {4}[Hardware Error]: 000000d0: 00000000 00000000 00000000 00000000  ................
[ 404.022223] {4}[Hardware Error]: 000000e0: 00000000 00000000 00000000 00000000  ................
[ 404.031183] {4}[Hardware Error]: 000000f0: 00000000 00000000 00000000 00000000  ................
[ 404.040143] {4}[Hardware Error]: 00000100: 00000000 00000000 00000000 00000000  ................
[ 404.049104] {4}[Hardware Error]: 00000110: 00000000 00000000 00000000 00000000  ................
[ 404.058064] {4}[Hardware Error]: 00000120: 00000000 00000000 00000000 00000000  ................
[ 404.067025] {4}[Hardware Error]: 00000130: 00000000 00000000 00000000 00000000  ................
[ 404.075986] {4}[Hardware Error]: 00000140: 00000000 00000000 00000000 00000000  ................
[ 404.084946] {4}[Hardware Error]: 00000150: 00000000 00000000 00000000 00000000  ................
[ 404.093907] {4}[Hardware Error]: 00000160: 00000000 00000000 00000000 00000000  ................
[ 404.102867] {4}[Hardware Error]: 00000170: 00000000 00000000 00000000 00000000  ................
[ 404.111828] {4}[Hardware Error]: 00000180: 00000000 00000000 00000000 00000000  ................
[ 404.120788] {4}[Hardware Error]: 00000190: 00000000 00000000 00000000 00000000  ................
[ 404.129752] {4}[Hardware Error]: 000001a0: 00000000 00000000 00000000 00000000  ................
[ 404.138710] {4}[Hardware Error]: 000001b0: 00000000 00000000 00000000 00000000  ................
[ 404.147673] {4}[Hardware Error]: 000001c0: 00000000 00000000 00000000 00000000  ................
[ 404.156632] {4}[Hardware Error]: 000001d0: 00000000 00000000 00000000 00000000  ................
[ 404.165593] {4}[Hardware Error]: 000001e0: 00000000 00000000 00000000 00000000  ................
[ 404.174555] {4}[Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000  ................
[ 404.183516] {4}[Hardware Error]: 00000200: 00000000 00000000 00000000 00000000  ................
[ 404.192476] {4}[Hardware Error]: 00000210: 00000000 00000000 00000000 00000000  ................
[ 404.201438] {4}[Hardware Error]: 00000220: 00000000 00000000 00000000 00000000  ................
[ 404.210398] {4}[Hardware Error]: 00000230: 00000000 00000000  ........
[ 404.218665] Kernel panic - not syncing: Fatal hardware error!
[ 404.224406] CPU: 0 PID: 217 Comm: kworker/0:1 Not tainted 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu
[ 404.233876] Hardware name: Qualcomm Qualcomm Centriq(TM) 2400 Development Platform/ABW|SYS|CVR,1DPC|V3 , BIOS XBL.DF.2.0.R1-00512 QDF2400_REL CR
[ 404.247695] Workqueue: kacpi_notify acpi_os_execute_deferred
[ 404.253347] Call trace:
[ 404.255790] [<ffff1e8f9e08b078>] dump_backtrace+0x0/0x2b0
[ 404.261182] [<ffff1e8f9e08b34c>] show_stack+0x24/0x30
[ 404.266230] [<ffff1e8f9e4da5e0>] dump_stack+0x9c/0xbc
[ 404.271276] [<ffff1e8f9e208620>] panic+0x140/0x2b0
[ 404.276061] [<ffff1e8f9e5ef8e0>] ghes_proc+0x1d8/0x568
[ 404.281191] [<ffff1e8f9e5efcb4>] ghes_notify_sci+0x44/0x70
[ 404.286670] [<ffff1e8f9e0f6424>] notifier_call_chain+0x5c/0xa0
[ 404.292495] [<ffff1e8f9e0f6970>] __blocking_notifier_call_chain+0x58/0xa0
[ 404.299274] [<ffff1e8f9e0f69f4>] blocking_notifier_call_chain+0x3c/0x50
[ 404.305883] [<ffff1e8f9e5ea09c>] acpi_hed_notify+0x24/0x30
[ 404.311361] [<ffff1e8f9e5b1710>] acpi_device_notify+0x30/0x40
[ 404.317101] [<ffff1e8f9e5c8204>] acpi_ev_notify_dispatch+0x4c/0x70
[ 404.323274] [<ffff1e8f9e5ac2e4>] acpi_os_execute_deferred+0x24/0x38
[ 404.329535] [<ffff1e8f9e0ed330>] process_one_work+0x158/0x478
[ 404.335273] [<ffff1e8f9e0ed6a0>] worker_thread+0x50/0x4a8
[ 404.340665] [<ffff1e8f9e0f47a8>] kthread+0x108/0x138
[ 404.345622] [<ffff1e8f9e0838a0>] ret_from_fork+0x10/0x30
[ 404.350934] SMP: stopping secondary CPUs
[ 404.356117] Starting crashdump kernel...
[ 404.360034] Bye!
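
Two details in the log above can be cross-checked by hand: the ASCII
column of the vendor-specific hex dumps, and the page/offset pair that
EDAC reports for the DDR error. A minimal sketch, assuming the dump
rows are 32-bit words printed on a little-endian machine and that the
page size is 4 KiB (both assumptions, though consistent with the
values shown in the log); the helper names are illustrative:

```python
import struct


def decode_words(words):
    """Rebuild the ASCII column from a row of 32-bit hex words.

    Each word is packed back into its little-endian byte order;
    non-printable bytes are shown as '.', as in the kernel hex dump.
    """
    raw = b"".join(struct.pack("<I", int(word, 16)) for word in words)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def split_phys_addr(addr, page_shift=12):
    """Split a physical address into (page frame number, offset in page)."""
    return addr >> page_shift, addr & ((1 << page_shift) - 1)


# First two rows of the corrected-error dump (record {2}).
row0 = "4d415201 4d492031 453a4d45 435f4343".split()
row1 = "53515f45 44525f42 00000000 00000000".split()
print(decode_words(row0))  # .RAM1 IMEM:ECC_C
print(decode_words(row1))  # E_QSB_RD........

# physical_address from the DDR error record {1}.
page, offset = split_phys_addr(0x204E10)
print(hex(page), hex(offset))  # 0x204 0xe10
```

The only difference in the first row of the fatal record {4} is the
last word, 555f4343, which decodes to "CC_U" instead of "CC_C".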
--
https://bugs.launchpad.net/bugs/1706141
Title:
[ARM64] config EDAC_GHES=y depends on EDAC_MM_EDAC=y