Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
On Wed, Jan 11, 2023 at 12:29:04PM +0000, Mark Brown wrote:

> We're seeing issues in all configs on meson-gxl-s905x-libretech-cc
> today, not just with the kselftest fragment. The initial failure seems
> to be:

> [ 17.337253] WARNING: CPU: 3 PID: 123 at drivers/gpu/drm/drm_bridge.c:1257 drm_bridge_hpd_enable+0x8c/0x94 [drm]

> full log at:
>
> https://storage.kernelci.org/next/master/next-20230111/arm64/defconfig/gcc-10/lab-broonie/baseline-meson-gxl-s905x-libretech-cc.txt

> and links to other logs at:
>
> https://linux.kernelci.org/test/job/next/branch/master/kernel/next-20230111/plan/baseline/

> Today's -next does have that fix in it so it's not fixing whatever the
> original issue was; I suspect it might even be exposing other issues.
> We are however still seeing the stack filling up, even with a GCC 10
> defconfig build.

A bisect landed on 0e4dcffd331fa7d ("drm/panel: raspberrypi-touchscreen: Convert to i2c's .probe_new()") which is obviously not credible. I suspect that what's happening here is that the fix you applied is making an issue somewhere else visible in defconfig and is as a result confusing the bisect.

Ard mentioned an issue with non-EFI bits introduced by EFI changes here:

https://lore.kernel.org/linux-arm-kernel/CAMj1kXGFa=zriyp_ms7bbqr0wiwikt0objokusngpjtfvlm...@mail.gmail.com/

which seems like a plausible culprit. Bisect log:

git bisect start
# bad: [c9e9cdd8bdcc3e1ea330d49ea587ec71884dd0f5] Add linux-next specific files for 20230111
git bisect bad c9e9cdd8bdcc3e1ea330d49ea587ec71884dd0f5
# good: [7dd4b804e08041ff56c88bdd8da742d14b17ed25] Merge tag 'nfsd-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect good 7dd4b804e08041ff56c88bdd8da742d14b17ed25
# good: [ecf8827ab7dd5731813f90146d9936165b170f32] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
git bisect good ecf8827ab7dd5731813f90146d9936165b170f32
# bad: [64208e4940ede76709f1ff5b01d1b78efc2951cf] Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
git bisect bad 64208e4940ede76709f1ff5b01d1b78efc2951cf
# bad: [1077dd31ba60b39a231560beec24b97eadf8bd8f] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
git bisect bad 1077dd31ba60b39a231560beec24b97eadf8bd8f
# bad: [1577a2c2aad943fbc6a5e959ae83c4ef8bc3d4de] Merge branch 'drm-next' of https://gitlab.freedesktop.org/agd5f/linux
git bisect bad 1577a2c2aad943fbc6a5e959ae83c4ef8bc3d4de
# good: [ec787deb2ddffc6cd6afe0e2fbbbd490ddc383ed] drm/amd: Use `amdgpu_ucode_*` helpers for GFX9
git bisect good ec787deb2ddffc6cd6afe0e2fbbbd490ddc383ed
# bad: [0e4dcffd331fa7d2a6ae628b51a7f418dfa90367] drm/panel: raspberrypi-touchscreen: Convert to i2c's .probe_new()
git bisect bad 0e4dcffd331fa7d2a6ae628b51a7f418dfa90367
# good: [c702545e19ebb6113d607f2a30ba2ee6cf881a3a] drm/gud: use new debugfs device-centered functions
git bisect good c702545e19ebb6113d607f2a30ba2ee6cf881a3a
# good: [977374cf481d3bea916b2775e6ecc682b9689550] drm/vc4: plane: Add 3:3:2 and 4:4:4:4 RGB/RGBX/RGBA formats
git bisect good 977374cf481d3bea916b2775e6ecc682b9689550
# good: [67d0a30128c9f644595dfe67ac0fb941a716a6c9] drm/meson: dw-hdmi: Fix devm_regulator_*get_enable*() conversion
git bisect good 67d0a30128c9f644595dfe67ac0fb941a716a6c9
# good: [29ef7605e2fd44038a70df0f46b7821464081b22] drm/i2c/sil164: Convert to i2c's .probe_new()
git bisect good 29ef7605e2fd44038a70df0f46b7821464081b22
# good: [307259952625798fbea89b04aebbc5106ff18c68] drm/i2c/tda998x: Convert to i2c's .probe_new()
git bisect good 307259952625798fbea89b04aebbc5106ff18c68
# good: [446757576a646eba6fae085396bdfbd74245ff28] drm/panel: olimex-lcd-olinuxino: Convert to i2c's .probe_new()
git bisect good 446757576a646eba6fae085396bdfbd74245ff28
# first bad commit: [0e4dcffd331fa7d2a6ae628b51a7f418dfa90367] drm/panel: raspberrypi-touchscreen: Convert to i2c's .probe_new()
Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
On Wed, Jan 11, 2023 at 11:34:41AM +0100, Neil Armstrong wrote:

> I merged a fix that could be related:
> https://lore.kernel.org/all/20230109220033.31202-1-m.szyprow...@samsung.com/

> This could make the driver return from probe while not totally probed, and
> explain such an error.

We're seeing issues in all configs on meson-gxl-s905x-libretech-cc today, not just with the kselftest fragment. The initial failure seems to be:

[ 17.337253] WARNING: CPU: 3 PID: 123 at drivers/gpu/drm/drm_bridge.c:1257 drm_bridge_hpd_enable+0x8c/0x94 [drm]

full log at:

https://storage.kernelci.org/next/master/next-20230111/arm64/defconfig/gcc-10/lab-broonie/baseline-meson-gxl-s905x-libretech-cc.txt

and links to other logs at:

https://linux.kernelci.org/test/job/next/branch/master/kernel/next-20230111/plan/baseline/

Today's -next does have that fix in it so it's not fixing whatever the original issue was; I suspect it might even be exposing other issues. We are however still seeing the stack filling up, even with a GCC 10 defconfig build.
Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
Hi,

On 10/01/2023 17:41, Arnd Bergmann wrote:
> On Tue, Jan 10, 2023, at 17:14, Naresh Kamboju wrote:
> > [ please ignore this email if this regression is already reported ]
> >
> > Today's Linux next tag next-20230110 boot passes with defconfig but
> > boot fails with defconfig + kselftest merge config on arm64 devices
> > and qemu-arm64.
> >
> > Reported-by: Linux Kernel Functional Testing
> >
> > We are bisecting this problem and will get back to you shortly.
> >
> > GOOD: next-20230109 (defconfig + kselftests configs)
> > BAD: next-20230110 (defconfig + kselftests configs)
> >
> > kernel crash log [1]:
> >
> > [ 15.302140] Unable to handle kernel paging request at virtual address fffffffffffffff8
> > [ 15.309906] Mem abort info:
> > [ 15.312659] ESR = 0x9604
> > [ 15.316365] EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 15.321626] SET = 0, FnV = 0
> > [ 15.324644] EA = 0, S1PTW = 0
> > [ 15.327744] FSC = 0x04: level 0 translation fault
> > [ 15.332619] Data abort info:
> > [ 15.335422] ISV = 0, ISS = 0x0004
> > [ 15.339226] CM = 0, WnR = 0
> > [ 15.342154] swapper pgtable: 4k pages, 48-bit VAs, pgdp=1496c000
> > [ 15.348795] [fffffffffffffff8] pgd=, p4d=
> > [ 15.355524] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > [ 15.361729] Modules linked in: meson_gxl dwmac_generic snd_soc_meson_gx_sound_card snd_soc_meson_card_utils lima gpu_sched drm_shmem_helper meson_drm drm_dma_helper crct10dif_ce meson_ir rc_core meson_dw_hdmi dw_hdmi meson_canvas dwmac_meson8b stmmac_platform meson_rng stmmac rng_core cec meson_gxbb_wdt drm_display_helper snd_soc_meson_aiu snd_soc_meson_codec_glue pcs_xpcs snd_soc_meson_t9015 amlogic_gxl_crypto crypto_engine display_connector snd_soc_simple_amplifier drm_kms_helper drm nvmem_meson_efuse
> > [ 15.405976] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> > [ 15.413563] Hardware name: Libre Computer AML-S905X-CC (DT)
> > [ 15.419086] Workqueue: events_unbound deferred_probe_work_func
> > [ 15.424863] pstate: 0005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 15.431762] pc : of_drm_find_bridge+0x38/0x70 [drm]
> > [ 15.436594] lr : of_drm_find_bridge+0x20/0x70 [drm]
>
> The line is drivers/gpu/drm/drm_bridge.c:1310:
>
> if (bridge->of_node == np) {
>
> The list_head here is a NULL pointer, so ->of_node points to address
> negative 8, i.e. fffffffffffffff8.
>
> This is linked list corruption, which typically happens as part of a
> use-after-free, and could be the result of a failed registration causing
> an object to be freed after it is added to the list. Unfortunately, there
> are no patches to this file between next-20230109 and next-20230110, so
> the bug probably is not actually in this file.
>
> > [ 15.515426] Call trace:
> > [ 15.517863] Insufficient stack space to handle exception!
> > [ 15.517867] ESR: 0x9647 -- DABT (current EL)
> > [ 15.517871] FAR: 0x8a047ff0
> > [ 15.517873] Task stack: [0x8a048000..0x8a04c000]
> > [ 15.517877] IRQ stack: [0x88008000..0x8800c000]
> > [ 15.517880] Overflow stack: [0x7d9c1320..0x7d9c2320]
> > [ 15.517884] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> > [ 15.517890] Hardware name: Libre Computer AML-S905X-CC (DT)
> > [ 15.517895] Workqueue: events_unbound deferred_probe_work_func
> > [ 15.517915] pstate: 83c5 (Nzcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 15.517923] pc : el1_abort+0x4/0x5c
> > [ 15.517932] lr : el1h_64_sync_handler+0x60/0xac
> > [ 15.517939] sp : 8a048020
>
> Not sure about the missing stack trace: I can see that the stack pointer
> is on a task stack, which is reported as having overflowed, but I don't
> see why it's unable to print the stack while running from the overflow
> stack.
>
> A stack overflow is often caused by unbounded recursion, which can happen
> when a device driver binds itself to a device that it has just created.
>
> The log does look a bit suspicious here, with multiple registrations for
> c883a000.hdmi-tx:
>
> 986 08:02:56.487871 [ 15.141218] meson-drm d010.vpu: Queued 2 outputs on vpu
> 987 08:02:56.493572 [ 15.141615] meson8b-dwmac c941.ethernet: Ring mode enabled
> 988 08:02:56.504769 [ 15.150744] meson-drm d010.vpu: bound c883a000.hdmi-tx (ops meson_dw_hdmi_ops [meson_dw_hdmi])
> 989 08:02:56.515743 [ 15.154970] meson8b-dwmac c941.ethernet: Enable RX Mitigation via HW Watchdog Timer
> 990 08:02:56.521531 [ 15.159175] lima d00c.gpu: pp0 - mali450 version major 0 minor 0
> 991 08:02:56.526718 [ 15.161436] meson-drm d010.vpu: Failed to find HDMI transceiver bridge
> 992 08:02:56.532417 [ 15.168933] lima d00c.gpu: pp1 - mali450 version major 0 minor 0
> 993 08:02:56.537747 [ 15.206102] meson-drm d010.vpu: Queued 2 outputs on vpu
> 994 08:02:56.543435 [ 15.209608] lima d00c.gpu: pp2 - mali450 version major 0 minor 0
> 995 08:02:56.554307 [ 15.217027] meson-drm d010.vpu: bound c883a000.hdmi-tx (ops meson_dw_hdmi_ops
Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
On Tue, Jan 10, 2023 at 04:32:59PM +0000, Will Deacon wrote:
> On Tue, Jan 10, 2023 at 09:44:40PM +0530, Naresh Kamboju wrote:

> > GOOD: next-20230109 (defconfig + kselftests configs)
> > BAD: next-20230110 (defconfig + kselftests configs)

> I couldn't find a kselftests .config in the tree (assumedly I'm just not
> looking hard enough), but does that happen to enable CONFIG_STACK_TRACER=y?

It's adding on all the config fragments in tools/testing/selftests/*/config, which includes ftrace, which does set STACK_TRACER.

> If so, since you're using clang, I wonder if this is an issue with
> 68a63a412d18 ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and
> CONFIG_STACK_TRACER=y")?

ftrace also enables FTRACE.

> Please let us know how the bisection goes...

Not sure that Naresh has a bisection going; I don't think he's got direct access to such a board.
Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
On Tue, Jan 10, 2023, at 17:14, Naresh Kamboju wrote:
> [ please ignore this email if this regression is already reported ]
>
> Today's Linux next tag next-20230110 boot passes with defconfig but
> boot fails with defconfig + kselftest merge config on arm64 devices
> and qemu-arm64.
>
> Reported-by: Linux Kernel Functional Testing
>
> We are bisecting this problem and will get back to you shortly.
>
> GOOD: next-20230109 (defconfig + kselftests configs)
> BAD: next-20230110 (defconfig + kselftests configs)
>
> kernel crash log [1]:
>
> [ 15.302140] Unable to handle kernel paging request at virtual address fffffffffffffff8
> [ 15.309906] Mem abort info:
> [ 15.312659] ESR = 0x9604
> [ 15.316365] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 15.321626] SET = 0, FnV = 0
> [ 15.324644] EA = 0, S1PTW = 0
> [ 15.327744] FSC = 0x04: level 0 translation fault
> [ 15.332619] Data abort info:
> [ 15.335422] ISV = 0, ISS = 0x0004
> [ 15.339226] CM = 0, WnR = 0
> [ 15.342154] swapper pgtable: 4k pages, 48-bit VAs, pgdp=1496c000
> [ 15.348795] [fffffffffffffff8] pgd=, p4d=
> [ 15.355524] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [ 15.361729] Modules linked in: meson_gxl dwmac_generic snd_soc_meson_gx_sound_card snd_soc_meson_card_utils lima gpu_sched drm_shmem_helper meson_drm drm_dma_helper crct10dif_ce meson_ir rc_core meson_dw_hdmi dw_hdmi meson_canvas dwmac_meson8b stmmac_platform meson_rng stmmac rng_core cec meson_gxbb_wdt drm_display_helper snd_soc_meson_aiu snd_soc_meson_codec_glue pcs_xpcs snd_soc_meson_t9015 amlogic_gxl_crypto crypto_engine display_connector snd_soc_simple_amplifier drm_kms_helper drm nvmem_meson_efuse
> [ 15.405976] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> [ 15.413563] Hardware name: Libre Computer AML-S905X-CC (DT)
> [ 15.419086] Workqueue: events_unbound deferred_probe_work_func
> [ 15.424863] pstate: 0005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 15.431762] pc : of_drm_find_bridge+0x38/0x70 [drm]
> [ 15.436594] lr : of_drm_find_bridge+0x20/0x70 [drm]

The line is drivers/gpu/drm/drm_bridge.c:1310:

if (bridge->of_node == np) {

The list_head here is a NULL pointer, so ->of_node points to address negative 8, i.e. fffffffffffffff8.

This is linked list corruption, which typically happens as part of a use-after-free, and could be the result of a failed registration causing an object to be freed after it is added to the list. Unfortunately, there are no patches to this file between next-20230109 and next-20230110, so the bug probably is not actually in this file.

> [ 15.515426] Call trace:
> [ 15.517863] Insufficient stack space to handle exception!
> [ 15.517867] ESR: 0x9647 -- DABT (current EL)
> [ 15.517871] FAR: 0x8a047ff0
> [ 15.517873] Task stack: [0x8a048000..0x8a04c000]
> [ 15.517877] IRQ stack: [0x88008000..0x8800c000]
> [ 15.517880] Overflow stack: [0x7d9c1320..0x7d9c2320]
> [ 15.517884] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> [ 15.517890] Hardware name: Libre Computer AML-S905X-CC (DT)
> [ 15.517895] Workqueue: events_unbound deferred_probe_work_func
> [ 15.517915] pstate: 83c5 (Nzcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 15.517923] pc : el1_abort+0x4/0x5c
> [ 15.517932] lr : el1h_64_sync_handler+0x60/0xac
> [ 15.517939] sp : 8a048020

Not sure about the missing stack trace: I can see that the stack pointer is on a task stack, which is reported as having overflowed, but I don't see why it's unable to print the stack while running from the overflow stack.

A stack overflow is often caused by unbounded recursion, which can happen when a device driver binds itself to a device that it has just created.

The log does look a bit suspicious here, with multiple registrations for c883a000.hdmi-tx:

986 08:02:56.487871 [ 15.141218] meson-drm d010.vpu: Queued 2 outputs on vpu
987 08:02:56.493572 [ 15.141615] meson8b-dwmac c941.ethernet: Ring mode enabled
988 08:02:56.504769 [ 15.150744] meson-drm d010.vpu: bound c883a000.hdmi-tx (ops meson_dw_hdmi_ops [meson_dw_hdmi])
989 08:02:56.515743 [ 15.154970] meson8b-dwmac c941.ethernet: Enable RX Mitigation via HW Watchdog Timer
990 08:02:56.521531 [ 15.159175] lima d00c.gpu: pp0 - mali450 version major 0 minor 0
991 08:02:56.526718 [ 15.161436] meson-drm d010.vpu: Failed to find HDMI transceiver bridge
992 08:02:56.532417 [ 15.168933] lima d00c.gpu: pp1 - mali450 version major 0 minor 0
993 08:02:56.537747 [ 15.206102] meson-drm d010.vpu: Queued 2 outputs on vpu
994 08:02:56.543435 [ 15.209608] lima d00c.gpu: pp2 - mali450 version major 0 minor 0
995 08:02:56.554307 [ 15.217027] meson-drm d010.vpu: bound
Re: next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
[+ James and Nathan]

On Tue, Jan 10, 2023 at 09:44:40PM +0530, Naresh Kamboju wrote:
> [ please ignore this email if this regression is already reported ]
>
> Today's Linux next tag next-20230110 boot passes with defconfig but
> boot fails with defconfig + kselftest merge config on arm64 devices
> and qemu-arm64.
>
> Reported-by: Linux Kernel Functional Testing
>
> We are bisecting this problem and will get back to you shortly.
>
> GOOD: next-20230109 (defconfig + kselftests configs)
> BAD: next-20230110 (defconfig + kselftests configs)

I couldn't find a kselftests .config in the tree (assumedly I'm just not looking hard enough), but does that happen to enable CONFIG_STACK_TRACER=y?

If so, since you're using clang, I wonder if this is an issue with 68a63a412d18 ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y")?

Please let us know how the bisection goes...

Will

> kernel crash log [1]:
>
> [ 15.302140] Unable to handle kernel paging request at virtual address fffffffffffffff8
> [ 15.309906] Mem abort info:
> [ 15.312659] ESR = 0x9604
> [ 15.316365] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 15.321626] SET = 0, FnV = 0
> [ 15.324644] EA = 0, S1PTW = 0
> [ 15.327744] FSC = 0x04: level 0 translation fault
> [ 15.332619] Data abort info:
> [ 15.335422] ISV = 0, ISS = 0x0004
> [ 15.339226] CM = 0, WnR = 0
> [ 15.342154] swapper pgtable: 4k pages, 48-bit VAs, pgdp=1496c000
> [ 15.348795] [fffffffffffffff8] pgd=, p4d=
> [ 15.355524] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [ 15.361729] Modules linked in: meson_gxl dwmac_generic snd_soc_meson_gx_sound_card snd_soc_meson_card_utils lima gpu_sched drm_shmem_helper meson_drm drm_dma_helper crct10dif_ce meson_ir rc_core meson_dw_hdmi dw_hdmi meson_canvas dwmac_meson8b stmmac_platform meson_rng stmmac rng_core cec meson_gxbb_wdt drm_display_helper snd_soc_meson_aiu snd_soc_meson_codec_glue pcs_xpcs snd_soc_meson_t9015 amlogic_gxl_crypto crypto_engine display_connector snd_soc_simple_amplifier drm_kms_helper drm nvmem_meson_efuse
> [ 15.405976] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> [ 15.413563] Hardware name: Libre Computer AML-S905X-CC (DT)
> [ 15.419086] Workqueue: events_unbound deferred_probe_work_func
> [ 15.424863] pstate: 0005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 15.431762] pc : of_drm_find_bridge+0x38/0x70 [drm]
> [ 15.436594] lr : of_drm_find_bridge+0x20/0x70 [drm]
> [ 15.441423] sp : 8a04b9b0
> [ 15.444700] x29: 8a04b9b0 x28: 08de5810 x27: 08de5808
> [ 15.451772] x26: 08de5800 x25: 084cb8b0 x24: 01223c00
> [ 15.458844] x23: x22: 0001 x21: 7fa61a28
> [ 15.465917] x20: 084ca080 x19: 7fa61a28 x18: 019bd700
> [ 15.472989] x17: 6d64685f77645f6e x16: x15: 0004
> [ 15.480062] x14: 89bab410 x13: x12: 0003
> [ 15.487135] x11: x10: x9 :
> [ 15.494207] x8 : 810a70a0 x7 : 64410079616b6f01 x6 : 80416403
> [ 15.501279] x5 : 03644100 x4 : 0080 x3 : 00416400
> [ 15.508352] x2 : 01128000 x1 : x0 :
> [ 15.515426] Call trace:
> [ 15.517863] Insufficient stack space to handle exception!
> [ 15.517867] ESR: 0x9647 -- DABT (current EL)
> [ 15.517871] FAR: 0x8a047ff0
> [ 15.517873] Task stack: [0x8a048000..0x8a04c000]
> [ 15.517877] IRQ stack: [0x88008000..0x8800c000]
> [ 15.517880] Overflow stack: [0x7d9c1320..0x7d9c2320]
> [ 15.517884] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
> [ 15.517890] Hardware name: Libre Computer AML-S905X-CC (DT)
> [ 15.517895] Workqueue: events_unbound deferred_probe_work_func
> [ 15.517915] pstate: 83c5 (Nzcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 15.517923] pc : el1_abort+0x4/0x5c
> [ 15.517932] lr : el1h_64_sync_handler+0x60/0xac
> [ 15.517939] sp : 8a048020
> [ 15.517941] x29: 8a048020 x28: 01128000 x27: 08de5808
> [ 15.517950] x26: 08de5800 x25: 8a04b608 x24: 01128000
> [ 15.517957] x23: a0c5 x22: 880321dc x21: 8a048180
> [ 15.517965] x20: 898e1000 x19: 8a048290 x18: 019bd700
> [ 15.517972] x17: 0011 x16: x15: 0004
> [ 15.517979] x14: 89bab410 x13: x12:
> [ 15.517986] x11: 0030 x10: 89013a1c x9 : 890401a0
> [ 15.517994] x8 : 0025 x7 :
next-20230110: arm64: defconfig+kselftest config boot failed - Unable to handle kernel paging request at virtual address fffffffffffffff8
[ please ignore this email if this regression is already reported ]

Today's Linux next tag next-20230110 boot passes with defconfig but boot fails with defconfig + kselftest merge config on arm64 devices and qemu-arm64.

Reported-by: Linux Kernel Functional Testing

We are bisecting this problem and will get back to you shortly.

GOOD: next-20230109 (defconfig + kselftests configs)
BAD: next-20230110 (defconfig + kselftests configs)

kernel crash log [1]:

[ 15.302140] Unable to handle kernel paging request at virtual address fffffffffffffff8
[ 15.309906] Mem abort info:
[ 15.312659] ESR = 0x9604
[ 15.316365] EC = 0x25: DABT (current EL), IL = 32 bits
[ 15.321626] SET = 0, FnV = 0
[ 15.324644] EA = 0, S1PTW = 0
[ 15.327744] FSC = 0x04: level 0 translation fault
[ 15.332619] Data abort info:
[ 15.335422] ISV = 0, ISS = 0x0004
[ 15.339226] CM = 0, WnR = 0
[ 15.342154] swapper pgtable: 4k pages, 48-bit VAs, pgdp=1496c000
[ 15.348795] [fffffffffffffff8] pgd=, p4d=
[ 15.355524] Internal error: Oops: 9604 [#1] PREEMPT SMP
[ 15.361729] Modules linked in: meson_gxl dwmac_generic snd_soc_meson_gx_sound_card snd_soc_meson_card_utils lima gpu_sched drm_shmem_helper meson_drm drm_dma_helper crct10dif_ce meson_ir rc_core meson_dw_hdmi dw_hdmi meson_canvas dwmac_meson8b stmmac_platform meson_rng stmmac rng_core cec meson_gxbb_wdt drm_display_helper snd_soc_meson_aiu snd_soc_meson_codec_glue pcs_xpcs snd_soc_meson_t9015 amlogic_gxl_crypto crypto_engine display_connector snd_soc_simple_amplifier drm_kms_helper drm nvmem_meson_efuse
[ 15.405976] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
[ 15.413563] Hardware name: Libre Computer AML-S905X-CC (DT)
[ 15.419086] Workqueue: events_unbound deferred_probe_work_func
[ 15.424863] pstate: 0005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 15.431762] pc : of_drm_find_bridge+0x38/0x70 [drm]
[ 15.436594] lr : of_drm_find_bridge+0x20/0x70 [drm]
[ 15.441423] sp : 8a04b9b0
[ 15.444700] x29: 8a04b9b0 x28: 08de5810 x27: 08de5808
[ 15.451772] x26: 08de5800 x25: 084cb8b0 x24: 01223c00
[ 15.458844] x23: x22: 0001 x21: 7fa61a28
[ 15.465917] x20: 084ca080 x19: 7fa61a28 x18: 019bd700
[ 15.472989] x17: 6d64685f77645f6e x16: x15: 0004
[ 15.480062] x14: 89bab410 x13: x12: 0003
[ 15.487135] x11: x10: x9 :
[ 15.494207] x8 : 810a70a0 x7 : 64410079616b6f01 x6 : 80416403
[ 15.501279] x5 : 03644100 x4 : 0080 x3 : 00416400
[ 15.508352] x2 : 01128000 x1 : x0 :
[ 15.515426] Call trace:
[ 15.517863] Insufficient stack space to handle exception!
[ 15.517867] ESR: 0x9647 -- DABT (current EL)
[ 15.517871] FAR: 0x8a047ff0
[ 15.517873] Task stack: [0x8a048000..0x8a04c000]
[ 15.517877] IRQ stack: [0x88008000..0x8800c000]
[ 15.517880] Overflow stack: [0x7d9c1320..0x7d9c2320]
[ 15.517884] CPU: 1 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0-rc3-next-20230110 #1
[ 15.517890] Hardware name: Libre Computer AML-S905X-CC (DT)
[ 15.517895] Workqueue: events_unbound deferred_probe_work_func
[ 15.517915] pstate: 83c5 (Nzcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 15.517923] pc : el1_abort+0x4/0x5c
[ 15.517932] lr : el1h_64_sync_handler+0x60/0xac
[ 15.517939] sp : 8a048020
[ 15.517941] x29: 8a048020 x28: 01128000 x27: 08de5808
[ 15.517950] x26: 08de5800 x25: 8a04b608 x24: 01128000
[ 15.517957] x23: a0c5 x22: 880321dc x21: 8a048180
[ 15.517965] x20: 898e1000 x19: 8a048290 x18: 019bd700
[ 15.517972] x17: 0011 x16: x15: 0004
[ 15.517979] x14: 89bab410 x13: x12:
[ 15.517986] x11: 0030 x10: 89013a1c x9 : 890401a0
[ 15.517994] x8 : 0025 x7 : 205d363234353135 x6 : 352e35312020205b
[ 15.518001] x5 : 89f766b7 x4 : 88fe695c x3 : 000c
[ 15.518008] x2 : 9604 x1 : 9604 x0 : 8a048030
[ 15.518017] Kernel panic - not syncing: kernel stack overflow
[ 15.518020] SMP: stopping secondary CPUs
[ 15.518027] Kernel Offset: disabled
[ 15.518029] CPU features: 0x0,01000100,421b
[ 15.518034] Memory Limit: none
[ 15.679388] ---[ end Kernel panic - not syncing: kernel stack overflow ]---

[1] https://storage.kernelci.org/next/master/next-20230110/arm64/defconfig/clang-16/lab-broonie/kselftest-arm64-meson-gxl-s905x-libretech-cc.html