Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Tue, Dec 1, 2020 at 12:19 AM Greg Kroah-Hartman wrote:
>
> On Mon, Nov 30, 2020 at 12:12:39PM -0800, Nick Desaulniers wrote:
> > On Wed, Nov 25, 2020 at 10:38 PM Greg Kroah-Hartman wrote:
> > >
> > > Is the mainline 4.9 tree supposed to work with clang? I didn't think
> > > that upstream effort started until 4.19 or so.
> >
> > (For historical records, separate from the initial bug report that
> > started this thread)
> >
> > I consider 785f11aa595b ("kbuild: Add better clang cross build
> > support") to be the starting point of a renewed effort to upstream
> > clang support. 785f11aa595b landed in v4.12-rc1. I think most patches
> > landed between there and 4.15 (would have been my guess). From there,
> > support was backported to 4.14, 4.9, and 4.4 for x86_64 and aarch64.
> > We still have CI coverage of those branches+arches with Clang today.
> > Pixel 2 shipped with 4.4+clang, Pixel 3 and 3a with 4.9+clang, Pixel 4
> > and 4a with 4.14+clang. CrOS has also shipped clang built kernels
> > since 4.4+.
>
> Thanks for the info. Naresh, does this help explain why maybe testing
> these kernel branches with clang might not be the best thing to do?

On the contrary, I think it's very much worthwhile to test these
branches with Clang. Particularly since CrOS is shipping x86_64
devices built with Clang since 4.4.y. This looks like a problem
that's potentially been fixed but the fix not yet identified and
backported. It would be good for us to identify and fix the issue
before it becomes a problem for CrOS.

Though, it looks like CrOS just skipped 4.9...? Looking at:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+refs
I don't see a chromeos-4.9 branch. That said, I still find such
reports helpful to track.

--
Thanks,
~Nick Desaulniers
Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Tue, 1 Dec 2020 at 13:49, Greg Kroah-Hartman wrote:
>
> On Mon, Nov 30, 2020 at 12:12:39PM -0800, Nick Desaulniers wrote:
> >
> > (For historical records, separate from the initial bug report that
> > started this thread)
> >
> > I consider 785f11aa595b ("kbuild: Add better clang cross build
> > support") to be the starting point of a renewed effort to upstream
> > clang support. 785f11aa595b landed in v4.12-rc1. I think most patches
> > landed between there and 4.15 (would have been my guess). From there,
> > support was backported to 4.14, 4.9, and 4.4 for x86_64 and aarch64.
> > We still have CI coverage of those branches+arches with Clang today.
> > Pixel 2 shipped with 4.4+clang, Pixel 3 and 3a with 4.9+clang, Pixel 4
> > and 4a with 4.14+clang. CrOS has also shipped clang built kernels
> > since 4.4+.
>
> Thanks for the info. Naresh, does this help explain why maybe testing
> these kernel branches with clang might not be the best thing to do?

It is clear now.
FYI, with this note, LKFT will not test 4.14+clang and older branches.

- Naresh
Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Mon, Nov 30, 2020 at 12:12:39PM -0800, Nick Desaulniers wrote:
> On Wed, Nov 25, 2020 at 10:38 PM Greg Kroah-Hartman wrote:
> >
> > Is the mainline 4.9 tree supposed to work with clang? I didn't think
> > that upstream effort started until 4.19 or so.
>
> (For historical records, separate from the initial bug report that
> started this thread)
>
> I consider 785f11aa595b ("kbuild: Add better clang cross build
> support") to be the starting point of a renewed effort to upstream
> clang support. 785f11aa595b landed in v4.12-rc1. I think most patches
> landed between there and 4.15 (would have been my guess). From there,
> support was backported to 4.14, 4.9, and 4.4 for x86_64 and aarch64.
> We still have CI coverage of those branches+arches with Clang today.
> Pixel 2 shipped with 4.4+clang, Pixel 3 and 3a with 4.9+clang, Pixel 4
> and 4a with 4.14+clang. CrOS has also shipped clang built kernels
> since 4.4+.

Thanks for the info. Naresh, does this help explain why maybe testing
these kernel branches with clang might not be the best thing to do?

greg k-h
Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Wed, Nov 25, 2020 at 10:38 PM Greg Kroah-Hartman wrote:
>
> Is the mainline 4.9 tree supposed to work with clang? I didn't think
> that upstream effort started until 4.19 or so.

(For historical records, separate from the initial bug report that
started this thread)

I consider 785f11aa595b ("kbuild: Add better clang cross build
support") to be the starting point of a renewed effort to upstream
clang support. 785f11aa595b landed in v4.12-rc1. I think most patches
landed between there and 4.15 (would have been my guess). From there,
support was backported to 4.14, 4.9, and 4.4 for x86_64 and aarch64.
We still have CI coverage of those branches+arches with Clang today.
Pixel 2 shipped with 4.4+clang, Pixel 3 and 3a with 4.9+clang, Pixel 4
and 4a with 4.14+clang. CrOS has also shipped clang built kernels
since 4.4+.

--
Thanks,
~Nick Desaulniers
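A claim such as "785f11aa595b landed in v4.12-rc1" can be checked against a local clone with `git describe --contains`, which names a commit relative to the nearest tag that includes it. The sketch below wraps that command and strips the ancestry suffix from its output. The helper names and the sample string are hypothetical, and `first_tag_containing` assumes a clone of the mainline tree is available; it is an illustration, not something taken from the message above.

```python
import re
import subprocess


def base_tag(described: str) -> str:
    """Strip the ~N / ^N ancestry suffix from `git describe --contains` output."""
    return re.split(r"[~^]", described.strip(), maxsplit=1)[0]


def first_tag_containing(commit: str, repo: str = ".") -> str:
    """Return the tag git names as containing `commit` (needs a local clone)."""
    out = subprocess.run(
        ["git", "-C", repo, "describe", "--contains", commit],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return base_tag(out)


# Illustrative input only: the ancestry suffix here is made up.
print(base_tag("v4.12-rc1~64^2~7"))
```

With a kernel checkout at hand, `first_tag_containing("785f11aa595b", "/path/to/linux")` would run the real query; the path is a placeholder.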
Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Thu, Nov 26, 2020 at 07:39:33AM +0100, Greg Kroah-Hartman wrote:
> On Thu, Nov 26, 2020 at 10:14:43AM +0530, Naresh Kamboju wrote:
> > Linaro recently started building and testing with stable branches with
> > clang.
> > Stable 4.9 branch kernel built with clang 10 boot crashed on x86 and
> > qemu_x86.
> > We do not have base line results to compare with.
> >
> > steps to build and boot:
> > # build kernel with tuxmake
> > # sudo pip3 install -U tuxmake
> > # tuxmake --runtime docker --target-arch x86 --toolchain clang-10
> > --kconfig defconfig --kconfig-add
> > https://builds.tuxbuild.com/1kgtX7QEDmhvj6OfbZBdlGaEple/config
> > # boot qemu_x86_64
> > # /usr/bin/qemu-system-x86_64 -cpu host -enable-kvm -nographic -net
> > nic,model=virtio,macaddr=DE:AD:BE:EF:66:14 -net tap -m 1024 -monitor
> > none -kernel kernel/bzImage --append "root=/dev/sda rootwait
> > console=ttyS0,115200" -hda
> > rootfs/rpb-console-image-lkft-intel-corei7-64-20201022181159-3085.rootfs.ext4
> > -m 4096 -smp 4 -nographic
> >
> > Crash log:
> > ---
> > [ 14.121499] Freeing unused kernel memory: 1896K
> > [ 14.126962] random: fast init done
> > [ 14.206005] PANIC: double fault, error_code: 0x0
> > [ 14.210633] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
> > [ 14.216809] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [ 14.224196] task: 88026e2c task.stack: c902
> > [ 14.230105] RIP: 0010:[] []
> > proc_dostring+0x13b/0x1e0
> > [ 14.238374] RSP: 0018:000c EFLAGS: 00010297
> > [ 14.243676] RAX: 5638939fb850 RBX: 000c RCX:
> > 5638939fb850
> > [ 14.250799] RDX: 000c RSI: RDI:
> > 007f
> > [ 14.257925] RBP: c9023d98 R08: c9023ef8 R09:
> > 5638939fb850
> > [ 14.265049] R10: R11: 8117f9e0 R12:
> > 82479cf0
> > [ 14.272171] R13: c9023ef8 R14: c9023dd8 R15:
> > 007f
> > [ 14.279298] FS: 7f57fbce8840() GS:88027788()
> > knlGS:
> > [ 14.287384] CS: 0010 DS: ES: CR0: 80050033
> > [ 14.293120] CR2: fff8 CR3: 00026d58a000 CR4:
> > 00360670
> > [ 14.300243] DR0: DR1: DR2:
> >
> > [ 14.307368] DR3: DR6: fffe0ff0 DR7:
> > 0400
> > [ 14.314491] Stack:
> > [ 14.316504] Call Trace:
> > [ 14.318955] Code: c3 49 8b 10 31 f6 48 01 da 49 89 10 49 83 3e 00
> > 74 49 41 83 c7 ff 49 63 ff 4c 89 c9 0f 1f 40 00 48 39 fe 73 36 48 89
> > c8 48 89 dc b0 9d 3a 00 85 c0 0f 85 8c 00 00 00 84 d2 74 1f 80 fa
> > 0a 74
> > [ 14.338906] Kernel panic - not syncing: Machine halted.
> > [ 14.344123] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
> > [ 14.350291] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [ 14.357677] 880277888e80 81518ae9 880277888e98
> > 82971a10
> > [ 14.365129] 000f 0086
> > 820c5d57
> > [ 14.372584] 880277888f08 81175736 0038
> > 880277888f18
> > [ 14.380038] Call Trace:
> > [ 14.382481] <#DF> [ 14.384406] []
> > dump_stack+0xa9/0x100
> > [ 14.389641] [] panic+0xe6/0x2a0
> > [ 14.394432] [] df_debug+0x31/0x40
> > [ 14.399389] [] do_double_fault+0x102/0x140
> > [ 14.405128] [] double_fault+0x27/0x30
> > [ 14.410440] [] ? proc_put_long+0xc0/0xc0
> > [ 14.416004] [] ? proc_dostring+0x13b/0x1e0
> > [ 14.421739] [ 14.423703] Kernel Offset: disabled
> > [ 14.427209] ---[ end Kernel panic - not syncing: Machine halted.
> >
> > Reported-by: Naresh Kamboju
> >
> > full test log,
> > https://lkft.validation.linaro.org/scheduler/job/1978901#L916
> > https://lkft.validation.linaro.org/scheduler/job/1980839#L578
>
> Is the mainline 4.9 tree supposed to work with clang? I didn't think
> that upstream effort started until 4.19 or so.
>
> thanks,
>
> greg k-h
>

We have been building and boot testing the mainline 4.9 tree for quite
some time. This issue appears to be exposed by the rootfs that Linaro is
using for testing; ours is incredibly simple (prints the version string
then shuts down, there is no systemd or complex init).

Some initial notes, I am not sure how much time I will have to look at
this in the near future:

1. This does not happen with the same configuration file on
   linux-4.14.y.

2. This happens with the latest version of clang on linux-4.9.y.

3. Bisecting v4.9 to v4.14 will be rather difficult because clang
   support was backported to 4.9 somewhere in the 130s.

There could be a clang backport missing or a bug was unintentionally
fixed somewhere else.

Cheers,
Nathan
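Note 3 implies that any bisection has to start from a 4.9.y release that clang can already build. As a rough sketch of narrowing down such starting points, the snippet below orders stable tags numerically and keeps those at or above a chosen sublevel. The function names are hypothetical, the tag list is a made-up sample, and the cut-off of 130 is only a placeholder for "somewhere in the 130s", not a verified boundary.

```python
import re

# Matches release tags such as v4.9, v4.14 or v4.9.246 (no -rc suffixes).
TAG_RE = re.compile(r"^v(\d+)\.(\d+)(?:\.(\d+))?$")


def tag_key(tag):
    """Turn a release tag into a (major, minor, sublevel) tuple for sorting."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"not a release tag: {tag!r}")
    major, minor, sub = m.groups()
    return (int(major), int(minor), int(sub or 0))


def candidates(tags, series=(4, 9), min_sublevel=130):
    """Tags of one stable series at or above min_sublevel, oldest first."""
    keyed = sorted((tag_key(t), t) for t in tags if TAG_RE.match(t))
    return [t for k, t in keyed if k[:2] == series and k[2] >= min_sublevel]


# Sample input; in a real tree the list could come from `git tag --list`.
sample = ["v4.9.246", "v4.9.13", "v4.9.130", "v4.14", "v4.9", "v4.9.139"]
print(candidates(sample))
```

Sorting on the numeric tuple avoids the usual pitfall of plain string ordering, where v4.9.13 would sort next to v4.9.130.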
Re: [stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
On Thu, Nov 26, 2020 at 10:14:43AM +0530, Naresh Kamboju wrote:
> Linaro recently started building and testing with stable branches with clang.
> Stable 4.9 branch kernel built with clang 10 boot crashed on x86 and qemu_x86.
> We do not have base line results to compare with.
>
> steps to build and boot:
> # build kernel with tuxmake
> # sudo pip3 install -U tuxmake
> # tuxmake --runtime docker --target-arch x86 --toolchain clang-10
> --kconfig defconfig --kconfig-add
> https://builds.tuxbuild.com/1kgtX7QEDmhvj6OfbZBdlGaEple/config
> # boot qemu_x86_64
> # /usr/bin/qemu-system-x86_64 -cpu host -enable-kvm -nographic -net
> nic,model=virtio,macaddr=DE:AD:BE:EF:66:14 -net tap -m 1024 -monitor
> none -kernel kernel/bzImage --append "root=/dev/sda rootwait
> console=ttyS0,115200" -hda
> rootfs/rpb-console-image-lkft-intel-corei7-64-20201022181159-3085.rootfs.ext4
> -m 4096 -smp 4 -nographic
>
> Crash log:
> ---
> [ 14.121499] Freeing unused kernel memory: 1896K
> [ 14.126962] random: fast init done
> [ 14.206005] PANIC: double fault, error_code: 0x0
> [ 14.210633] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
> [ 14.216809] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [ 14.224196] task: 88026e2c task.stack: c902
> [ 14.230105] RIP: 0010:[] []
> proc_dostring+0x13b/0x1e0
> [ 14.238374] RSP: 0018:000c EFLAGS: 00010297
> [ 14.243676] RAX: 5638939fb850 RBX: 000c RCX:
> 5638939fb850
> [ 14.250799] RDX: 000c RSI: RDI:
> 007f
> [ 14.257925] RBP: c9023d98 R08: c9023ef8 R09:
> 5638939fb850
> [ 14.265049] R10: R11: 8117f9e0 R12:
> 82479cf0
> [ 14.272171] R13: c9023ef8 R14: c9023dd8 R15:
> 007f
> [ 14.279298] FS: 7f57fbce8840() GS:88027788()
> knlGS:
> [ 14.287384] CS: 0010 DS: ES: CR0: 80050033
> [ 14.293120] CR2: fff8 CR3: 00026d58a000 CR4:
> 00360670
> [ 14.300243] DR0: DR1: DR2:
>
> [ 14.307368] DR3: DR6: fffe0ff0 DR7:
> 0400
> [ 14.314491] Stack:
> [ 14.316504] Call Trace:
> [ 14.318955] Code: c3 49 8b 10 31 f6 48 01 da 49 89 10 49 83 3e 00
> 74 49 41 83 c7 ff 49 63 ff 4c 89 c9 0f 1f 40 00 48 39 fe 73 36 48 89
> c8 48 89 dc b0 9d 3a 00 85 c0 0f 85 8c 00 00 00 84 d2 74 1f 80 fa
> 0a 74
> [ 14.338906] Kernel panic - not syncing: Machine halted.
> [ 14.344123] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
> [ 14.350291] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [ 14.357677] 880277888e80 81518ae9 880277888e98
> 82971a10
> [ 14.365129] 000f 0086
> 820c5d57
> [ 14.372584] 880277888f08 81175736 0038
> 880277888f18
> [ 14.380038] Call Trace:
> [ 14.382481] <#DF> [ 14.384406] []
> dump_stack+0xa9/0x100
> [ 14.389641] [] panic+0xe6/0x2a0
> [ 14.394432] [] df_debug+0x31/0x40
> [ 14.399389] [] do_double_fault+0x102/0x140
> [ 14.405128] [] double_fault+0x27/0x30
> [ 14.410440] [] ? proc_put_long+0xc0/0xc0
> [ 14.416004] [] ? proc_dostring+0x13b/0x1e0
> [ 14.421739] [ 14.423703] Kernel Offset: disabled
> [ 14.427209] ---[ end Kernel panic - not syncing: Machine halted.
>
> Reported-by: Naresh Kamboju
>
> full test log,
> https://lkft.validation.linaro.org/scheduler/job/1978901#L916
> https://lkft.validation.linaro.org/scheduler/job/1980839#L578

Is the mainline 4.9 tree supposed to work with clang? I didn't think
that upstream effort started until 4.19 or so.

thanks,

greg k-h
[stable 4.9] PANIC: double fault, error_code: 0x0 - clang boot failed on x86_64
Linaro recently started building and testing with stable branches with clang.
Stable 4.9 branch kernel built with clang 10 boot crashed on x86 and qemu_x86.
We do not have base line results to compare with.

steps to build and boot:
# build kernel with tuxmake
# sudo pip3 install -U tuxmake
# tuxmake --runtime docker --target-arch x86 --toolchain clang-10 --kconfig defconfig --kconfig-add https://builds.tuxbuild.com/1kgtX7QEDmhvj6OfbZBdlGaEple/config
# boot qemu_x86_64
# /usr/bin/qemu-system-x86_64 -cpu host -enable-kvm -nographic -net nic,model=virtio,macaddr=DE:AD:BE:EF:66:14 -net tap -m 1024 -monitor none -kernel kernel/bzImage --append "root=/dev/sda rootwait console=ttyS0,115200" -hda rootfs/rpb-console-image-lkft-intel-corei7-64-20201022181159-3085.rootfs.ext4 -m 4096 -smp 4 -nographic

Crash log:
---
[ 14.121499] Freeing unused kernel memory: 1896K
[ 14.126962] random: fast init done
[ 14.206005] PANIC: double fault, error_code: 0x0
[ 14.210633] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
[ 14.216809] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[ 14.224196] task: 88026e2c task.stack: c902
[ 14.230105] RIP: 0010:[] [] proc_dostring+0x13b/0x1e0
[ 14.238374] RSP: 0018:000c EFLAGS: 00010297
[ 14.243676] RAX: 5638939fb850 RBX: 000c RCX: 5638939fb850
[ 14.250799] RDX: 000c RSI: RDI: 007f
[ 14.257925] RBP: c9023d98 R08: c9023ef8 R09: 5638939fb850
[ 14.265049] R10: R11: 8117f9e0 R12: 82479cf0
[ 14.272171] R13: c9023ef8 R14: c9023dd8 R15: 007f
[ 14.279298] FS: 7f57fbce8840() GS:88027788() knlGS:
[ 14.287384] CS: 0010 DS: ES: CR0: 80050033
[ 14.293120] CR2: fff8 CR3: 00026d58a000 CR4: 00360670
[ 14.300243] DR0: DR1: DR2:
[ 14.307368] DR3: DR6: fffe0ff0 DR7: 0400
[ 14.314491] Stack:
[ 14.316504] Call Trace:
[ 14.318955] Code: c3 49 8b 10 31 f6 48 01 da 49 89 10 49 83 3e 00 74 49 41 83 c7 ff 49 63 ff 4c 89 c9 0f 1f 40 00 48 39 fe 73 36 48 89 c8 48 89 dc b0 9d 3a 00 85 c0 0f 85 8c 00 00 00 84 d2 74 1f 80 fa 0a 74
[ 14.338906] Kernel panic - not syncing: Machine halted.
[ 14.344123] CPU: 1 PID: 1 Comm: systemd Not tainted 4.9.246-rc1 #2
[ 14.350291] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[ 14.357677] 880277888e80 81518ae9 880277888e98 82971a10
[ 14.365129] 000f 0086 820c5d57
[ 14.372584] 880277888f08 81175736 0038 880277888f18
[ 14.380038] Call Trace:
[ 14.382481] <#DF> [ 14.384406] [] dump_stack+0xa9/0x100
[ 14.389641] [] panic+0xe6/0x2a0
[ 14.394432] [] df_debug+0x31/0x40
[ 14.399389] [] do_double_fault+0x102/0x140
[ 14.405128] [] double_fault+0x27/0x30
[ 14.410440] [] ? proc_put_long+0xc0/0xc0
[ 14.416004] [] ? proc_dostring+0x13b/0x1e0
[ 14.421739] [ 14.423703] Kernel Offset: disabled
[ 14.427209] ---[ end Kernel panic - not syncing: Machine halted.

Reported-by: Naresh Kamboju

full test log,
https://lkft.validation.linaro.org/scheduler/job/1978901#L916
https://lkft.validation.linaro.org/scheduler/job/1980839#L578

--
Linaro LKFT
https://lkft.linaro.org
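When comparing crash logs like the one above across branches or toolchains, it can help to pull out just the `symbol+0xoffset/0xsize` entries. The following is a minimal sketch of such a parser; the function name and the excerpt constant are hypothetical, and the excerpt is copied from the log in this report. A leading `?` in a kernel trace marks an address found on the stack that the unwinder could not confirm as part of the call chain, so those frames are flagged as unreliable.

```python
import re

# "symbol+0xOFFSET/0xSIZE", optionally preceded by "? " for unverified frames.
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(log):
    """Extract symbol, offset, size and reliability for each trace entry."""
    frames = []
    for unreliable, symbol, offset, size in FRAME_RE.findall(log):
        frames.append({
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
            "reliable": not unreliable,
        })
    return frames


# Lines copied from the crash log in this report.
EXCERPT = """\
[ 14.230105] RIP: 0010:[] [] proc_dostring+0x13b/0x1e0
[ 14.384406] [] dump_stack+0xa9/0x100
[ 14.405128] [] double_fault+0x27/0x30
[ 14.410440] [] ? proc_put_long+0xc0/0xc0
"""

for frame in parse_frames(EXCERPT):
    marker = "" if frame["reliable"] else " (unverified)"
    print(f'{frame["symbol"]}+{frame["offset"]:#x}{marker}')
```

The first entry is the RIP line, which places the fault at offset 0x13b inside `proc_dostring`.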