Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
On Tue, 2016-11-10 at 07:47:56 UTC, Nicholas Piggin wrote: > On Mon, 10 Oct 2016 07:15:11 -0700 > Guenter Roeck wrote: > > > On 10/09/2016 10:49 PM, Nicholas Piggin wrote: > > > On Sun, 9 Oct 2016 08:21:21 -0700 > > > Guenter Roeck wrote: > > > > > >> Nicholas, > > >> > > >> some of my qemu tests for ppc64 started failing on mainline (and -next). > > >> You can find a test log at > > >> http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio > > >> > > >> The scripts to run the test are available at > > >> https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 > > >> > > >> Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate > > >> Alignment 0x600 > > >> interrupt"). Bisect log is attached. > > >> > > >> Since I don't have the means to run the code on a real system, I have no > > >> idea > > >> if the problem is caused by qemu or by the code. It is interesting, > > >> though, that > > >> only the 'mac99' tests are affected. > > >> > > >> Please let me know if there is anything I can do to help tracking down > > >> the > > >> problem. > > > > > > Thanks for this. That patch just moves a small amount of code, so it's > > > likely > > > that it's caused something to get placed out of range of its caller, or > > > the > > > linker started generating a stub for some reason. I can't immediately see > > > the > > > problem, but it could be specific to your exact toolchain. > > > > > > Something that might help, would you be able to put the compiled vmlinux > > > binaries > > > from before/after the bad patch somewhere I can grab them? > > > > > > > http://server.roeck-us.net/qemu/ppc64/mac99/ > > > > 'bad' is at f9aa67142ef26, 'good' is one commit earlier, 'tot' is from top > > of tree > > (b66484cd7470, more specifically). > > > > Key difference in System.map, from the bad case: > > > > c0005c00 T __end_interrupts > > c0007000 t end_virt_trampolines > > c0008000 t 0010.long_branch.power4_fixup_nap+0 > > c0008100 t fs_label > > c0008100 t start_text > > > > 0010.long_branch.power4_fixup_nap+0 does not exist in the good case, > > and fs_label/start_text are at c0008000. > > > > Toolchain is from poky 1.5.1, which uses gcc 4.8.1 and binutils 2.23.2. > > I also tried with the toolchain from poky 1.6, using gcc 4.8.2 and binutils > > 2.24, > > with the same result. > > Thank you for the quick response, this points to the exact problem. > I've attached a patch which should fix the bug. There are some checks > I've got planned that will catch this type of thing at build time and > be much easier to track down. > > Thanks, > Nick > > From: Nicholas Piggin > Date: Tue, 11 Oct 2016 18:33:26 +1100 > Subject: [PATCH] powerpc/64s: fix power4_fixup_nap placement > > power4_fixup_nap is called from the "common" handlers, not the virt/real > handlers, therefore it should itself be a common handler. Placing it > down in the trampoline space caused it to go out of reach of its > callers, requiring a trampoline inserted at the start of the text > section, which breaks the fixed section address calculations. > > Reported-by: Guenter Roeck > Signed-off-by: Nicholas Piggin Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/7c8cb4b50f3cc6f4a8f7bfddad6fb5 cheers
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
On 10/11/2016 12:47 AM, Nicholas Piggin wrote: On Mon, 10 Oct 2016 07:15:11 -0700 Guenter Roeck wrote: On 10/09/2016 10:49 PM, Nicholas Piggin wrote: On Sun, 9 Oct 2016 08:21:21 -0700 Guenter Roeck wrote: Nicholas, some of my qemu tests for ppc64 started failing on mainline (and -next). You can find a test log at http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio The scripts to run the test are available at https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment 0x600 interrupt"). Bisect log is attached. Since I don't have the means to run the code on a real system, I have no idea if the problem is caused by qemu or by the code. It is interesting, though, that only the 'mac99' tests are affected. Please let me know if there is anything I can do to help tracking down the problem. Thanks for this. That patch just moves a small amount of code, so it's likely that it's caused something to get placed out of range of its caller, or the linker started generating a stub for some reason. I can't immediately see the problem, but it could be specific to your exact toolchain. Something that might help, would you be able to put the compiled vmlinux binaries from before/after the bad patch somewhere I can grab them? http://server.roeck-us.net/qemu/ppc64/mac99/ 'bad' is at f9aa67142ef26, 'good' is one commit earlier, 'tot' is from top of tree (b66484cd7470, more specifically). Key difference in System.map, from the bad case: c0005c00 T __end_interrupts c0007000 t end_virt_trampolines c0008000 t 0010.long_branch.power4_fixup_nap+0 c0008100 t fs_label c0008100 t start_text 0010.long_branch.power4_fixup_nap+0 does not exist in the good case, and fs_label/start_text are at c0008000. Toolchain is from poky 1.5.1, which uses gcc 4.8.1 and binutils 2.23.2. I also tried with the toolchain from poky 1.6, using gcc 4.8.2 and binutils 2.24, with the same result. Thank you for the quick response, this points to the exact problem. I've attached a patch which should fix the bug. There are some checks I've got planned that will catch this type of thing at build time and be much easier to track down. Thanks, Nick From: Nicholas Piggin Date: Tue, 11 Oct 2016 18:33:26 +1100 Subject: [PATCH] powerpc/64s: fix power4_fixup_nap placement power4_fixup_nap is called from the "common" handlers, not the virt/real handlers, therefore it should itself be a common handler. Placing it down in the trampoline space caused it to go out of reach of its callers, requiring a trampoline inserted at the start of the text section, which breaks the fixed section address calculations. Reported-by: Guenter Roeck Signed-off-by: Nicholas Piggin Yes, that works. Tested-by: Guenter Roeck --- arch/powerpc/kernel/exceptions-64s.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 08992f8..f129408 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1377,7 +1377,7 @@ __end_interrupts: DEFINE_FIXED_SYMBOL(__end_interrupts) #ifdef CONFIG_PPC_970_NAP -TRAMP_REAL_BEGIN(power4_fixup_nap) +EXC_COMMON_BEGIN(power4_fixup_nap) andcr9,r9,r10 std r9,TI_LOCAL_FLAGS(r11) ld r10,_LINK(r1) /* make idle task do the */
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
On Mon, 10 Oct 2016 07:15:11 -0700 Guenter Roeck wrote: > On 10/09/2016 10:49 PM, Nicholas Piggin wrote: > > On Sun, 9 Oct 2016 08:21:21 -0700 > > Guenter Roeck wrote: > > > >> Nicholas, > >> > >> some of my qemu tests for ppc64 started failing on mainline (and -next). > >> You can find a test log at > >> http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio > >> > >> The scripts to run the test are available at > >> https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 > >> > >> Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment > >> 0x600 > >> interrupt"). Bisect log is attached. > >> > >> Since I don't have the means to run the code on a real system, I have no > >> idea > >> if the problem is caused by qemu or by the code. It is interesting, > >> though, that > >> only the 'mac99' tests are affected. > >> > >> Please let me know if there is anything I can do to help tracking down the > >> problem. > > > > Thanks for this. That patch just moves a small amount of code, so it's > > likely > > that it's caused something to get placed out of range of its caller, or the > > linker started generating a stub for some reason. I can't immediately see > > the > > problem, but it could be specific to your exact toolchain. > > > > Something that might help, would you be able to put the compiled vmlinux > > binaries > > from before/after the bad patch somewhere I can grab them? > > > > http://server.roeck-us.net/qemu/ppc64/mac99/ > > 'bad' is at f9aa67142ef26, 'good' is one commit earlier, 'tot' is from top of > tree > (b66484cd7470, more specifically). > > Key difference in System.map, from the bad case: > > c0005c00 T __end_interrupts > c0007000 t end_virt_trampolines > c0008000 t 0010.long_branch.power4_fixup_nap+0 > c0008100 t fs_label > c0008100 t start_text > > 0010.long_branch.power4_fixup_nap+0 does not exist in the good case, > and fs_label/start_text are at c0008000. > > Toolchain is from poky 1.5.1, which uses gcc 4.8.1 and binutils 2.23.2. > I also tried with the toolchain from poky 1.6, using gcc 4.8.2 and binutils > 2.24, > with the same result. Thank you for the quick response, this points to the exact problem. I've attached a patch which should fix the bug. There are some checks I've got planned that will catch this type of thing at build time and be much easier to track down. Thanks, Nick From: Nicholas Piggin Date: Tue, 11 Oct 2016 18:33:26 +1100 Subject: [PATCH] powerpc/64s: fix power4_fixup_nap placement power4_fixup_nap is called from the "common" handlers, not the virt/real handlers, therefore it should itself be a common handler. Placing it down in the trampoline space caused it to go out of reach of its callers, requiring a trampoline inserted at the start of the text section, which breaks the fixed section address calculations. Reported-by: Guenter Roeck Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 08992f8..f129408 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1377,7 +1377,7 @@ __end_interrupts: DEFINE_FIXED_SYMBOL(__end_interrupts) #ifdef CONFIG_PPC_970_NAP -TRAMP_REAL_BEGIN(power4_fixup_nap) +EXC_COMMON_BEGIN(power4_fixup_nap) andcr9,r9,r10 std r9,TI_LOCAL_FLAGS(r11) ld r10,_LINK(r1) /* make idle task do the */ -- 2.9.3
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
On 10/09/2016 10:49 PM, Nicholas Piggin wrote: On Sun, 9 Oct 2016 08:21:21 -0700 Guenter Roeck wrote: Nicholas, some of my qemu tests for ppc64 started failing on mainline (and -next). You can find a test log at http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio The scripts to run the test are available at https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment 0x600 interrupt"). Bisect log is attached. Since I don't have the means to run the code on a real system, I have no idea if the problem is caused by qemu or by the code. It is interesting, though, that only the 'mac99' tests are affected. Please let me know if there is anything I can do to help tracking down the problem. Thanks for this. That patch just moves a small amount of code, so it's likely that it's caused something to get placed out of range of its caller, or the linker started generating a stub for some reason. I can't immediately see the problem, but it could be specific to your exact toolchain. Something that might help, would you be able to put the compiled vmlinux binaries from before/after the bad patch somewhere I can grab them? http://server.roeck-us.net/qemu/ppc64/mac99/ 'bad' is at f9aa67142ef26, 'good' is one commit earlier, 'tot' is from top of tree (b66484cd7470, more specifically). Key difference in System.map, from the bad case: c0005c00 T __end_interrupts c0007000 t end_virt_trampolines c0008000 t 0010.long_branch.power4_fixup_nap+0 c0008100 t fs_label c0008100 t start_text 0010.long_branch.power4_fixup_nap+0 does not exist in the good case, and fs_label/start_text are at c0008000. Toolchain is from poky 1.5.1, which uses gcc 4.8.1 and binutils 2.23.2. I also tried with the toolchain from poky 1.6, using gcc 4.8.2 and binutils 2.24, with the same result. Guenter
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
Hi Michael, On 10/09/2016 11:00 PM, Michael Ellerman wrote: Guenter Roeck writes: Nicholas, some of my qemu tests for ppc64 started failing on mainline (and -next). You can find a test log at http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio The scripts to run the test are available at https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment 0x600 interrupt"). Bisect log is attached. Since I don't have the means to run the code on a real system, I have no idea if the problem is caused by qemu or by the code. It is interesting, though, that only the 'mac99' tests are affected. Thanks for the report. I do test qemu mac99 as part of my CI, and it's booting OK, so there must be some config difference? I'm using pmac32_defconfig plus: CONFIG_LOCALVERSION_AUTO=y CONFIG_SERIAL_PMACZILOG=y CONFIG_SERIAL_PMACZILOG_TTYS=n CONFIG_SERIAL_PMACZILOG_CONSOLE=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y And running with: $ qemu-system-ppc -nographic -vga none -M mac99 -m 1G -kernel vmlinux -initrd $HOME/static/ppc32-initrd.gz -append "console=ttyPZ0 init=/bin/sh" I am building a 64bit image, and my test runs with qemu-system-ppc64. My 32bit tests for mac99 are fine as well. Guenter
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
Guenter Roeck writes: > Nicholas, > > some of my qemu tests for ppc64 started failing on mainline (and -next). > You can find a test log at > http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio > > The scripts to run the test are available at > https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 > > Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment > 0x600 > interrupt"). Bisect log is attached. > > Since I don't have the means to run the code on a real system, I have no idea > if the problem is caused by qemu or by the code. It is interesting, though, > that > only the 'mac99' tests are affected. Thanks for the report. I do test qemu mac99 as part of my CI, and it's booting OK, so there must be some config difference? I'm using pmac32_defconfig plus: CONFIG_LOCALVERSION_AUTO=y CONFIG_SERIAL_PMACZILOG=y CONFIG_SERIAL_PMACZILOG_TTYS=n CONFIG_SERIAL_PMACZILOG_CONSOLE=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y And running with: $ qemu-system-ppc -nographic -vga none -M mac99 -m 1G -kernel vmlinux -initrd $HOME/static/ppc32-initrd.gz -append "console=ttyPZ0 init=/bin/sh" cheers
Re: ppc64 qemu test failure since commit f9aa67142 ("powerpc/64s: Consolidate Alignment 0x600 interrupt")
On Sun, 9 Oct 2016 08:21:21 -0700 Guenter Roeck wrote: > Nicholas, > > some of my qemu tests for ppc64 started failing on mainline (and -next). > You can find a test log at > http://kerneltests.org/builders/qemu-ppc64-master/builds/580/steps/qemubuildcommand/logs/stdio > > The scripts to run the test are available at > https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64 > > Bisect points to commit f9aa67142ef26 ("powerpc/64s: Consolidate Alignment > 0x600 > interrupt"). Bisect log is attached. > > Since I don't have the means to run the code on a real system, I have no idea > if the problem is caused by qemu or by the code. It is interesting, though, > that > only the 'mac99' tests are affected. > > Please let me know if there is anything I can do to help tracking down the > problem. Thanks for this. That patch just moves a small amount of code, so it's likely that it's caused something to get placed out of range of its caller, or the linker started generating a stub for some reason. I can't immediately see the problem, but it could be specific to your exact toolchain. Something that might help, would you be able to put the compiled vmlinux binaries from before/after the bad patch somewhere I can grab them? Thanks, Nick