Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
Hi Andi,

On Mon, Nov 13, 2017 at 10:52:27AM -0800, Andi Kleen wrote:
> > > It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable
> > > to see the inlining information,
> > >
> > > Using OPTIMIZE_INLINING is fine.
> >
> > Good to know that!
>
> It works for me. Perhaps your binutils is too old? It was added at some
> point. Can you try upgrading?
>
> % ./linux/scripts/faddr2line obj/vmlinux schedule+10
> schedule+10/0x80:
> schedule at arch/x86/include/asm/current.h:15
> % addr2line --version
> GNU addr2line version 2.27-24.fc26

I use Debian and tried addr2line on two systems:

GNU addr2line (GNU Binutils for Debian) 2.28
GNU addr2line (GNU Binutils for Debian) 2.29.1

Regards,
Fengguang
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
> I wonder if there is some way to use the split format for the
> intermediate files, but then for the very final link bring them all in
> and make the end result be a traditional single binary. I'm not
> talking the separate "dwp" package that packs multiple dwo files into
> one, but to actually link them all in at the one.
>
> Sure, it would lose some of the advantage, but I think a large portion
> of the -gsplit-dwarf advantage is about the intermediate state. No?

Not sure it's worth doing complicated workarounds. I assume it will not
be that difficult to fix binutils (after all, gdb works), and disabling
the option is a reasonable workaround.

> I tried to google for it, but couldn't find anything. But apparently
> elfutils doesn't support dwo files either. It seems mainly the linker
> and gdb itself that supports it.

The original design document is
https://gcc.gnu.org/wiki/DebugFission

-Andi
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, Nov 13, 2017 at 1:41 PM, Andi Kleen wrote:
>
> It seems to be broken for normal programs too

Ok.

I wonder if there is some way to use the split format for the
intermediate files, but then for the very final link bring them all in
and make the end result be a traditional single binary. I'm not
talking the separate "dwp" package that packs multiple dwo files into
one, but to actually link them all in at the one.

Sure, it would lose some of the advantage, but I think a large portion
of the -gsplit-dwarf advantage is about the intermediate state. No?

I tried to google for it, but couldn't find anything. But apparently
elfutils doesn't support dwo files either. It seems mainly the linker
and gdb itself that supports it.

Linus
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, Nov 13, 2017 at 12:56:31PM -0800, Linus Torvalds wrote:
> On Mon, Nov 13, 2017 at 12:10 PM, Andi Kleen wrote:
> >
> > You're right. It works for line information, but strangely not for
> > inlines. I assume it can be fixed.
>
> So I'm not 100% sure it's strictly an addr2line bug.

It seems to be broken for normal programs too

$ cat tinline.c
int i;

static inline int finline(void)
{
        i++;
}

main()
{
        finline();
}
$ gcc -O2 -gsplit-dwarf tinline.c
$ addr2line -i -e a.out 0x4003b0
/home/ak/tsrc/tinline.c:6
$ gcc -O2 -g tinline.c
$ addr2line -i -e a.out 0x4003b0
/home/ak/tsrc/tinline.c:6
/home/ak/tsrc/tinline.c:12
$

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=22434

-Andi
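The comparison in this message can be scripted so that it does not depend on a hard-coded address. The following is a minimal sketch, not part of the original thread: the helper names `inline_frames` and `compare_builds` and the output file names are hypothetical, and it assumes that the first instruction of `main` comes from the inlined function, as it appears to in the reproducer above.

```shell
#!/bin/sh
# Sketch: compare how many source locations `addr2line -i` reports for
# the same function in a -g build and in a -gsplit-dwarf build.

# inline_frames BINARY ADDRESS
# Print the number of locations reported for ADDRESS. A count of 1 means
# that no inline chain was resolved.
inline_frames() {
    addr2line -i -e "$1" "$2" | grep -c .
}

# compare_builds SOURCE.c
# Build SOURCE.c once per debug flag and print the frame count at main().
compare_builds() {
    src=$1
    for flag in -g -gsplit-dwarf; do
        out="repro$flag"    # hypothetical names: repro-g, repro-gsplit-dwarf
        gcc -O2 "$flag" -o "$out" "$src" || return 1
        # Take the address of main from the symbol table instead of
        # hard-coding it (assumption: main starts with the inlined code).
        addr=$(nm "$out" | awk '$3 == "main" { print "0x" $1 }')
        printf '%s: %s frame(s)\n' "$flag" "$(inline_frames "$out" "$addr")"
    done
}
```

Running `compare_builds tinline.c` after sourcing the file prints one line per build. If the two counts differ, the installed addr2line presumably still loses the inline chain for split DWARF.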
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, Nov 13, 2017 at 12:10 PM, Andi Kleen wrote:
>
> You're right. It works for line information, but strangely not for
> inlines. I assume it can be fixed.

So I'm not 100% sure it's strictly an addr2line bug.

We do more than just link the vmlinux file. There's that whole
complicated script for our final link in scripts/link-vmlinux.sh which
links in all the kallsyms information etc, but also does things like
the exception table sorting.

So it's possible that addr2line works fine on a normal executable, but
that we've done some munging that then makes it not really do the
right thing for vmlinux.

But it is also equally possible that addr2line simply doesn't
understand -gsplit-dwarf at all. I've never seen it used outside the
kernel.

Linus
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, Nov 13, 2017 at 12:10 PM, Andi Kleen wrote:
>> Put another way: the CONFIG_DEBUG_INFO_SPLIT option is useless. Yes,
>> it saves time and disk space, but does so at the expense of making all
>> the debug information unavailable to basic tools.
>
> You're right. It works for line information, but strangely not for
> inlines. I assume it can be fixed.
>

Is there a binutils bug report for this?

--
H.J.
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
> Put another way: the CONFIG_DEBUG_INFO_SPLIT option is useless. Yes,
> it saves time and disk space, but does so at the expense of making all
> the debug information unavailable to basic tools.

You're right. It works for line information, but strangely not for
inlines. I assume it can be fixed.

-Andi
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, Nov 13, 2017 at 10:52 AM, Andi Kleen wrote:
>
> It works for me.

No it clearly does not.

> % ./linux/scripts/faddr2line obj/vmlinux schedule+10
> schedule+10/0x80:
> schedule at arch/x86/include/asm/current.h:15

That's obviously garbage and the problem.

Just look at it. It claims it is in "schedule()", at line 15 in the
file arch/x86/include/asm/current.h.

It's bullshit, and it's almost entirely useless, because while you see
which line it was, you have no idea how it got there, which can be a
big problem particularly with the trivial inlines. _Which_ of the
different invocations of some atomic update was it?

The _real_ output is supposed to be like this:

    [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
    __schedule+0x314/0x840:
    rq_sched_info_arrive at kernel/sched/stats.h:12
     (inlined by) sched_info_arrive at kernel/sched/stats.h:99
     (inlined by) __sched_info_switch at kernel/sched/stats.h:151
     (inlined by) sched_info_switch at kernel/sched/stats.h:158
     (inlined by) prepare_task_switch at kernel/sched/core.c:2582
     (inlined by) context_switch at kernel/sched/core.c:2755
     (inlined by) __schedule at kernel/sched/core.c:3366

See how now it knows that __schedule is in kernel/sched/core.c, and
how it has inlined things and just how it ended up in that header file
and which inline function it was.

So your addr2line is equally broken, and doesn't give the right information.

Put another way: the CONFIG_DEBUG_INFO_SPLIT option is useless. Yes,
it saves time and disk space, but does so at the expense of making all
the debug information unavailable to basic tools.

Linus
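The symptom described in this message (a location in a header file with no "(inlined by)" lines after it) can be picked out of faddr2line output mechanically. The following is a rough heuristic sketch, not something proposed in the thread: the function name `lone_header_frames` is hypothetical, and a single frame in a `.h` file is only a hint that inline information is missing, not proof.

```shell
#!/bin/sh
# Sketch: read faddr2line output on stdin and print every lookup whose
# only reported location is in a header file.

lone_header_frames() {
    awk '
        # Emit the previous lookup if it had exactly one location and
        # that location is in a .h file.
        function flush() {
            if (n == 1 && first ~ /\.h:[0-9]+/)
                print header " " first
        }
        # A line ending in ":" (e.g. "__schedule+0x314/0x840:") starts
        # a new lookup.
        /:$/   { flush(); header = $0; n = 0; next }
        # Location lines, with or without "(inlined by)", contain " at ".
        / at / { n++; if (n == 1) first = $0 }
        END    { flush() }
    '
}
```

It could be used as `./scripts/faddr2line vmlinux schedule+10 | lone_header_frames`. For the two outputs quoted above, the first lookup would be printed and the second would not.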
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
> > It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable
> > to see the inlining information,
> >
> > Using OPTIMIZE_INLINING is fine.
>
> Good to know that!

It works for me. Perhaps your binutils is too old? It was added at some
point. Can you try upgrading?

% ./linux/scripts/faddr2line obj/vmlinux schedule+10
schedule+10/0x80:
schedule at arch/x86/include/asm/current.h:15
% addr2line --version
GNU addr2line version 2.27-24.fc26

-Andi
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
[...]

> > > Oh - and talking about "big step forward" - does the 0day robot do any
> > > suspend/resume testing at all?
> >
> > Yes, we do. CC Rui and Aaron on power testing.
>
> yes, we have added suspend/resume test in 0day, including both
> functionality and suspend/resume performance.
> It is not widely run because most of the 0Day testboxes are
> servers/desktops, now we've just added some client laptops as
> testboxes, and will add more in the near future. :)
>
> > > Even on non-laptop hardware, it should be possible to do something like
> > >
> > >    echo platform > /sys/power/pm_test
> > >    echo freeze > /sys/power/state
> > >
> > > or similar (assuming CONFIG_PM_DEBUG is enabled).
>
> yes. I will run native suspend/resume test on laptops and other test
> boxes that really support it, and run suspend/resume test in pm_test
> modes on the others to help us find more issues.

It's a good plan, thanks! Client devices can be much cheaper than
servers. They have more diversity in HW while being more generally
available.

On the other hand, if there are PM functionalities that can be tested
inside QEMU, that would be good to have, since no real HW can be tested
as cheaply and extensively as a large number of VMs.

Thanks,
Fengguang
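The pm_test sequence quoted above can be wrapped with a check that the kernel actually offers the requested modes. The following is a minimal sketch, not something posted in the thread: the function names are hypothetical, and it assumes the usual sysfs layout in which each file lists the supported values separated by spaces and marks the current one with brackets. It needs root and CONFIG_PM_DEBUG to do anything useful.

```shell
#!/bin/sh
# Sketch: run a suspend-to-idle cycle in pm_test "platform" mode, but
# only if both sysfs files advertise the values we are about to write.

# sysfs_has_word FILE WORD
# Succeed if WORD is one of the space-separated entries in FILE. The
# currently selected entry is shown as [word], so brackets are stripped.
sysfs_has_word() {
    [ -r "$1" ] || return 1
    tr -d '[]' < "$1" | tr ' ' '\n' | grep -qx "$2"
}

# pm_test_freeze
# Run one test suspend; the kernel resumes by itself in pm_test modes.
pm_test_freeze() {
    sysfs_has_word /sys/power/pm_test platform ||
        { echo "pm_test not available (is CONFIG_PM_DEBUG set?)"; return 1; }
    sysfs_has_word /sys/power/state freeze ||
        { echo "freeze is not a supported sleep state"; return 1; }
    echo platform > /sys/power/pm_test
    echo freeze > /sys/power/state
    # Return to normal behaviour so that later suspends are real ones.
    echo none > /sys/power/pm_test
}
```

On boxes without pm_test support the function only prints a message and returns non-zero, so a test harness could fall back to skipping the test.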
Re: CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
On Mon, 2017-11-13 at 09:13 +0800, Fengguang Wu wrote:
> CC Andi and more DEBUG_INFO_SPLIT people.
>
> On Sun, Nov 12, 2017 at 11:31:56AM -0800, Linus Torvalds wrote:
> > On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu wrote:
> > >
> > > OK. Here is the original faddr2line output:
> > >
> > >         $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
> > >         vlan_device_event+0x7f5/0xa40:
> > >         vlan_device_event at net/8021q/vlan.h:60
> > >
> > > And below is call trace embedded with full faddr2line output.
> > >
> > > I notice that this trace shows no additional inline files at all.
> > > Is it because I did some kconfig option wrong, so that inline info is
> > > lost? Eg.
> > >
> > >         CONFIG_OPTIMIZE_INLINING=y (it looks better set to N)
> > >         CONFIG_DEBUG_INFO_REDUCED=y
> > >         CONFIG_DEBUG_INFO_SPLIT=y
> >
> > Ok, this annoyed me, so I went back and looked.
> >
> > It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable
> > to see the inlining information,
> >
> > Using OPTIMIZE_INLINING is fine.
>
> Good to know that!
>
> > I'm not sure that addr2line could be made to understand the .dwo files
> > that DEBUG_INFO_SPLIT causes (particularly since we munge the vmlinux
> > file itself, who knows how that could confuse things).
> >
> > So can I ask that you make the 0day build scripts always use
> >
> >   CONFIG_DEBUG_INFO=y
> >   CONFIG_DEBUG_INFO_REDUCED=y
> >   # CONFIG_DEBUG_INFO_SPLIT is not set
> >
> > because with that "DEBUG_INFO_REDUCED=y", the use of DEBUG_INFO_SPLIT
> > shouldn't be _that_ big of a deal.
> >
> > Yes, splitting the debug info does help reduce disk usage for the
> > build, and presumably speed it up a bit too due to less IO and reduced
> > copying of the debug info data, but right now it really makes the
> > debug info much less useful.
>
> Yes DEBUG_INFO_SPLIT helps reduce build cost. Equally importantly,
> it helps cut down the *.ko sizes, which saves boot test cost, too.
> Since in our test scheme, the below modules.cgz will be loaded as part
> of initrd on boot testing. Which will cost memory, and to the lesser
> degree, IO and uncompressing time.
>
> Here is the diff of the modules.cgz size:
>
> Big files under /pkg/linux/x86_64-rhel-7.2+CONFIG_DEBUG_INFO_REDUCED/gcc-6/v4.14-rc7/,
> comparing to +CONFIG_DEBUG_INFO_SPLIT:
>
>         =>54M   135M    modules.cgz
>           7.3M  7.3M    vmlinuz-4.14.0-rc7
>           1.2M  1.2M    linux-headers.cgz
>           7.6M  7.7M    linux-selftests.cgz
>           31M   31M     linux-perf.cgz
>
> Nevertheless, that's machine cost. If DEBUG_INFO_SPLIT hurts our
> ability to analyze bugs, I think the forthright way would be to
> disable it in our tests.
>
> > Just to see the difference:
> >
> >  - with DEBUG_INFO_SPLIT=y
> >
> >     [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
> >     __schedule+0x314/0x840:
> >     __schedule at kernel/sched/stats.h:12
> >
> >  - with DEBUG_INFO_SPLIT is not set
> >
> >     [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
> >     __schedule+0x314/0x840:
> >     rq_sched_info_arrive at kernel/sched/stats.h:12
> >      (inlined by) sched_info_arrive at kernel/sched/stats.h:99
> >      (inlined by) __sched_info_switch at kernel/sched/stats.h:151
> >      (inlined by) sched_info_switch at kernel/sched/stats.h:158
> >      (inlined by) prepare_task_switch at kernel/sched/core.c:2582
> >      (inlined by) context_switch at kernel/sched/core.c:2755
> >      (inlined by) __schedule at kernel/sched/core.c:3366
> >
> > and while (once again) this is a pretty extreme case, we do use a lot
> > of inlines, and gcc will add its own inlining. Getting this whole
> > information - particularly for the faulting IP - would really help in
> > some situations.
> >
> > I love what the 0day robot is doing, this would be another big step
> > forward.
>
> Thank you for the helpful information and appreciations!
> I'll make the change to disable DEBUG_INFO_SPLIT.
>
> > Oh - and talking about "big step forward" - does the 0day robot do any
> > suspend/resume testing at all?
>
> Yes, we do. CC Rui and Aaron on power testing.
>

yes, we have added suspend/resume test in 0day, including both
functionality and suspend/resume performance.
It is not widely run because most of the 0Day testboxes are
servers/desktops, now we've just added some client laptops as
testboxes, and will add more in the near future. :)

> > Even on non-laptop hardware, it should be possible to do something like
> >
> >    echo platform > /sys/power/pm_test
> >    echo freeze > /sys/power/state
> >
> > or similar (assuming CONFIG_PM_DEBUG is enabled).
> >

yes. I will run native suspend/resume test on laptops and other test
boxes that really support it, and run suspend/resume test in pm_test
modes on the others to help us find more issues.

thanks,
rui

> > Maybe you already do something like this?
>
> Rui/Aaron have better knowledge on the current
CONFIG_DEBUG_INFO_SPLIT impacts on faddr2line
CC Andi and more DEBUG_INFO_SPLIT people.

On Sun, Nov 12, 2017 at 11:31:56AM -0800, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu wrote:
> >
> > OK. Here is the original faddr2line output:
> >
> >         $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
> >         vlan_device_event+0x7f5/0xa40:
> >         vlan_device_event at net/8021q/vlan.h:60
> >
> > And below is call trace embedded with full faddr2line output.
> >
> > I notice that this trace shows no additional inline files at all.
> > Is it because I did some kconfig option wrong, so that inline info is
> > lost? Eg.
> >
> >         CONFIG_OPTIMIZE_INLINING=y (it looks better set to N)
> >         CONFIG_DEBUG_INFO_REDUCED=y
> >         CONFIG_DEBUG_INFO_SPLIT=y
>
> Ok, this annoyed me, so I went back and looked.
>
> It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable
> to see the inlining information,
>
> Using OPTIMIZE_INLINING is fine.

Good to know that!

> I'm not sure that addr2line could be made to understand the .dwo files
> that DEBUG_INFO_SPLIT causes (particularly since we munge the vmlinux
> file itself, who knows how that could confuse things).
>
> So can I ask that you make the 0day build scripts always use
>
>   CONFIG_DEBUG_INFO=y
>   CONFIG_DEBUG_INFO_REDUCED=y
>   # CONFIG_DEBUG_INFO_SPLIT is not set
>
> because with that "DEBUG_INFO_REDUCED=y", the use of DEBUG_INFO_SPLIT
> shouldn't be _that_ big of a deal.
>
> Yes, splitting the debug info does help reduce disk usage for the
> build, and presumably speed it up a bit too due to less IO and reduced
> copying of the debug info data, but right now it really makes the
> debug info much less useful.

Yes DEBUG_INFO_SPLIT helps reduce build cost. Equally importantly,
it helps cut down the *.ko sizes, which saves boot test cost, too.
Since in our test scheme, the below modules.cgz will be loaded as part
of initrd on boot testing. Which will cost memory, and to the lesser
degree, IO and uncompressing time.

Here is the diff of the modules.cgz size:

Big files under /pkg/linux/x86_64-rhel-7.2+CONFIG_DEBUG_INFO_REDUCED/gcc-6/v4.14-rc7/,
comparing to +CONFIG_DEBUG_INFO_SPLIT:

        =>54M   135M    modules.cgz
          7.3M  7.3M    vmlinuz-4.14.0-rc7
          1.2M  1.2M    linux-headers.cgz
          7.6M  7.7M    linux-selftests.cgz
          31M   31M     linux-perf.cgz

Nevertheless, that's machine cost. If DEBUG_INFO_SPLIT hurts our
ability to analyze bugs, I think the forthright way would be to
disable it in our tests.

> Just to see the difference:
>
>  - with DEBUG_INFO_SPLIT=y
>
>     [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
>     __schedule+0x314/0x840:
>     __schedule at kernel/sched/stats.h:12
>
>  - with DEBUG_INFO_SPLIT is not set
>
>     [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
>     __schedule+0x314/0x840:
>     rq_sched_info_arrive at kernel/sched/stats.h:12
>      (inlined by) sched_info_arrive at kernel/sched/stats.h:99
>      (inlined by) __sched_info_switch at kernel/sched/stats.h:151
>      (inlined by) sched_info_switch at kernel/sched/stats.h:158
>      (inlined by) prepare_task_switch at kernel/sched/core.c:2582
>      (inlined by) context_switch at kernel/sched/core.c:2755
>      (inlined by) __schedule at kernel/sched/core.c:3366
>
> and while (once again) this is a pretty extreme case, we do use a lot
> of inlines, and gcc will add its own inlining. Getting this whole
> information - particularly for the faulting IP - would really help in
> some situations.
>
> I love what the 0day robot is doing, this would be another big step
> forward.

Thank you for the helpful information and appreciations!
I'll make the change to disable DEBUG_INFO_SPLIT.

> Oh - and talking about "big step forward" - does the 0day robot do any
> suspend/resume testing at all?

Yes, we do. CC Rui and Aaron on power testing.

> Even on non-laptop hardware, it should be possible to do something like
>
>    echo platform > /sys/power/pm_test
>    echo freeze > /sys/power/state
>
> or similar (assuming CONFIG_PM_DEBUG is enabled).
>
> Maybe you already do something like this?

Rui/Aaron have better knowledge on the current status.
It does look like an error-prone area that's worth more testing effort.

> Anyway, regardless this was a good release for the 0day robot. Thanks.

My (and our) pleasure. I'd like to thank you and all the people who
take time to analyze/fix the bugs. It's great to see the long standing
bugs being fixed in mainline -- they have been a big source of noise
that hurt our auto bisect&reporting capabilities.

Regards,
Fengguang
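The configuration requested in this message can be checked before a build is used for bug analysis. The following is a minimal sketch, not taken from the 0day scripts: the function name `check_debug_config` is hypothetical, and it assumes a kernel of roughly the v4.14 era, where `CONFIG_DEBUG_INFO` is a plain boolean option.

```shell
#!/bin/sh
# Sketch: verify that a kernel .config keeps debug info in vmlinux
# (DEBUG_INFO=y) and does not split it into .dwo files.

# check_debug_config [FILE]
# Print "ok" and succeed if FILE (default: .config) has DEBUG_INFO
# enabled and DEBUG_INFO_SPLIT disabled or absent.
check_debug_config() {
    cfg=${1:-.config}
    if ! grep -qx 'CONFIG_DEBUG_INFO=y' "$cfg"; then
        echo "CONFIG_DEBUG_INFO is not enabled"
        return 1
    fi
    if grep -qx 'CONFIG_DEBUG_INFO_SPLIT=y' "$cfg"; then
        echo "CONFIG_DEBUG_INFO_SPLIT is enabled; inline chains may be lost"
        return 1
    fi
    echo "ok"
}
```

To change an existing configuration instead of only checking it, the in-tree helper can be used from the top of the kernel source, for example `scripts/config --file .config --enable DEBUG_INFO --enable DEBUG_INFO_REDUCED --disable DEBUG_INFO_SPLIT`, followed by `make olddefconfig`.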