Re: checking command to parse /usr/bin/nm -B output from gcc object... failed
On 2020-01-08, Martin Liška wrote:
> On 1/7/20 10:40 PM, Nick Bowler wrote:
>> Regardless, $global_symbol_pipe is part of the documented libtool
>> interface, which says you can do:
>>
>>   eval "$NM progname | $global_symbol_pipe"
>>
>> This is obviously busted because the failed configure test leads to
>> global_symbol_pipe='' which will obviously cause problems in this
>> usage (I just tested one of my scripts and yup, it is busted).
>
> Yes, that's what I see for many package failures in openSUSE when I enable
> -fno-common in optimization flags:

Interestingly, I noticed that the dlpreopen support bits seem to
actually have code to handle the case where global_symbol_pipe is empty
(and appear to make an effort to display a big fat warning message that
the feature is unlikely to work, while allowing the user to proceed
anyway).

However, the -export-symbols-regex implementation uses
global_symbol_pipe without any check, so if global_symbol_pipe is empty
you get shell syntax errors when using this feature. In principle a big
fat warning message would be OK for this feature as well, falling back
to a no-op.

The libtool documentation does not seem to mention an empty
global_symbol_pipe as a possibility, so I expect users of this
interface will not check for it (this was the case with my script).

The libtool configure test could be improved so that the test for
global_symbol_pipe does not depend on global_symbol_to_cdecl working.

Cheers,
Nick
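
For scripts that use the documented eval idiom, a guard along these lines avoids the syntax error when the configure test has failed. This is only a minimal sketch: list_global_symbols is a hypothetical helper name, not something libtool provides, and it assumes NM and global_symbol_pipe have already been set from the libtool configuration.

```shell
# Hypothetical helper (not part of libtool). Assumes NM and
# global_symbol_pipe were set from libtool's configuration.
list_global_symbols () {
  # With an empty pipe, eval would see "nm file | " and fail with a
  # shell syntax error, so bail out with a warning instead.
  if test -z "$global_symbol_pipe"; then
    echo "warning: global_symbol_pipe is empty; cannot list symbols" >&2
    return 1
  fi
  # Same shape as the documented idiom, with the file name passed as $1.
  eval "$NM \"\$1\" | $global_symbol_pipe"
}
```

A caller can then treat a non-zero return as "feature unavailable" and fall back to a no-op, much as the dlpreopen code appears to do.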
Re: checking command to parse /usr/bin/nm -B output from gcc object... failed
On 1/8/20 10:16 AM, Martin Liška wrote:
> On 1/7/20 9:47 PM, Nick Bowler wrote:
>> On 1/7/20, Martin Liška wrote:
>>> nm -B detection fails to be detected with -flto and -fno-common CFLAGS:
>>>
>>> configure:6307: checking command to parse /usr/bin/nm -B output from gcc object
>>> [...]
>>> configure:6536: gcc -o conftest -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -fno-common -flto -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -fno-common -flto conftest.c conftstm.o >&5
>>> conftest.c:18:12: error: variable 'nm_test_var' redeclared as function
>>>    18 | extern int nm_test_var();
>>>       |            ^
>>> conftest.c:4:6: note: previously declared here
>>>     4 | relocations are performed -- see ld's documentation on pseudo-relocs. */
>>>       |      ^
>>> lto1: fatal error: errors during merging of translation units
>>> compilation terminated.
>>>
>>> As seen, I bet problem is in conftstm.o file (for which I can't see
>>> how it's created). Probably the file is missing declaration of
>>> nm_test_var and so that it's implicitly deduced to be
>>> int nm_test_var().
>>
>> nm_test_var is defined as:
>>
>>   char nm_test_var;
>>
>> The test works by running nm on this file, parsing the output, and
>> then generating a C file which inserts references to all the exported
>> symbols it finds. Then this is linked with the original file.
>>
>> With a "normal" compiler and -fno-common nm_test_var goes in BSS and
>> will be marked "B" in the nm output.
>
> Hello.
>
>> However, LTO breaks nm really badly and with -fno-common this
>> variable gets marked as "T" in the nm output.
>
> Thank you for identification of the root cause. I've just created a nm
> issue for that:
> https://sourceware.org/bugzilla/show_bug.cgi?id=25355

So apparently it's a known limitation of the LTO plugin. The question is
whether we can somehow work around that.

Martin

> That will fix the problem, so let's see.
>
> Thanks,
> Martin
>
>> So it is indistinguishable from functions and when the C file is
>> generated, a function declaration for nm_test_var is emitted (if it
>> was correctly marked "B", then a variable declaration will be
>> emitted).
>>
>> It's really unfortunate that LTO breaks nm in this way. But even if
>> this configure test didn't fail I suspect subsequent users of
>> $global_symbol_pipe will expect nm to work properly and it won't.
>>
>> I'm not 100% sure which libtool features will be affected by this
>> configuration failure. It doesn't fatally stop the configure script.
>> Probably dlpreopen won't work at all?
>>
>> It's also unfortunate that since there is no way to directly
>> reference symbol values in standard C, a common way to do so is with
>> dummy array or function declarations, and lo and behold LTO
>> apparently breaks this too...
>>
>> Cheers,
>> Nick
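
To make the failure mode in the quoted explanation concrete, here is a rough sketch of how declarations can be derived from the nm type letter. It is illustrative only and is not libtool's actual global_symbol_to_cdecl expression: symbols_to_cdecl is a hypothetical name, the "extern char" form for data symbols is an assumption, and nm_test_func is just an example symbol name.

```shell
# Illustrative only; not libtool's real global_symbol_to_cdecl.
# Reads BSD-style nm lines ("<address> <type> <name>") on stdin and
# prints one C declaration per symbol, chosen by the type letter.
symbols_to_cdecl () {
  while read -r addr type name; do
    case $type in
      # Text-section symbols are assumed to be functions.
      T) printf 'extern int %s();\n' "$name" ;;
      # Data-like symbols (assumed set of type letters) become variables.
      B|C|D|R) printf 'extern char %s;\n' "$name" ;;
    esac
  done
}
```

Feeding it "0000000000000000 B nm_test_var" prints "extern char nm_test_var;", while the same symbol reported as "T" prints "extern int nm_test_var();", which is the declaration the compiler rejects in the log above.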
Re: checking command to parse /usr/bin/nm -B output from gcc object... failed
On 1/7/20 10:40 PM, Nick Bowler wrote:
> On 1/7/20, Bob Friesenhahn wrote:
>> On Tue, 7 Jan 2020, Nick Bowler wrote:
>>> On 1/7/20, Martin Liška wrote:
>>>> nm -B detection fails to be detected with -flto and -fno-common CFLAGS:
>>
>> I don't know what vintage this documentation is (the copyright says
>> it is from 2020 so it seems to be the latest), but the page at
>> https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html says this
>> about "-fcommon"
>>
>> "The default is -fno-common, which specifies..."
>>
>> GCC 9.2 documentation says that the default is target dependent,
>> which suggests that some targets use no-common by default.
>
> I think the fact that this test produces a common symbol most of the
> time, and that nm happens to work under LTO in this specific case, is
> mostly just a happy accident.

Well, nm is using the LTO plugin, so it should properly communicate
symbol types (in an ideal world) :)

>>> I'm not 100% sure which libtool features will be affected by this
>>> configuration failure. It doesn't fatally stop the configure script.
>>> Probably dlpreopen won't work at all?
>>
>> Are there many users of dlpreopen()?
>
> I imagine there are users of -dlopen, which is supposed to
> automatically fall back to dlpreopen when shared library support is
> not available (for example, if the user configures the package with
> --disable-shared). Whether or not developers routinely test that
> their packages work with shared libraries disabled is another matter.
>
> Regardless, $global_symbol_pipe is part of the documented libtool
> interface, which says you can do:
>
>   eval "$NM progname | $global_symbol_pipe"
>
> This is obviously busted because the failed configure test leads to
> global_symbol_pipe='' which will obviously cause problems in this
> usage (I just tested one of my scripts and yup, it is busted).

Yes, that's what I see for many package failures in openSUSE when I enable
-fno-common in optimization flags:

> But more importantly I suspect the actual busted feature is
> $global_symbol_to_cdecl, which is supposed to produce declarations
> for the symbols you get from global_symbol_pipe. This is clearly not
> working under LTO as it fails to distinguish functions and variables.
>
> It might be possible to detect this case in configure and come up
> with a symbol declaration that works for both functions and data,
> which might enable global_symbol_to_cdecl to generate working
> declarations, and would probably fix this configure test and typical
> usage scenarios like dlpreopen.
>
>>> It's also unfortunate that since there is no way to directly
>>> reference symbol values in standard C, a common way to do so is with
>>> dummy array or function declarations, and lo and behold LTO
>>> apparently breaks this too...
>>
>> LTO often causes strange issues. It needs to be used with care. Thus
>> far I have seen LTO reduce the output executable size (sometimes
>> substantially if there is a lot of "dead" code) but I have not seen a
>> speed benefit to properly written code.
>
> When I last played around with LTO on my C code I was hoping to
> achieve reduced executable size but I found the results to be almost
> exactly the same as what I was already getting by compiling
> everything with -ffunction-sections -fdata-sections and then linking
> with -Wl,--gc-sections. And unlike LTO, those options don't break nm
> which would have required a massive amount of futzing with the build
> system to get things to even work.

I can provide some quite interesting numbers on the use of LTO (ideally
combined with PGO):
http://hubicka.blogspot.com/2019/05/gcc-9-link-time-and-inter-procedural.html

Or if you want to compare SPEC numbers:
https://lnt.opensuse.org/db_default/v4/SPEC/spec_report/branch

In both scenarios LTO brings both a speedup and a size reduction.
And note that we enabled LTO in openSUSE Tumbleweed by default.

Martin

> Cheers,
> Nick
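
The "detect this case in configure" idea quoted above could start from a check like the following. This is a hypothetical sketch, not something libtool's configure test does today: nm_marks_data_as_text is a made-up name, and it only inspects nm output that the caller supplies on stdin.

```shell
# Hypothetical check (not part of libtool's configure test).
# Reads BSD-style nm output on stdin and succeeds if the data symbol
# nm_test_var is reported with the text ("T") type code.
nm_marks_data_as_text () {
  grep ' T nm_test_var$' >/dev/null
}
```

A configure fragment might run it as `$NM conftest.o | nm_marks_data_as_text` (the object file name here is assumed) and, on success, switch to a declaration form that does not depend on the type letter.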
Re: checking command to parse /usr/bin/nm -B output from gcc object... failed
On 1/7/20 10:07 PM, Bob Friesenhahn wrote:
> On Tue, 7 Jan 2020, Nick Bowler wrote:
>> On 1/7/20, Martin Liška wrote:
>>> nm -B detection fails to be detected with -flto and -fno-common CFLAGS:
>
> I don't know what vintage this documentation is (the copyright says
> it is from 2020 so it seems to be the latest), but the page at
> https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html says this
> about "-fcommon"

Yes, this one is for the current master, and we always document the
option that is _NOT_ the default. That's why you'll see -fno-common
documented in the GCC 9.2.0 manual:
https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Code-Gen-Options.html

Martin

> "The default is -fno-common, which specifies..."
>
> GCC 9.2 documentation says that the default is target dependent,
> which suggests that some targets use no-common by default.
>
>> I'm not 100% sure which libtool features will be affected by this
>> configuration failure. It doesn't fatally stop the configure script.
>> Probably dlpreopen won't work at all?
>
> Are there many users of dlpreopen()?
>
>> It's also unfortunate that since there is no way to directly
>> reference symbol values in standard C, a common way to do so is with
>> dummy array or function declarations, and lo and behold LTO
>> apparently breaks this too...
>
> LTO often causes strange issues. It needs to be used with care. Thus
> far I have seen LTO reduce the output executable size (sometimes
> substantially if there is a lot of "dead" code) but I have not seen a
> speed benefit to properly written code.
>
> Bob
Re: checking command to parse /usr/bin/nm -B output from gcc object... failed
On 1/7/20 9:47 PM, Nick Bowler wrote:
> On 1/7/20, Martin Liška wrote:
>> nm -B detection fails to be detected with -flto and -fno-common CFLAGS:
>>
>> configure:6307: checking command to parse /usr/bin/nm -B output from gcc object
>> [...]
>> configure:6536: gcc -o conftest -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -fno-common -flto -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -fno-common -flto conftest.c conftstm.o >&5
>> conftest.c:18:12: error: variable 'nm_test_var' redeclared as function
>>    18 | extern int nm_test_var();
>>       |            ^
>> conftest.c:4:6: note: previously declared here
>>     4 | relocations are performed -- see ld's documentation on pseudo-relocs. */
>>       |      ^
>> lto1: fatal error: errors during merging of translation units
>> compilation terminated.
>>
>> As seen, I bet problem is in conftstm.o file (for which I can't see
>> how it's created). Probably the file is missing declaration of
>> nm_test_var and so that it's implicitly deduced to be
>> int nm_test_var().
>
> nm_test_var is defined as:
>
>   char nm_test_var;
>
> The test works by running nm on this file, parsing the output, and
> then generating a C file which inserts references to all the exported
> symbols it finds. Then this is linked with the original file.
>
> With a "normal" compiler and -fno-common nm_test_var goes in BSS and
> will be marked "B" in the nm output.

Hello.

> However, LTO breaks nm really badly and with -fno-common this
> variable gets marked as "T" in the nm output.

Thank you for identification of the root cause. I've just created a nm
issue for that:
https://sourceware.org/bugzilla/show_bug.cgi?id=25355

That will fix the problem, so let's see.

Thanks,
Martin

> So it is indistinguishable from functions and when the C file is
> generated, a function declaration for nm_test_var is emitted (if it
> was correctly marked "B", then a variable declaration will be
> emitted).
>
> It's really unfortunate that LTO breaks nm in this way. But even if
> this configure test didn't fail I suspect subsequent users of
> $global_symbol_pipe will expect nm to work properly and it won't.
>
> I'm not 100% sure which libtool features will be affected by this
> configuration failure. It doesn't fatally stop the configure script.
> Probably dlpreopen won't work at all?
>
> It's also unfortunate that since there is no way to directly
> reference symbol values in standard C, a common way to do so is with
> dummy array or function declarations, and lo and behold LTO
> apparently breaks this too...
>
> Cheers,
> Nick