Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-07 Thread Jeff Law
On Tue, 2020-08-04 at 13:16 +0100, Daniel P. Berrangé wrote:
> On Tue, Aug 04, 2020 at 12:02:05PM +0200, Florian Weimer wrote:
> > * Daniel P. Berrangé:
> > 
> > > Taken from https://koji.fedoraproject.org/koji/taskinfo?taskID=48525923
> > 
> > Sorry, what would be more interesting is the linker invocation.  The
> > build log does not show this, only the libtool invocation.  We don't
> > really know what kind of transformations libtool does in this case.
> 
> Upstream libvirt has just yesterday replaced use of autotools with
> meson. I just tried a Fedora rawhide build of our new meson based
> code and it succeeded with LTO.
> 
> Given this, I think I'm fine just disabling LTO in rawhide for the
> current libvirt release, with the expectation we'll re-enable LTO
> in ~1 month time when we import the meson based release of libvirt.
> 
> IOW lets not waste any more time debugging this LTO / LD_PRELOAD
> problem with libvirt.
That works for me.  If you haven't already committed the change, please make a
quite note about the planned move to meson and reenablement of LTO at that point
so I don't dig any further when I review all the opted-out packages.  I'm likely
to forget the decision by the time I'm reviewing the opt-outs.


> 
> > libtool is really not built for LTO, and it really should not be used on
> > GNU systems.  But I understand that this is not uncontroversial.
> There's oooh so many problems with libtool we've hit over the years,
> especially with it re-arranging order of compiler/linker flags, and
> so I'm beyond ecstatic that we've finally thrown it away for libvirt
> in favour of meson.
> 
> There may have been a time & place for libtool and autotools in
> general, but that time has passed
Yea, I find libtool amazingly painful and the primary motivation for it has long
since become a minor issue rather than major one (seriously who cares about
dynamic linking on hpux, sco, irix, etc) anymore.  But killing it I'm sure will
be difficult.

jeff
> 
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-04 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 12:02:05PM +0200, Florian Weimer wrote:
> * Daniel P. Berrangé:
> 
> > Taken from https://koji.fedoraproject.org/koji/taskinfo?taskID=48525923
> 
> Sorry, what would be more interesting is the linker invocation.  The
> build log does not show this, only the libtool invocation.  We don't
> really know what kind of transformations libtool does in this case.

Upstream libvirt has just yesterday replaced use of autotools with
meson. I just tried a Fedora rawhide build of our new meson based
code and it succeeded with LTO.

Given this, I think I'm fine just disabling LTO in rawhide for the
current libvirt release, with the expectation we'll re-enable LTO
in ~1 month time when we import the meson based release of libvirt.

IOW lets not waste any more time debugging this LTO / LD_PRELOAD
problem with libvirt.

> libtool is really not built for LTO, and it really should not be used on
> GNU systems.  But I understand that this is not uncontroversial.

There's oooh so many problems with libtool we've hit over the years,
especially with it re-arranging order of compiler/linker flags, and
so I'm beyond ecstatic that we've finally thrown it away for libvirt
in favour of meson.

There may have been a time & place for libtool and autotools in
general, but that time has passed

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-04 Thread Florian Weimer
* Daniel P. Berrangé:

> Taken from https://koji.fedoraproject.org/koji/taskinfo?taskID=48525923

Sorry, what would be more interesting is the linker invocation.  The
build log does not show this, only the libtool invocation.  We don't
really know what kind of transformations libtool does in this case.

libtool is really not built for LTO, and it really should not be used on
GNU systems.  But I understand that this is not uncontroversial.

Thanks,
Florian
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Jeff Law
On Mon, 2020-08-03 at 17:26 +0100, Daniel P. Berrangé wrote:
> On Mon, Aug 03, 2020 at 05:34:47PM +0200, Hans de Goede wrote:
> > Hi,
> > 
> > On 8/3/20 5:27 PM, Daniel P. Berrangé wrote:
> > > On Mon, Aug 03, 2020 at 05:01:18PM +0200, Florian Weimer wrote:
> > > > * Daniel P. Berrangé:
> > > > 
> > > > > Disabling LTO in the RPM spec confirms this and makes things pass
> > > > > again. Hacking the makefiles to remove the -fno-lto option when
> > > > > building the test suite binaries also fixes things.
> > > > > 
> > > > > I don't see any mention of LD_PRELOAD being impacted by LTO in the
> > > > > Fedora feature change page, but I can imagine how it would be.
> > > > 
> > > > LTO should still export the same functions as before, and should not
> > > > imply -fno-semantic-interposition by default.  The linker plugin
> > > > provides the necessary information to GCC.  What you are observing could
> > > > be the result of a toolchain bug.
> > > 
> > > Libvirt has a test program "qemuhotplugtest".
> > > 
> > > This test links to a shared library  libqemutestdriver.so which contains
> > > a function "qemuProcessStartManagedPRDaemon".
> > > 
> > > qemuhotplugtest test does not call "qemuProcessStartManagedPRDaemon"
> > > directly. It invokes "qemuDomainAttachDeviceDiskLive" which eventually
> > > ends up calling "qemuProcessStartManagedPRDaemon" some way further
> > > down the stack.
> > > 
> > > 
> > > Then there is a shared library libqemuhotplugmock.so which contains a
> > > replacement "qemuProcessStartManagedPRDaemon" to avoid us spawning
> > > external programs.
> > > 
> > > When it starts "qemuhotplugtest" will set LD_PRELOAD=libqemuhotplugmock.so
> > > and then execve() itself.
> > > 
> > > So when the test runs, the "qemuProcessStartManagedPRDaemon" impl from
> > > the mock library is supposed to be used.
> > > 
> > > If I run with LD_DEBUG=all on a build /without/ LTO, I can see this lookup
> > > and override happening:
> > > 
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> > >  [0]
> > >  381018:  binding file 
> > > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> > >  [0] to 
> > > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> > >  [0]: normal symbol `qemuProcessStartManagedPRDaemon'
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirhostdevmock.so
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirpcimock.so
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libdomaincapsmock.so
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirprocessmock.so
> > >  [0]
> > >  381018:  symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
> > >  [0]
> > >  381018:  binding file 
> > > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> > >  [0] to 
> > > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
> > >  [0]: normal symbol `qemuProcessStartManagedPRDaemon'
> > > 
> > > 
> > > If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
> > > at all for qemuProcessStartManagedPRDaemon. It looks very much like the
> > > call was resolved and bound at link time when built with LTO.
> > 
> > Maybe it was not bound at link time, but inlined at compile time?
> > 
> > I think it might be worthwhile to try and mark the 
> > qemuProcessStartManagedPRDaemon
> > implementation which is used normally (no LD_PRELOAD) with some function
> > attribute that it may be never inlined. I'm sure Florian and/or Jakub
> > can help with what that function attribute should actually look like...
> 
> We usually mark APIs we mock with G_GNUC_NO_INLINE to prevent such
> problems. In this case we forgot

Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Daniel P . Berrangé
On Mon, Aug 03, 2020 at 05:40:42PM +0200, Florian Weimer wrote:
> * Daniel P. Berrangé:
> 
> > If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
> > at all for qemuProcessStartManagedPRDaemon. It looks very much like the
> > call was resolved and bound at link time when built with LTO.
> 
> It's possible that the symbol extraction logic is confused by LTO, but
> -ffat-lto-objects should prevent that.
> 
> Can you collect all the linker input scripts/command lines after libtool
> has done its thing?

For the qemuhotplugtest, this should be the gcc line it is running:

gcc -DHAVE_CONFIG_H -I. -I../../tests -I..  -I.. -I../.. -I../include 
-I../../include -I../src -I../../src -I../../src/util -I../../src/conf 
-I../../src/hypervisor -I../src/rpc   
-Dabs_builddir="\"/builddir/build/BUILD/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests\""
 
-Dabs_top_builddir="\"/builddir/build/BUILD/libvirt-6.5.0/x86_64-redhat-linux-gnu\""
 
-Dabs_srcdir="\"/builddir/build/BUILD/libvirt-6.5.0/x86_64-redhat-linux-gnu/../tests\""
 
-Dabs_top_srcdir="\"/builddir/build/BUILD/libvirt-6.5.0/x86_64-redhat-linux-gnu/..\""
 -I/usr/include/libxml2  -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include 
-pthread -I/usr/include/libmount -I/usr/include/blkid  -I/usr/include/libnl3  
-I/usr/include/libnl3  -I/usr/include/p11-kit-1   -I/usr/include/tirpc 
-fno-common -W -Wabsolute-value -Waddress -Waddress-of-packed-member 
-Waggressive-loop-optimizations -Wall -Wattribute-warning -Wattributes 
-Wbool-compare -Wbool-operation -Wbuiltin-declaration-mismatch 
-Wbuiltin-macro-redefined -Wcannot-profile -Wcast-align -Wcast-align=strict 
-Wcast-function-type -Wchar-subscripts -Wclobbered -Wcomment -Wcomments 
-Wcoverage-mismatch -Wcpp -Wdangling-else -Wdate-time -Wdeprecated-declarations 
-Wdesignated-init -Wdiscarded-array-qualifiers -Wdiscarded-qualifiers 
-Wdiv-by-zero -Wdouble-promotion -Wduplicated-cond -Wduplicate-decl-specifier 
-Wempty-body -Wendif-labels -Wexpansion-to-defined -Wextra 
-Wformat-contains-nul -Wformat-extra-args -Wformat-nonliteral -Wformat-security 
-Wformat-y2k -Wformat-zero-length -Wframe-address -Wfree-nonheap-object -Whsa 
-Wif-not-aligned -Wignored-attributes -Wignored-qualifiers -Wimplicit 
-Wimplicit-function-declaration -Wimplicit-int -Wincompatible-pointer-types 
-Winit-self -Winline -Wint-conversion -Wint-in-bool-context 
-Wint-to-pointer-cast -Winvalid-memory-model -Winvalid-pch 
-Wlogical-not-parentheses -Wlogical-op -Wmain -Wmaybe-uninitialized 
-Wmemset-elt-size -Wmemset-transposed-args -Wmisleading-indentation 
-Wmissing-attributes -Wmissing-braces -Wmissing-declarations 
-Wmissing-field-initializers -Wmissing-include-dirs -Wmissing-parameter-type 
-Wmissing-profile -Wmissing-prototypes -Wmultichar -Wmultistatement-macros 
-Wnarrowing -Wnested-externs -Wnonnull -Wnonnull-compare -Wnull-dereference 
-Wodr -Wold-style-declaration -Wold-style-definition -Wopenmp-simd -Woverflow 
-Woverride-init -Wpacked-bitfield-compat -Wpacked-not-aligned -Wparentheses 
-Wpointer-arith -Wpointer-compare -Wpointer-sign -Wpointer-to-int-cast 
-Wpragmas -Wpsabi -Wrestrict -Wreturn-local-addr -Wreturn-type 
-Wscalar-storage-order -Wsequence-point -Wshadow -Wshift-count-negative 
-Wshift-count-overflow -Wshift-negative-value -Wsizeof-array-argument 
-Wsizeof-pointer-div -Wsizeof-pointer-memaccess -Wstrict-aliasing 
-Wstrict-prototypes -Wstringop-truncation -Wsuggest-attribute=cold 
-Wsuggest-attribute=const -Wsuggest-attribute=format 
-Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wsuggest-final-methods 
-Wsuggest-final-types -Wswitch -Wswitch-bool -Wswitch-unreachable -Wsync-nand 
-Wtautological-compare -Wtrampolines -Wtrigraphs -Wtype-limits -Wuninitialized 
-Wunknown-pragmas -Wunused -Wunused-but-set-parameter -Wunused-but-set-variable 
-Wunused-function -Wunused-label -Wunused-local-typedefs -Wunused-parameter 
-Wunused-result -Wunused-value -Wunused-variable -Wvarargs -Wvariadic-macros 
-Wvector-operation-performance -Wvla -Wvolatile-register-var -Wwrite-strings 
-Walloc-size-larger-than=9223372036854775807 -Warray-bounds=2 
-Wattribute-alias=2 -Wformat-overflow=2 -Wformat-truncation=2 
-Wimplicit-fallthrough=5 -Wnormalized=nfc -Wshift-overflow=2 
-Wstringop-overflow=2 -Wunused-const-variable=2 -Wvla-larger-than=4031 
-Wno-sign-compare -Wno-cast-function-type -Wjump-misses-init -Wswitch-enum 
-Wno-format-nonliteral -Wno-format-truncation -Wframe-larger-than=4096 
-fstack-protector-strong -fexceptions -fasynchronous-unwind-tables 
-fipa-pure-const -Wno-suggest-attribute=pure -Wno-suggest-attribute=const 
-std=gnu99 -Wframe-larger-than=262144 -O2 -flto=auto -ffat-lto-objects 
-fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security 
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS 
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong 
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -mtune=generic 
-fasynchronous-unwind-tables -fstack-clash-protection -fcf-

Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Daniel P . Berrangé
On Mon, Aug 03, 2020 at 05:34:47PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 8/3/20 5:27 PM, Daniel P. Berrangé wrote:
> > On Mon, Aug 03, 2020 at 05:01:18PM +0200, Florian Weimer wrote:
> > > * Daniel P. Berrangé:
> > > 
> > > > Disabling LTO in the RPM spec confirms this and makes things pass
> > > > again. Hacking the makefiles to remove the -fno-lto option when
> > > > building the test suite binaries also fixes things.
> > > > 
> > > > I don't see any mention of LD_PRELOAD being impacted by LTO in the
> > > > Fedora feature change page, but I can imagine how it would be.
> > > 
> > > LTO should still export the same functions as before, and should not
> > > imply -fno-semantic-interposition by default.  The linker plugin
> > > provides the necessary information to GCC.  What you are observing could
> > > be the result of a toolchain bug.
> > 
> > Libvirt has a test program "qemuhotplugtest".
> > 
> > This test links to a shared library  libqemutestdriver.so which contains
> > a function "qemuProcessStartManagedPRDaemon".
> > 
> > qemuhotplugtest test does not call "qemuProcessStartManagedPRDaemon"
> > directly. It invokes "qemuDomainAttachDeviceDiskLive" which eventually
> > ends up calling "qemuProcessStartManagedPRDaemon" some way further
> > down the stack.
> > 
> > 
> > Then there is a shared library libqemuhotplugmock.so which contains a
> > replacement "qemuProcessStartManagedPRDaemon" to avoid us spawning
> > external programs.
> > 
> > When it starts "qemuhotplugtest" will set LD_PRELOAD=libqemuhotplugmock.so
> > and then execve() itself.
> > 
> > So when the test runs, the "qemuProcessStartManagedPRDaemon" impl from
> > the mock library is supposed to be used.
> > 
> > If I run with LD_DEBUG=all on a build /without/ LTO, I can see this lookup
> > and override happening:
> > 
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> >  [0]
> >  381018:binding file 
> > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> >  [0] to 
> > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> >  [0]: normal symbol `qemuProcessStartManagedPRDaemon'
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirhostdevmock.so
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirpcimock.so
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libdomaincapsmock.so
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirprocessmock.so
> >  [0]
> >  381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
> > file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
> >  [0]
> >  381018:binding file 
> > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
> >  [0] to 
> > /home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
> >  [0]: normal symbol `qemuProcessStartManagedPRDaemon'
> > 
> > 
> > If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
> > at all for qemuProcessStartManagedPRDaemon. It looks very much like the
> > call was resolved and bound at link time when built with LTO.
> 
> Maybe it was not bound at link time, but inlined at compile time?
> 
> I think it might be worthwhile to try and mark the 
> qemuProcessStartManagedPRDaemon
> implementation which is used normally (no LD_PRELOAD) with some function
> attribute that it may be never inlined. I'm sure Florian and/or Jakub
> can help with what that function attribute should actually look like...

We usually mark APIs we mock with G_GNUC_NO_INLINE to prevent such
problems. In this case we forgot to mark qemuProcessStartManagedPRDaemon
but it doesn't actually make a difference to the behaviour if we add the
missing G_GNUC_NO_INLINE annotation. I think the method impl is large enough
that the compiler would not consider

Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Florian Weimer
* Daniel P. Berrangé:

> If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
> at all for qemuProcessStartManagedPRDaemon. It looks very much like the
> call was resolved and bound at link time when built with LTO.

It's possible that the symbol extraction logic is confused by LTO, but
-ffat-lto-objects should prevent that.

Can you collect all the linker input scripts/command lines after libtool
has done its thing?

Thanks,
Florian
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Hans de Goede

Hi,

On 8/3/20 5:27 PM, Daniel P. Berrangé wrote:

On Mon, Aug 03, 2020 at 05:01:18PM +0200, Florian Weimer wrote:

* Daniel P. Berrangé:


Disabling LTO in the RPM spec confirms this and makes things pass
again. Hacking the makefiles to remove the -fno-lto option when
building the test suite binaries also fixes things.

I don't see any mention of LD_PRELOAD being impacted by LTO in the
Fedora feature change page, but I can imagine how it would be.


LTO should still export the same functions as before, and should not
imply -fno-semantic-interposition by default.  The linker plugin
provides the necessary information to GCC.  What you are observing could
be the result of a toolchain bug.


Libvirt has a test program "qemuhotplugtest".

This test links to a shared library  libqemutestdriver.so which contains
a function "qemuProcessStartManagedPRDaemon".

qemuhotplugtest test does not call "qemuProcessStartManagedPRDaemon"
directly. It invokes "qemuDomainAttachDeviceDiskLive" which eventually
ends up calling "qemuProcessStartManagedPRDaemon" some way further
down the stack.


Then there is a shared library libqemuhotplugmock.so which contains a
replacement "qemuProcessStartManagedPRDaemon" to avoid us spawning
external programs.

When it starts "qemuhotplugtest" will set LD_PRELOAD=libqemuhotplugmock.so
and then execve() itself.

So when the test runs, the "qemuProcessStartManagedPRDaemon" impl from
the mock library is supposed to be used.

If I run with LD_DEBUG=all on a build /without/ LTO, I can see this lookup
and override happening:

 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0]
 381018:binding file 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0] to 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0]: normal symbol `qemuProcessStartManagedPRDaemon'
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirhostdevmock.so
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirpcimock.so
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libdomaincapsmock.so
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirprocessmock.so
 [0]
 381018:symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
 [0]
 381018:binding file 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0] to 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
 [0]: normal symbol `qemuProcessStartManagedPRDaemon'


If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
at all for qemuProcessStartManagedPRDaemon. It looks very much like the
call was resolved and bound at link time when built with LTO.


Maybe it was not bound at link time, but inlined at compile time?

I think it might be worthwhile to try and mark the 
qemuProcessStartManagedPRDaemon
implementation which is used normally (no LD_PRELOAD) with some function
attribute that it may be never inlined. I'm sure Florian and/or Jakub
can help with what that function attribute should actually look like...

Regards,

Hans
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Daniel P . Berrangé
On Mon, Aug 03, 2020 at 05:01:18PM +0200, Florian Weimer wrote:
> * Daniel P. Berrangé:
> 
> > Disabling LTO in the RPM spec confirms this and makes things pass
> > again. Hacking the makefiles to remove the -fno-lto option when
> > building the test suite binaries also fixes things.
> >
> > I don't see any mention of LD_PRELOAD being impacted by LTO in the
> > Fedora feature change page, but I can imagine how it would be.
> 
> LTO should still export the same functions as before, and should not
> imply -fno-semantic-interposition by default.  The linker plugin
> provides the necessary information to GCC.  What you are observing could
> be the result of a toolchain bug.

Libvirt has a test program "qemuhotplugtest".

This test links to a shared library  libqemutestdriver.so which contains
a function "qemuProcessStartManagedPRDaemon".

qemuhotplugtest test does not call "qemuProcessStartManagedPRDaemon"
directly. It invokes "qemuDomainAttachDeviceDiskLive" which eventually
ends up calling "qemuProcessStartManagedPRDaemon" some way further
down the stack.


Then there is a shared library libqemuhotplugmock.so which contains a
replacement "qemuProcessStartManagedPRDaemon" to avoid us spawning
external programs.

When it starts "qemuhotplugtest" will set LD_PRELOAD=libqemuhotplugmock.so
and then execve() itself.

So when the test runs, the "qemuProcessStartManagedPRDaemon" impl from
the mock library is supposed to be used.

If I run with LD_DEBUG=all on a build /without/ LTO, I can see this lookup
and override happening:

381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0]
381018: binding file 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0] to 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0]: normal symbol `qemuProcessStartManagedPRDaemon'
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/lt-qemuhotplugtest
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirhostdevmock.so
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirpcimock.so
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libdomaincapsmock.so
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libvirprocessmock.so
 [0]
381018: symbol=qemuProcessStartManagedPRDaemon;  lookup in 
file=/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
 [0]
381018: binding file 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemutestdriver.so
 [0] to 
/home/berrange/src/fedora/libvirt/libvirt-6.5.0/x86_64-redhat-linux-gnu/tests/.libs/libqemuhotplugmock.so
 [0]: normal symbol `qemuProcessStartManagedPRDaemon'


If I run LD_DEBUG=all on a build /with/ LTO, there are no symbol lookups
at all for qemuProcessStartManagedPRDaemon. It looks very much like the
call was resolved and bound at link time when built with LTO.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Jakub Jelinek
On Mon, Aug 03, 2020 at 05:01:18PM +0200, Florian Weimer wrote:
> * Daniel P. Berrangé:
> 
> > Disabling LTO in the RPM spec confirms this and makes things pass
> > again. Hacking the makefiles to remove the -fno-lto option when
> > building the test suite binaries also fixes things.
> >
> > I don't see any mention of LD_PRELOAD being impacted by LTO in the
> > Fedora feature change page, but I can imagine how it would be.
> 
> LTO should still export the same functions as before, and should not
> imply -fno-semantic-interposition by default.  The linker plugin
> provides the necessary information to GCC.  What you are observing could
> be the result of a toolchain bug.

Yeah (unless it is Clang which only supports -fno-semantic-interposition and
not anything else.  LD_PRELOAD will simply not work with it).

Jakub
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Florian Weimer
* Daniel P. Berrangé:

> Disabling LTO in the RPM spec confirms this and makes things pass
> again. Hacking the makefiles to remove the -fno-lto option when
> building the test suite binaries also fixes things.
>
> I don't see any mention of LD_PRELOAD being impacted by LTO in the
> Fedora feature change page, but I can imagine how it would be.

LTO should still export the same functions as before, and should not
imply -fno-semantic-interposition by default.  The linker plugin
provides the necessary information to GCC.  What you are observing could
be the result of a toolchain bug.

Thanks,
Florian
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


LTO vs LD_PRELOAD (libvirt FTBFS test suite failure)

2020-08-03 Thread Daniel P . Berrangé
I'm trying to understand failures in the libvirt test suite since the
Fedora rawhide mass rebuild.

Our test suite makes extensive use of mocking to replace functions in
the library being tested. We do this either by loading a LD_PRELOAD,
or by having the test program define a symbol with the same name as
the one in the library to replace it. It appears this is being
broken by LTO

Disabling LTO in the RPM spec confirms this and makes things pass
again. Hacking the makefiles to remove the -fno-lto option when
building the test suite binaries also fixes things.

I don't see any mention of LD_PRELOAD being impacted by LTO in the
Fedora feature change page, but I can imagine how it would be.

What is still confusing me is that 40+ of our test programs rely
on LD_PRELOAD, but only one of them actually broke from LTO. It
seems the LTO is inconsistent is how it affects the test binaries
in some way.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org