On 11/5/20 8:24 AM, Richard W.M. Jones wrote:
> We discovered a few days ago that LTO broke qemu on aarch64.
>
> The original bug reported was:
>
>   https://bugzilla.redhat.com/show_bug.cgi?id=1893892
>
> But actually looking at the build.log[1] we see another assertion
> failure in the test suite.  (Unfortunately although we run the test
> suite, the spec file was ignoring the result so the broken build
> escaped into Rawhide.)
>
> Because qemu is a complicated piece of software we're not clear if the
> bugs found are general bugs in LTO, bugs which are specific to LTO on
> aarch64, or bugs in qemu which are exposed by optimizations made
> possible by LTO.
>
> One thing we do suspect is that this could be the tip of the iceberg
> since the qemu test suite only tests a tiny fraction of the code.
>
> LTO has been disabled across all arches for now.  See the list of
> latest commits that Dan has added:
>
> https://src.fedoraproject.org/rpms/qemu/commits/master

Right.  And it's on my list to investigate (probably sometime in early
2021 given other pressing commitments).  FWIW, there are ~150 LTO opt-outs
on that list to investigate.  While I don't think we need to fix &
enable everything, I do want to thoroughly *understand* every issue and
get appropriate bugs filed.  FWIW, I have a Fedora spec file scanner
which notes all the opted-out packages so I know what needs investigation.
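
For anyone wanting to do something similar, a rough sketch of such a check
follows.  This is not the actual scanner mentioned above.  It assumes the
opt-out takes the form of a %define or %global that sets _lto_cflags to
%{nil}, and the function name spec_opts_out_of_lto is made up for the
example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the spec text appears to opt out
// of LTO.  Assumption: the opt-out is a %define/%global line that sets
// _lto_cflags to %{nil}; other opt-out styles are not detected.
bool spec_opts_out_of_lto(const std::string& spec_text) {
    std::istringstream in(spec_text);
    std::string line;
    while (std::getline(in, line)) {
        // Skip blank lines and spec-file comment lines.
        const auto first = line.find_first_not_of(" \t");
        if (first == std::string::npos || line[first] == '#') continue;

        const bool is_definition =
            line.compare(first, 7, "%define") == 0 ||
            line.compare(first, 7, "%global") == 0;
        if (is_definition &&
            line.find("_lto_cflags") != std::string::npos &&
            line.find("%{nil}") != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Small usage example; the spec snippets are invented for illustration.
void example_usage() {
    assert(spec_opts_out_of_lto("Name: foo\n%define _lto_cflags %{nil}\n"));
    assert(!spec_opts_out_of_lto("Name: foo\nVersion: 1\n"));
}
```

A plain substring check like this will miss opt-outs hidden behind
conditionals or macros, so treat its output as a list of candidates
rather than a definitive answer.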


The execution/testsuite failures I've typically seen have mostly been
package issues, not LTO issues.  Probably the biggest thing is that
LTO can inline across translation units.  So poorly written asms that
under-specify their dataflow suddenly get inlined, and that
under-specification becomes critically important.  The other thing I've
seen a few times is ordering of static constructors for C++ -- LTO
can/will change that, and code which relies on a specific ordering
(which isn't defined) can fail.  On the LTO side, symbol visibility has
been the biggest headache, and those issues can be insanely hard to
track down.
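
To make the static constructor point concrete, here is a minimal sketch of
the usual workaround, the construct-on-first-use idiom.  The names
(registry, register_plugin, plugin_b_registered) are invented for the
example and are not taken from qemu or any other package.  The fragile
variant is shown only in a comment, since it needs two source files.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Fragile pattern (two separate translation units):
//
//   // a.cpp
//   std::vector<std::string> registry;            // dynamic initialization
//
//   // b.cpp
//   extern std::vector<std::string> registry;
//   static bool registered = (registry.push_back("b"), true);
//
// The relative order of these two initializers is not specified across
// translation units, so b.cpp may touch registry before it is constructed.

// Construct-on-first-use: the function-local static is initialized the
// first time registry() is called, whatever order the initializers run in.
std::vector<std::string>& registry() {
    static std::vector<std::string> instance;
    return instance;
}

// Hypothetical registration hook used by namespace-scope initializers.
bool register_plugin(const std::string& name) {
    registry().push_back(name);
    return true;
}

// This initializer no longer depends on another object having been
// constructed first.
const bool plugin_b_registered = register_plugin("b");
```

With this shape, it no longer matters whether the toolchain (with or
without LTO) runs one file's initializers before another's.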

I expect at least some of the opt-outs, particularly the
target-dependent ones, are actually latent codegen issues that LTO
happens to expose, but aren't issues in LTO itself.

There's little doubt we'll find more issues as we work through the
opt-outs.  But that's also why we pushed so hard to get the vast
majority of things up with LTO in F33.  Soak time in Fedora is critical
in my mind.


jeff

LTO can change the ordering of static constructors for C++, or
aggressively inline across translation units exposing things like poorly
written asms.  We've seen a small number of (insanely annoying) issues
with symbol visibility in the LTO path itself.



I thought we had LTO disabled for QEMU until I could sit down and dig
into it?!?


jeff
