On 10/11/2012 1:34 PM, Mike Hommey wrote:
On Thu, Oct 11, 2012 at 02:26:33PM -0400, Rafael Ávila de Espíndola wrote:
On 10/11/2012 02:33 AM, Mike Hommey wrote:
On Wed, Oct 10, 2012 at 05:57:53PM -0400, Justin Lebar wrote:
By "turning off Linux PGO testing", you really mean "stop making and
distributing Linux PGO builds," right?

The main reason I'd want Linux PGO is for mobile.  On desktop Linux,
most users (I expect) don't run our builds, so it's not a big deal if
they're some percent slower.

Many people have made claims about that on several different occasions.
Can we once and for all come up with actual data on that?

That being said, PGO on Linux gives between a 5 and 20% improvement on our
various Talos tests. That's with the version of gcc we currently use,
which is 4.5. I'd expect 4.7 to do an even better job, especially if we
added LTO to the equation (and since we are now building on x86-64
machines, we wouldn't have to worry about memory usage; link time could
be a problem, though).
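
For anyone who wants to experiment locally, the usual GCC workflow is the
three-step -fprofile-generate / run / -fprofile-use build. A minimal sketch
on a toy program (the file name and workload are made up for illustration
and this is not our actual build setup):

  // toy.cpp -- illustrative only.
  //
  //   Step 1: instrumented build
  //     g++ -O2 -flto -fprofile-generate toy.cpp -o toy
  //   Step 2: run a representative workload; this writes *.gcda profile files
  //     ./toy
  //   Step 3: rebuild, feeding the profile back to the compiler
  //     g++ -O2 -flto -fprofile-use toy.cpp -o toy
  #include <cstdio>

  // With -fprofile-use, gcc sees that the branch below is almost never
  // taken and that the loop dominates runtime, so it can inline, lay out
  // and size-optimize accordingly instead of guessing statically.
  static int mostlyZero(int i) { return (i % 1000 == 0) ? 1 : 0; }

  int main() {
    long sum = 0;
    for (int i = 0; i < 10000000; ++i)   // hot loop
      sum += mostlyZero(i);
    std::printf("%ld\n", sum);
    return 0;
  }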

Also note that disabling PGO currently also means disabling the
optimizations we do on omni.ja (central directory optimizations and
reordering). This is somewhat covered by bug 773171.

I wouldn't be surprised if most of the PGO benefit comes from fixing bad
inlining decisions by gcc. If we can narrow the gap by adding
MOZ_ALWAYS_INLINE, then maybe we can drop PGO.
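
For reference, a force-inline macro along the lines of MOZ_ALWAYS_INLINE
boils down to compiler-specific attributes; a rough sketch (the real
definition lives in mfbt and may differ in detail):

  // Sketch of a force-inline macro; paraphrased, not the exact mfbt code.
  #if defined(_MSC_VER)
  #  define ALWAYS_INLINE __forceinline
  #elif defined(__GNUC__)
  #  define ALWAYS_INLINE __attribute__((always_inline)) inline
  #else
  #  define ALWAYS_INLINE inline
  #endif

  // Example use: force a small, hot accessor to be inlined even where
  // gcc's static heuristics might otherwise leave it out of line.
  ALWAYS_INLINE static int GetFlags(const unsigned* aWord) { return *aWord & 0xff; }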

A not-insignificant part of the performance improvement PGO gives comes
from code reordering to improve branch prediction. Presumably, we can
use NS_LIKELY/NS_UNLIKELY to improve some branches manually.
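
Those macros are thin wrappers around gcc's __builtin_expect; a rough
sketch of what using them looks like, with the definitions paraphrased
from memory rather than copied from nscore.h:

  // Paraphrased; the real NS_LIKELY/NS_UNLIKELY definitions may differ slightly.
  #if defined(__GNUC__)
  #  define LIKELY(x)   (__builtin_expect(!!(x), 1))
  #  define UNLIKELY(x) (__builtin_expect(!!(x), 0))
  #else
  #  define LIKELY(x)   (!!(x))
  #  define UNLIKELY(x) (!!(x))
  #endif

  // Example: tell gcc the error path is cold so the common path stays
  // straight-line and the error handling gets moved out of the way.
  int ProcessChunk(const char* aBuf, int aLen) {
    if (UNLIKELY(!aBuf || aLen <= 0))
      return -1;              // cold error path
    int checksum = 0;
    for (int i = 0; i < aLen; ++i)
      checksum += aBuf[i];    // hot path
    return checksum;
  }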

Theory:
PGO seems like the only way to balance speed vs. size. With PGO, GCC in effect compiles your hot code with -O3 and your cold code with -Os. Sure, you can get similar performance gains by aggressively inlining, but you are going to pay dearly for that in code size (and consequently startup speed and, to some degree, resident memory usage).

One of the side effects of being a large program is that code that frequently runs together has a good likelihood of being spread around the binary. This means large apps need larger caches to stay as fast as smaller ones. PGO offers a way out by letting the compiler group warm code together, in effect letting your large, feature-laden app run as well as a nimbler, more specialized one. Note that I have no data to back this up other than my discussions with GCC devs.
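
To make the comparison with the manual alternative concrete, here is
roughly what hand-grouping looks like with GCC's hot/cold function
attributes (the function names are invented; as I understand the gcc
docs, hot and cold functions get placed into .text.hot/.text.unlikely
subsections, which is what a PGO profile gets you automatically):

  // Invented example; only the attributes matter here. gcc puts hot
  // functions into the .text.hot subsection and cold ones into
  // .text.unlikely (optimized for size), so frequently-run code ends up
  // packed together -- the same grouping PGO derives from a profile.
  __attribute__((hot))  void PaintFrame()     { /* ...drawing code... */ }
  __attribute__((cold)) void ReportOOMError() { /* ...rare error path... */ }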

Littering code with manual branch-prediction + inlining hints seems like a really failure-prone, unscalable alternative.

Practice:
In practice, GCC PGO only has locality benefits at the compilation-unit level, which results in the same suboptimal locality once everything is linked. Only when one throws LTO into the mix is locality handled right. However, we have not checked recently whether modern GCC is robust enough for our needs. So the main benefit of PGO at the moment is faster startup vs. plain -O3 builds.

Since almost nobody on Linux uses Mozilla's own Firefox builds (and no Firefox distributors do PGO), it may not be that bad to hurt startup for a few precious Linux users.

Starry-eyed future:
We need PGO + LTO to generate the smallest possible fast code on mobile. Unfortunately, I haven't heard anything reassuring about ARM PGO/LTO in GCC; it's still likely to be broken as heck.

Taras

PS. Rafael, I'd be very happy to switch to clang if it implemented profile-guided optimization. I'd be much more tempted to invest resources in fixing clang bugs in this area than in fixing gcc ones.


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
