https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #21 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Ken Jin from comment #15)
> I tested again this time with taskset, turbo boost off, on a quiet system,
> with PGO. These are the results. They're quite good:
> 
> # Indirect goto + LTO + PGO
> This machine benchmarks at 576728 pystones/second
> 
> # Tail calls, no preserve_none + LTO + PGO*
> This machine benchmarks at 539522 pystones/second
> 
> # Tail calls, preserve_none + LTO + PGO*
> This machine benchmarks at 572234 pystones/second
> 
> So roughly a 6-7% gain from preserve_none on the pystones benchmark over no
> preserve_none. Thanks again H.J. for the patch.
> 
> *PGO is disabled for tail calling functions in the bytecode interpreter, but
> enabled for everything else, as it seems PGO slows down those functions. I
> used the attributes `no_instrument_function,no_profile_instrument_function`
> to turn it off for the bytecode functions.
> 
> Something strange is going on with PGO for tail calls on my system. However,
> I can't figure it out right now.
> 
> Everything is benchmarked on this branch
> https://github.com/Fidget-Spinner/cpython/pull/new/Fidget-Spinner:cpython:tail-call-gcc-3

Hi Ken, my patch has been merged into the GCC master branch. Can you give it a
try?
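
For readers following the footnote in comment #15, here is a minimal sketch of
how the two attributes named there can be attached to a single function so that
the rest of the program stays instrumented. The macro name `NO_PGO_INSTRUMENT`
and the functions `op_add` and `run` are hypothetical stand-ins, not code from
the CPython branch, and the `__has_attribute` guard is an assumption added so
that the sketch still compiles on compilers without
`no_profile_instrument_function`.

```c
/* Hypothetical helper macro: expands to the two attributes from comment #15
 * when the compiler supports them, and to nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(no_profile_instrument_function)
#    define NO_PGO_INSTRUMENT \
        __attribute__((no_instrument_function, no_profile_instrument_function))
#  endif
#endif
#ifndef NO_PGO_INSTRUMENT
#  define NO_PGO_INSTRUMENT /* attributes unavailable */
#endif

/* Stand-in for one bytecode handler: excluded from -finstrument-functions
 * hooks and from profile instrumentation. */
NO_PGO_INSTRUMENT
static int op_add(int acc, int operand)
{
    return acc + operand;
}

/* Ordinary function with no attribute: instrumented as usual when the
 * program is built with profiling enabled. */
int run(int n)
{
    int acc = 0;
    for (int i = 1; i <= n; i++) {
        acc = op_add(acc, i);
    }
    return acc;
}
```

The attributes only affect instrumentation, not behaviour, so `run(10)` returns
the same value with or without them.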
