[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

--- Comment #5 from Paul Eggert ---
Also see bug 43 for a related performance issue, which is perhaps more important given the current state of bleeding-edge GNU diffutils.
[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #4 from Alexander Monakov ---
It's a weakness in the REE pass. AFAICT normally it would handle this, but here there are two elimination candidates in 'main': the first is eliminated successfully, and then REE punts on the second because one of its reaching definitions is the first redundant extension:

      /* If def_insn is already scheduled to be deleted, don't attempt to
         modify it.  */
      if (state->modified[INSN_UID (def_insn)].deleted)
        return false;

While looking into this I noticed that the fix for PR 61094 introduced a write-only bitfield 'do_not_reextend' (the ChangeLog wrongly claimed it was used).
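For readers unfamiliar with what REE (redundant extension elimination) targets, here is a minimal C sketch of the simplest kind of candidate. It is not the code from this bug; the function name `widen` is made up for illustration. On x86-64, an instruction that writes a 32-bit register already zeroes the upper 32 bits, so the explicit zero-extension implied by the return conversion needs no separate instruction.

```c
/* Hypothetical illustration, not taken from the bug report.  */
unsigned long long
widen (unsigned int x, unsigned int y)
{
  /* The 32-bit addition wraps modulo 2^32.  On x86-64 the 32-bit add
     already clears the upper half of the destination register.  */
  unsigned int sum = x + y;

  /* Widening to 64 bits is a zero-extension of a value whose upper
     bits are already zero, so a separate extension is redundant.  */
  return sum;
}
```

The case described in comment #4 is harder than this one: there the definition reaching the second extension is itself an extension that REE has already marked for deletion.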
[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

--- Comment #3 from Andrew Pinski ---
The gimple level looks like:
```
  if (_54 >= 0)
    goto ; [90.00%]
  else
    goto ; [10.00%]

  [local count: 63261141172]:
  _18 = (unsigned int) _54;
  goto ; [100.00%]
...
  len_37 = mbrtoc32 (, iter_39, _36, );
  len.0_38 = (signed long) len_37;
  if (len.0_38 < 0)
    goto ; [10.00%]
  else
    goto ; [90.00%]

  [local count: 632611429]:
  ch.1_42 = ch; // Note this is a local variable

  [local count: 7029015815]:
  # SR.45_12 = PHI
  # SR.46_46 = PHI
  mbs ={v} {CLOBBER(eol)};
  ch ={v} {CLOBBER(eol)};

  [local count: 70290156974]:
  # SR.41_16 = PHI <_18(4), SR.45_12(7)>
  # SR.42_47 = PHI <1(4), SR.46_46(7)>
  _6 = (long long unsigned int) SR.41_16;
```
Maybe we should have a type promotion pass on the gimple level that promotes _54 to `long unsigned int`.
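As a rough guide to the dump above, the following C sketch has the same general shape: a sign test on a byte, an ASCII fast path that converts the byte to `unsigned int`, a slow path through `mbrtoc32`, and a final widening to a 64-bit type. It is an assumption-laden reconstruction, not the diffutils source. The name `decode_one`, its parameters, and the error handling on the slow path are all hypothetical.

```c
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: decode the character starting at ITER (buffer
   ends at LIM), store its byte length in *LEN, and return the code
   point widened to 64 bits.  */
unsigned long long
decode_one (const char *iter, const char *lim, size_t *len)
{
  signed char c = *iter;
  unsigned int code;

  if (c >= 0)
    {
      /* ASCII fast path: C is known to be non-negative here, so the
         conversion to unsigned int cannot change its value.  */
      code = (unsigned int) c;
      *len = 1;
    }
  else
    {
      /* Slow path: CH and MBS are locals, as in the dump.  */
      char32_t ch;
      mbstate_t mbs = { 0 };
      size_t n = mbrtoc32 (&ch, iter, (size_t) (lim - iter), &mbs);

      if ((ptrdiff_t) n < 0)
        {
          /* Assumed policy: treat an undecodable byte as itself.  */
          code = (unsigned char) c;
          *len = 1;
        }
      else
        {
          code = ch;
          *len = n ? n : 1;
        }
    }

  /* Both paths merge here and the 32-bit value is widened, which is
     where the extension discussed in this bug appears.  */
  return (unsigned long long) code;
}
```

On the fast path the value is already known to fit in 7 bits, which is why promoting the tested value to a 64-bit type earlier, as suggested above, could avoid the later extension.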
[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

--- Comment #2 from Paul Eggert ---
Created attachment 55645
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55645&action=edit
code-mbcel1.s with the optimization suggested in the bug report
[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement
[Bug rtl-optimization/110823] [missed optimization] >50% speedup for x86-64 ASCII processing a la GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110823

--- Comment #1 from Paul Eggert ---
Created attachment 55644
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55644&action=edit
gcc -O2 -S output (from code-mbcel1.i)