On Mon, Nov 25, 2024 at 3:28 AM Philipp Tomsich
<philipp.toms...@vrull.eu> wrote:
>
> Pushed to master with the following fixups:
>   - new timevar added
>   - nits addressed
>   - whitespace fixes

The pass seems to be disabled by default everywhere - I thought we
decided to avoid adding
passes like this because they tend to bit-rot quickly and become a
maintenance burden.

What was the plan here?

Richard.

> Philipp.
>
>
> On Mon, 25 Nov 2024 at 03:30, Jeff Law <jeffreya...@gmail.com> wrote:
> >
> >
> >
> > On 11/9/24 2:48 AM, Konstantinos Eleftheriou wrote:
> > > From: kelefth <konstantinos.elefther...@vrull.eu>
> > >
> > > This pass detects cases of expensive store forwarding and tries to avoid 
> > > them
> > > by reordering the stores and using suitable bit insertion sequences.
> > > For example it can transform this:
> > >
> > >       strb    w2, [x1, 1]
> > >       ldr     x0, [x1]      # Expensive store forwarding to larger load.
> > >
> > > To:
> > >
> > >       ldr     x0, [x1]
> > >       strb    w2, [x1]
> > >       bfi     x0, x2, 0, 8
> > >
> > > Assembly like this can appear with bitfields or type punning / unions.
> > > On stress-ng when running the cpu-union microbenchmark the following 
> > > speedups
> > > have been observed.
> > >
> > >    Neoverse-N1:      +29.4%
> > >    Intel Coffeelake: +13.1%
> > >    AMD 5950X:        +17.5%
> > >
> > > The transformation is rejected on cases that would cause store_bit_field
> > > to generate subreg expressions on different register classes.
> > > Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c contain
> > > such cases and have been marked as XFAIL.
> > >
> > > There is a special handling for machines with BITS_BIG_ENDIAN !=
> > > BYTES_BIG_ENDIAN. The need for this came up from an issue in H8
> > > architecture, which uses big-endian ordering, but BITS_BIG_ENDIAN
> > > is false. In that case, the START parameter of store_bit_field
> > > needs to be calculated from the end of the destination register.
> > >
> > > gcc/ChangeLog:
> > >
> > >       * Makefile.in (OBJS): Add avoid-store-forwarding.o.
> > >       * common.opt (favoid-store-forwarding): New option.
> > >       * common.opt.urls: Regenerate.
> > >       * doc/invoke.texi: New param store-forwarding-max-distance.
> > >       * doc/passes.texi: Document new pass.
> > >       * doc/tm.texi: Regenerate.
> > >       * doc/tm.texi.in: Document new pass.
> > >       * params.opt (store-forwarding-max-distance): New param.
> > >       * passes.def: Add pass_rtl_avoid_store_forwarding before
> > >       pass_early_remat.
> > >       * target.def (avoid_store_forwarding_p): New DEFHOOK.
> > >       * target.h (struct store_fwd_info): Declare.
> > >       * targhooks.cc (default_avoid_store_forwarding_p): New function.
> > >       * targhooks.h (default_avoid_store_forwarding_p): Declare.
> > >       * tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
> > >       * avoid-store-forwarding.cc: New file.
> > >       * avoid-store-forwarding.h: New file.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >       * gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
> > >       * gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
> > >       * gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
> > >       * gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
> > >       * gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
> > >       * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c: New 
> > > test.
> > >          * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c: New 
> > > test.
> > >
> > > Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
> > > Signed-off-by: Konstantinos Eleftheriou 
> > > <konstantinos.elefther...@vrull.eu>
> > >
> > > Series-version: 8
> > >
> > > Series-changes: 8
> > >       - Fix store_bit_field call for big-endian targets, where
> > >         BITS_BIG_ENDIAN is false.
> > >       - Handle store_forwarding_max_distance = 0 as a special case that
> > >         disables cost checks for avoid-store-forwarding.
> > >       - Update testcases for AArch64 and add testcases for x86-64.
> > >
> > > Series-changes: 7
> > >       - Fix bug when copying back the load register, in the case that the
> > >         load is eliminated.
> > >
> > > Series-changes: 6
> > >       - Reject the transformation on cases that would cause 
> > > store_bit_field
> > >         to generate subreg expressions on different register classes.
> > >         Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c
> > >            contain such cases and have been marked as XFAIL.
> > >       - Use optimize_bb_for_speed_p instead of optimize_insn_for_speed_p.
> > >       - Inline and remove get_load_mem.
> > >       - New implementation for is_store_forwarding.
> > >       - Refactor the main loop in avoid_store_forwarding.
> > >       - Avoid using the word 'forwardings'.
> > >       - Use lowpart_subreg instead of validate_subreg + gen_rtx_subreg.
> > >       - Don't use df_insn_rescan where not needed.
> > >       - Change order of emitting stores and bit insert instructions.
> > >       - Check and reject loads for which the dest register overlaps with 
> > > src.
> > >       - Remove unused variable.
> > >       - Change some gen_mov_insn function calls to gen_rtx_SET.
> > >       - Subtract the cost of eliminated load, instead of 1, for the total 
> > > cost.
> > >       - Use delete_insn instead of set_insn_deleted.
> > >       - Regenerate common.opt.urls.
> > >       - Add some more comments.
> > >
> > > Series-changes: 5
> > >       - Fix bug with BIG_ENDIAN targets.
> > >       - Fix bug with unrecognized instructions.
> > >       - Fix / simplify pass init/fini.
> > >
> > > Series-changes: 4
> > >       - Change pass scheduling to run after sched1.
> > >       - Add target hook to decide whether a store forwarding instance
> > >         should be avoided or not.
> > >       - Fix bugs.
> > >
> > > Series-changes: 3
> > >       - Only emit SUBREG after calling validate_subreg.
> > >       - Fix memory corruption due to vec self-reference.
> > >       - Fix bitmap_bit_in_range_p ICE due to BLKMode.
> > >       - Reject MEM to MEM sets.
> > >       - Add get_load_mem comment.
> > >       - Add new testcase.
> > >
> > > Series-changes: 2
> > >       - Allow modes that are not scalar_int_mode.
> > >       - Introduce simple costing to avoid unprofitable transformations.
> > >       - Reject bit insert sequences that spill to memory.
> > >       - Document new pass.
> > >       - Fix and add testcases.
> > > ---
> >
> > > +namespace {
> > > +
> > > +const pass_data pass_data_avoid_store_forwarding =
> > > +{
> > > +  RTL_PASS, /* type.  */
> > > +  "avoid_store_forwarding", /* name.  */
> > > +  OPTGROUP_NONE, /* optinfo_flags.  */
> > > +  TV_NONE, /* tv_id.  */
> > > +  0, /* properties_required.  */
> > > +  0, /* properties_provided.  */
> > > +  0, /* properties_destroyed.  */
> > > +  0, /* todo_flags_start.  */
> > > +  TODO_df_finish /* todo_flags_finish.  */
> > > +};
> > Probably want a TV entry for store forwarding.  While it shouldn't be a
> > big time sink, we've seen other passes that should be efficient go nuts
> > in some cases (extension elimination being the most recent example).
> >
> >
> >
> >
> > > +
> > > +/* Try to modify BB so that expensive store forwarding cases are 
> > > avoided.  */
> > > +
> > > +void store_forwarding_analyzer::avoid_store_forwarding (basic_block bb)
> >
> > Formatting nit.  Put the return type on its own line so that the
> > function name always starts on column 0 of its own line.
> >
> >
> >
> > > +
> > > +/* Update pass statistics.  */
> > > +
> > > +void store_forwarding_analyzer::update_stats (function *fn)
> > Similarly.
> >
> >
> > OK with the new timevar and two formatting nits.  Thanks for your
> > patience on this.
> >
> > Jeff
> >
> >

Reply via email to