On Thu, Dec 7, 2023 at 1:20 PM Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich <philipp.toms...@vrull.eu> wrote:
> >>
> >> On Wed, 6 Dec 2023 at 23:32, Richard Biener <richard.guent...@gmail.com> wrote:
> >> >
> >> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
> >> > <manos.anagnosta...@vrull.eu> wrote:
> >> > >
> >> > > This is an RTL pass that detects store forwarding from stores to
> >> > > larger loads (load pairs).
> >> > >
> >> > > This optimization is SPEC2017-driven and was found to be beneficial
> >> > > for some benchmarks, through testing on ampere1/ampere1a machines.
> >> > >
> >> > > For example, it can transform cases like
> >> > >
> >> > >   str  d5, [sp, #320]
> >> > >   fmul d5, d31, d29
> >> > >   ldp  d31, d17, [sp, #312] # Large load from small store
> >> > >
> >> > > to
> >> > >
> >> > >   str  d5, [sp, #320]
> >> > >   fmul d5, d31, d29
> >> > >   ldr  d31, [sp, #312]
> >> > >   ldr  d17, [sp, #320]
> >> > >
> >> > > Currently, the pass is disabled by default on all architectures and
> >> > > enabled by a target-specific option.
> >> > >
> >> > > If deemed beneficial enough for a default, it will be enabled on
> >> > > ampere1/ampere1a, or other architectures as well, without needing
> >> > > to be turned on by this option.
> >> >
> >> > What is aarch64-specific about the pass?
> >> >
> >> > I see an increasingly large number of target specific passes pop up
> >> > (probably for the excuse we can generalize them if necessary). But GCC
> >> > isn't LLVM and this feels like getting out of hand?
> >>
> >> We had an OK from Richard Sandiford on the earlier (v5) version with
> >> v6 just fixing an obvious bug... so I was about to merge this earlier
> >> just when you commented.
> >>
> >> Given that this had months of test exposure on our end, I would prefer
> >> to move this forward for GCC14 in its current form.
> >> The project of replacing architecture-specific store-forwarding passes
> >> with a generalized infrastructure could then be addressed in the GCC15
> >> timeframe (or beyond)?
> >
> > It's up to target maintainers, I just picked this pass (randomly) to make
> > this comment (of course also knowing that STLF fails are a common issue on
> > pipelined uarchs).
>
> I agree there's scope for making some of this target-independent.
>
> One vague thing I've been wondering about is whether, for some passes
> like these, we should use inheritance rather than target hooks. So in
> this case, the target-independent code would provide a framework for
> iterating over the function and testing for forwarding, but the target
> would ultimately decide what to do with that information. This would
> also make it easier for targets to add genuinely target-specific
> information to the bookkeeping structures.
>
> In case it sounds otherwise, that's supposed to be more than
> just a structural C++-vs-C thing. The idea is that we'd have
> a pass for "resolving store forwarding-related problems",
> but the specific goals would be mostly (or at least partially)
> target-specific rather than target-independent.

In some cases we've used target hooks for this, in this case it might
work as well.

> I'd wondered the same thing about the early-ra pass that we're
> adding for SME. Some of the framework could be generalised and
> made target-independent, but the main purpose of the pass (using
> strided registers with certain patterns and constraints) is highly
> target-specific.

.. not sure about this one though.

Richard.

> Thanks,
> Richard
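
The inheritance idea discussed above could look roughly like the following. This is only a minimal sketch and not code from the patch or from GCC: `Insn`, `StoreForwardingPass`, `PairLoadSplitter`, `handle_forwarding` and `run_example` are all hypothetical names, and the instruction model is deliberately simplified (offsets from a single base register, no aliasing or distance limits). The base class plays the role of the target-independent framework that iterates and tests for forwarding, while the subclass decides what to do with each hit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified instruction model (NOT GCC's rtx_insn).
struct Insn {
  bool is_store;    // true for a store, false for a load
  int64_t offset;   // byte offset from a common base register
  int64_t size;     // access size in bytes
};

// Target-independent framework (assumed design): walks the instructions
// and detects loads that read bytes written by an earlier, smaller store.
class StoreForwardingPass {
public:
  virtual ~StoreForwardingPass() = default;

  // Returns the number of loads the target chose to handle.
  int run(const std::vector<Insn> &insns) {
    int handled = 0;
    std::vector<Insn> recent_stores;
    for (const Insn &insn : insns) {
      if (insn.is_store) {
        recent_stores.push_back(insn);
        continue;
      }
      for (const Insn &store : recent_stores) {
        if (is_narrow_store_to_wide_load(store, insn)
            && handle_forwarding(store, insn)) {
          ++handled;
          break;  // count each load at most once
        }
      }
    }
    return handled;
  }

protected:
  // The target ultimately decides what to do with a detected case.
  virtual bool handle_forwarding(const Insn &store, const Insn &load) = 0;

private:
  static bool is_narrow_store_to_wide_load(const Insn &store,
                                           const Insn &load) {
    bool overlaps = store.offset < load.offset + load.size
                    && load.offset < store.offset + store.size;
    return overlaps && store.size < load.size;
  }
};

// Hypothetical target subclass: only cares about 16-byte pair loads.
class PairLoadSplitter : public StoreForwardingPass {
protected:
  bool handle_forwarding(const Insn &, const Insn &load) override {
    // A real pass would replace the ldp with two ldr instructions here.
    return load.size == 16;
  }
};

int run_example() {
  // Models: str d5, [sp, #320] followed by ldp d31, d17, [sp, #312]
  std::vector<Insn> insns = {{true, 320, 8}, {false, 312, 16}};
  PairLoadSplitter pass;
  return pass.run(insns);
}
```

With target hooks, `handle_forwarding` would instead be a function pointer in a target vector. The subclass form additionally lets a target keep its own bookkeeping as member data, which is the point made in the thread about target-specific information.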