So is v6 OK for trunk as is, with the generic changes to be added in GCC 15?
Manos.

On Thu, Dec 7, 2023, 16:10 Richard Biener <richard.guent...@gmail.com> wrote:

> On Thu, Dec 7, 2023 at 1:20 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
> >
> > Richard Biener <richard.guent...@gmail.com> writes:
> > > On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich <philipp.toms...@vrull.eu> wrote:
> > >>
> > >> On Wed, 6 Dec 2023 at 23:32, Richard Biener <richard.guent...@gmail.com> wrote:
> > >> >
> > >> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
> > >> > <manos.anagnosta...@vrull.eu> wrote:
> > >> > >
> > >> > > This is an RTL pass that detects store forwarding from stores to larger loads (load pairs).
> > >> > >
> > >> > > This optimization is SPEC2017-driven and was found to be beneficial for some benchmarks,
> > >> > > through testing on ampere1/ampere1a machines.
> > >> > >
> > >> > > For example, it can transform cases like
> > >> > >
> > >> > >     str d5, [sp, #320]
> > >> > >     fmul d5, d31, d29
> > >> > >     ldp d31, d17, [sp, #312] # Large load from small store
> > >> > >
> > >> > > to
> > >> > >
> > >> > >     str d5, [sp, #320]
> > >> > >     fmul d5, d31, d29
> > >> > >     ldr d31, [sp, #312]
> > >> > >     ldr d17, [sp, #320]
> > >> > >
> > >> > > Currently, the pass is disabled by default on all architectures and enabled by a target-specific option.
> > >> > >
> > >> > > If deemed beneficial enough for a default, it will be enabled on ampere1/ampere1a,
> > >> > > or other architectures as well, without needing to be turned on by this option.
> > >> >
> > >> > What is aarch64-specific about the pass?
> > >> >
> > >> > I see an increasingly large number of target specific passes pop up (probably
> > >> > for the excuse we can generalize them if necessary). But GCC isn't LLVM
> > >> > and this feels like getting out of hand?
> > >>
> > >> We had an OK from Richard Sandiford on the earlier (v5) version with
> > >> v6 just fixing an obvious bug... so I was about to merge this earlier
> > >> just when you commented.
> > >>
> > >> Given that this had months of test exposure on our end, I would prefer
> > >> to move this forward for GCC14 in its current form.
> > >> The project of replacing architecture-specific store-forwarding passes
> > >> with a generalized infrastructure could then be addressed in the GCC15
> > >> timeframe (or beyond)?
> > >
> > > It's up to target maintainers, I just picked this pass (randomly) to make this
> > > comment (of course also knowing that STLF fails are a common issue on
> > > pipelined uarchs).
> >
> > I agree there's scope for making some of this target-independent.
> >
> > One vague thing I've been wondering about is whether, for some passes
> > like these, we should use inheritance rather than target hooks. So in
> > this case, the target-independent code would provide a framework for
> > iterating over the function and testing for forwarding, but the target
> > would ultimately decide what to do with that information. This would
> > also make it easier for targets to add genuinely target-specific
> > information to the bookkeeping structures.
> >
> > In case it sounds otherwise, that's supposed to be more than
> > just a structural C++-vs-C thing. The idea is that we'd have
> > a pass for "resolving store forwarding-related problems",
> > but the specific goals would be mostly (or at least partially)
> > target-specific rather than target-independent.
>
> In some cases we've used target hooks for this, in this case it might
> work as well.
>
> > I'd wondered the same thing about the early-ra pass that we're
> > adding for SME. Some of the framework could be generalised and
> > made target-independent, but the main purpose of the pass (using
> > strided registers with certain patterns and constraints) is highly
> > target-specific.
>
> .. not sure about this one though.
>
> Richard.
>
> > Thanks,
> > Richard
>
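
For reference, the inheritance idea quoted above could look roughly like the sketch below. It is only an illustration and not code from the patch or from GCC: `MemAccess`, `StoreForwardingPass`, `PairSplittingPass` and `handle_forwarding` are hypothetical names, and memory accesses are modelled as plain offset/size records rather than RTL. The base class does the target-independent iteration and forwarding test; the derived class decides what to do, here splitting a 16-byte pair load into two 8-byte loads as in the ldp example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified record of a memory access (not GCC's RTL).
struct MemAccess {
  bool is_store;
  int64_t offset;  // byte offset from a common base register
  int64_t size;    // access width in bytes
};

// Target-independent framework (assumed design): iterates over the
// accesses and tests whether a load is wider than, and fully contains,
// an earlier store.
class StoreForwardingPass {
public:
  virtual ~StoreForwardingPass() = default;

  // Returns the number of loads the target chose to act on.
  int run(const std::vector<MemAccess>& insns) {
    int handled = 0;
    std::vector<MemAccess> stores;
    for (const MemAccess& a : insns) {
      if (a.is_store) {
        stores.push_back(a);
        continue;
      }
      for (const MemAccess& s : stores) {
        if (narrow_store_in_wide_load(s, a) && handle_forwarding(s, a)) {
          ++handled;
          break;  // each load is handled at most once
        }
      }
    }
    return handled;
  }

protected:
  // The target ultimately decides what to do with the information.
  virtual bool handle_forwarding(const MemAccess& store,
                                 const MemAccess& load) = 0;

private:
  static bool narrow_store_in_wide_load(const MemAccess& s,
                                        const MemAccess& l) {
    return s.size < l.size && s.offset >= l.offset &&
           s.offset + s.size <= l.offset + l.size;
  }
};

// Hypothetical target-specific subclass: replaces a 16-byte pair load
// with two 8-byte loads and records the replacements.
class PairSplittingPass : public StoreForwardingPass {
public:
  std::vector<MemAccess> replacements;

protected:
  bool handle_forwarding(const MemAccess&, const MemAccess& load) override {
    if (load.size != 16)
      return false;  // only pair loads are of interest in this sketch
    replacements.push_back({false, load.offset, 8});
    replacements.push_back({false, load.offset + 8, 8});
    return true;
  }
};
```

Feeding it an 8-byte store at offset 320 followed by a 16-byte load at offset 312 makes the subclass record two 8-byte loads at offsets 312 and 320, mirroring the example transformation. A target-hook version would pass the same store/load pair to a hook instead of a virtual method; the subclass form mainly makes it easier to keep target-specific bookkeeping next to the decision.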