On Wed, Oct 19, 2022 at 5:44 AM Jeff Law via Gcc <gcc@gcc.gnu.org> wrote:
>
>
> On 10/18/22 20:09, Vineet Gupta wrote:
> >
> > On 10/18/22 16:36, Jeff Law wrote:
> >>>> There isn't a great place in GCC to handle this right now. If the
> >>>> constraints were relaxed in PRE, then we'd have a chance, but
> >>>> getting the cost model right is going to be tough.
> >>>
> >>> It would have been better (for this specific case) if loop unrolling
> >>> was not being done so early. The tree pass cunroll is flattening it
> >>> out and leaving it for the rest of the tree/rtl passes to pick up the
> >>> pieces and remove any redundancies, if at all. It obviously needs to
> >>> be early if we are injecting 7x more instructions, but seems like a
> >>> lot to unravel.
> >>
> >> Yup. If that loop gets unrolled, it's going to be a mess. It will
> >> almost certainly make this problem worse as each iteration is going
> >> to have a pair of constants loaded and no good way to remove them.
> >
> > That's the original problem that I started this thread with. I'd
> > snipped the disassembly as it would have been too much text but
> > basically on RV, Coremark crc8 loop of const 8 iterations gets
> > unrolled including extraneous 8 insn pairs to load the same constant
> > - which is preposterous. Other arches side-step by using if-conversion
> > / cond moves, latter currently WIP in RV International. x86 w/o
> > if-convert seems OK since the const can be encoded in the xor insn.
> >
> > OTOH given that gimple/tree-pass cunroll is doing the culprit loop
> > unrolling and introducing redundant const 8 times, can it be addressed
> > there somehow?
> > tree_estimate_loop_size() seems to identify constant expression, not
> > just an operand. Can it be taught to identify a "non-trivial const"
> > and hoist/code-move the expression? Sorry just rambling here, most
> > likely nonsense.

On GIMPLE all constants are "simple".

> Oh, cunroll. There might be a distinct flag for complete unrolling.

At -O3 we peel completely; there's no flag to disable that.

> I really expect something like Click's work is the way forward.
> Essentially when you VN the function you'll identify those constants and
> collapse them all down to a single instance. Then the GCM phase will
> kick in and find a place to put the evaluation so that you have one and
> only one.

I'd say postreload gcse would be a place to do that. At least when
there's no available hardreg, CSEing likely isn't going to be a win.

> Some of Bodik's work might catch it as well, though implementing his
> ideas is likely a lot more work.
>
>
> Jeff
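
For anyone following along without the benchmark source at hand, here is
a minimal sketch of the kind of loop under discussion. It is a generic
bitwise CRC-16 written for illustration, not the Coremark source; the
names crc16_update and crc16 and the polynomial 0xA001 are assumptions.
The point it shows is that the XOR constant does not fit a 12-bit RISC-V
immediate, so every unrolled copy of the loop body that materializes it
needs a two-instruction sequence, whereas x86 can encode it directly in
the xor.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a generic reflected CRC-16 step, not Coremark code.
   0xA001 is an assumed polynomial chosen because it is too wide for a
   12-bit immediate operand. */
static uint16_t crc16_update(uint16_t crc, uint8_t data)
{
    crc ^= data;
    /* Constant trip count of 8: a candidate for complete unrolling. */
    for (int i = 0; i < 8; i++) {
        if (crc & 1)
            /* Each unrolled copy of this xor needs the same constant. */
            crc = (uint16_t)((crc >> 1) ^ 0xA001u);
        else
            crc >>= 1;
    }
    return crc;
}

/* Hypothetical driver that feeds a buffer through the step above. */
static uint16_t crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++)
        crc = crc16_update(crc, buf[i]);
    return crc;
}
```

Comparing the output of -O2 and -O3 with -S for a RISC-V target should
show whether the constant is loaded once or once per unrolled iteration,
though the exact result will depend on the GCC version and tuning.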