Re: More aggressive threading causing loop-interchange-9.c regression

Aldy Hernandez via Gcc Fri, 10 Sep 2021 09:05:46 -0700



On 9/10/21 5:43 PM, Jeff Law wrote:

On 9/9/2021 3:21 AM, Aldy Hernandez wrote:
   /* If this path does not thread through the loop latch, then we are
      using the FSM threader to find old style jump threads. This
      is good, except the FSM threader does not re-use an existing
      threading path to reduce code duplication.

      So for that case, drastically reduce the number of statements
      we are allowed to copy.  */
*blink*
Woah. The backward threader has been using FSM threadsindiscriminately as far as I can remember. I wonder what would breakif we "fixed it".
?!? I'm not sure what you're suggesting here. If you s/FSMthreader/backwards threader/ in the comment does it make more sense? Theterm FSM really should largely have been dropped as the backwardsthreader was improved to handle more cases.

back_threader_registry::register_path() uses EDGE_FSM_THREAD as thethread type to register threads. I was wondering if it should have beensome combination of EDGE_START_JUMP_THREAD / EDGE_*_COPY_SRC_BLOCK, etc.I (purposely) know nothing about the underlying threading types ;-).But if the backwards threader has been improved, then perhaps we shouldjust remove the confusing FSM references.

so these cases should use the "old style" validity/costing metricsand thus
classify threading opportunities in a different way?
Jeff, do you have any insight here?
This is precisely what you're cleaning up.
I think today "backwards" vs, "forwards" only refers to the way we find
threading opportunities.
Yes, it's a mess.
I ran some experiments a while back, and my current work on theenhanced solver/threader, can fold virtually everything theDOM/threader gets (even with its use of const_and_copies, avail_exprs,and evrp_range_analyzer), while getting 5% more DOM threads and 1%more overall threads. That is, I've been testing if the path solvercan solve everything the DOM threader needs (the hybrid approach Imentioned).
Unfortunately, replacing the forward threader right now is notfeasible for a few reasons:
Right. But I thought the short term goal was to replace/remove theforward threading from VRP. Dropping from DOM is going to be tougher.

My current thinking is that replacing the forward VRP threader with ahybrid one is a gentler approach to the longer term goal of replacingthe forward threader altogether. However, all the work I've been doingcould go either way-- we could try the forward/VRP replacement or ahybrid approach. It will all use the path solver underneath.

My main problem with replacing the forward/VRP with a backward client isthat the cost models are so different that it was difficult to comparehow we fared. I constantly ran into threads the solver could handlejust fine, but profitable_path_p was holding it back.

FWIW, we get virtually everything the forward threader gets, minus avery few things. At least when I plug in the solver to theDOM/forwarder threader, it can solve everything it can (minus noise andfloats).

If you prefer a backward threader instance to replace the VRP/forwardthreader, I'm game. It's just harder to compare. Either way (backwardthreader or a hybrid forward+solver) uses the same underlying solverwhich is solid.

a) The const_and_copies/avail_exprs relation framework can do floats,and that's next year's ranger work.
Right. I'd actually run into this as well when I wanted to drop all therange bits out of DOM and rely exclusively on EVRP. It'd still be astep forward to rip out the EVRP engine from DOM and simplify all thecode that derives one equivalence from another so that it's only workingon FP values.


Sure.

b) Even though we can seemingly fold everything DOM/threader does, inorder to replace it with a backward threader instance we'd have tomerge the cost/profitability code scattered throughout the forwardthreader, as well as the EDGE_FSM* / EDGE_COPY* business.
Right. This is a prerequisite. Though some of the costing will need tobe conditional on the threader being used. Refer back to the discussionaround how the forward threader can commonize thread paths that lead tothe same destination while the backwards threader can not.


Yup, yup.

c) DOM changes the IL as it goes. Though we could conceivably divorcedo the threading after DOM is done.
The only reason threading runs in parallel with DOM is so that it canuse the context sensitive equivalences. With the infrastructure you'rebuilding, there's a reasonable chance we can move to a model where werun DOM (and in the long term a simpler DOM) and threading as distinct,independent passes.

Andrew mumbled something about replacing all of DOM eventually :-).Well, except that value-numbering business I bet.


Aldy

Re: More aggressive threading causing loop-interchange-9.c regression

Reply via email to