On Fri, 29 Nov 2024, Jakub Jelinek wrote:

> On Fri, Nov 29, 2024 at 09:19:55AM +0100, Richard Biener wrote:
> > For a TSVC testcase we see failed register coalescing due to a
> > different schedule of GIMPLE .FMA and stores fed by it.  This
> > can be mitigated by making direct internal functions participate
> > in TER - given we're using more and more of such functions to
> > expose target capabilities it seems to be a natural thing to not
> > exempt those.
> >
> > Unfortunately the internal function expanding API doesn't match
> > what we usually have - passing in a target and returning an RTX
> > but instead the LHS of the call is expanded and written to.  This
> > makes the TER expansion of a call SSA def a bit unwieldly.
>
> Can't we change that?
> Especially if it is only for the easiest subset of internal fns
> (I see you limit it only to direct_internal_fn_p), if it has just
> one or a couple of easy implementations, those could be split into
> one which handles the whole thing by just expanding lhs and calling
> another function with the rtx target argument into which to store
> stuff (or const0_rtx for ignored result?) and handle the actual expansion,
> and then have an exported function from internal-fn.cc which expr.cc
> could call for the TERed internal-fn case.
> That function could assert it is only direct_internal_fn_p or some
> other subset which it would handle.

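For illustration only, the split suggested above can be modelled in a
few lines of standalone C++.  This is a toy sketch, not GCC code: Rtx,
Call, Fn, is_direct, expand_direct_fn_to_target and
expand_direct_fn_call are all hypothetical stand-ins, and a null
target here simply means "pick a fresh temporary".

```cpp
// Toy model only: none of these types or functions are GCC's.
#include <cassert>
#include <string>
#include <vector>

// Stand-in for an RTX: just the name of a pseudo register.
struct Rtx { std::string reg; };

// Stand-in for a call statement with an optional LHS.
enum class Fn { FMA, GATHER_LOAD };
struct Call {
  Fn fn;
  std::vector<Rtx> args;
  std::string lhs;   // empty when the result is ignored
};

// Log of "emitted instructions", so the effect is observable.
static std::vector<std::string> emitted;
static int next_temp = 0;

// Hypothetical predicate: only the simple subset is handled here.
static bool is_direct (Fn fn) { return fn == Fn::FMA; }

// Core expansion in the usual style: take a suggested target and
// return the RTX that actually holds the result.
Rtx expand_direct_fn_to_target (Fn fn, const std::vector<Rtx> &args,
                                const Rtx *target)
{
  assert (is_direct (fn));
  Rtx dest;
  if (target)
    dest = *target;
  else
    dest.reg = "tmp" + std::to_string (next_temp++);
  std::string insn = dest.reg + " = fma";
  for (const Rtx &a : args)
    insn += " " + a.reg;
  emitted.push_back (insn);
  return dest;
}

// Statement-level wrapper: expand the LHS, then defer to the core.
void expand_direct_fn_call (const Call &call)
{
  if (call.lhs.empty ())
    {
      expand_direct_fn_to_target (call.fn, call.args, nullptr);
      return;
    }
  Rtx lhs{call.lhs};
  expand_direct_fn_to_target (call.fn, call.args, &lhs);
}
```

In this model the statement expander would keep calling
expand_direct_fn_call, while a TER-style caller would call
expand_direct_fn_to_target directly and use the returned value as an
operand.
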
The expander goes through macro-generated expand_FOO (see top of
internal-fn.cc), and in the end dispatches to expand_*_optab_fn of
which there is a generic one for UNARY, BINARY and TERNARY but
very many OPTAB_NAME variants, like expand_fold_len_extract_optab_fn
dispatching to expand_direct_optab_fn or complex ones like
expand_gather_load_optab_fn.

There's unfortunately no good way to factor out a different API
there, at least not easily.  Suggestions welcome, of course.

Richard.