On 12/1/25 9:22 AM, Jakub Jelinek wrote:
On Mon, Dec 01, 2025 at 09:12:29AM -0700, Jeff Law wrote:
I noticed that GCC on x86-64 also misses this optimization, which
suggests this may be generally beneficial for targets with expensive
multiplication. Given this, would you advise investigating a generic RTL
fix (e.g., in simplify-rtx.cc)?
We could try to make a case that the multiply, even though it's a single op,
is an exception to the simplistic cost model of GIMPLE. I'm not sure that's
a great solution, but it's worth keeping in mind.
You could try to see if it could be improved at expansion time; we should be
able to see the expression as (a * b) == 0 due to TER, and during expansion
we can query target costs and adjust the initial RTL we generate.
If it is a win on all targets, then perhaps lower it at isel time (or if
only on some of them, again decide at isel or expansion time what is
cheaper).
While we have traditionally thought of multiplies as expensive, 2-cycle
multipliers should continue to become more common, though the silicon
cost may make duplicating them in every ALU too expensive. It's hard to
see how a 2-cycle multiply is going to be worse than the comparison-based
sequences, unless the multiplier is already busy.
Net, I don't think we can make a blanket statement about either form
being always better than the other.
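
For reference, a minimal sketch of the two forms being compared, assuming
the transformation in question is between (a * b) == 0 and a == 0 || b == 0.
The function names are hypothetical and only for illustration. The two are
interchangeable here only because signed overflow is undefined in C, so the
compiler may assume the product does not wrap; with wrapping unsigned
arithmetic they differ (e.g. 0x10000u * 0x10000u is 0 in 32 bits).

```c
#include <stdbool.h>

/* Multiply form: one multiply, then a single compare against zero.
   Hypothetical name, used only for this illustration. */
bool prod_is_zero_mul(int a, int b)
{
    return (a * b) == 0;
}

/* Comparison form: no multiply, two tests against zero.
   Equivalent to the above only when a * b cannot overflow. */
bool prod_is_zero_cmp(int a, int b)
{
    return a == 0 || b == 0;
}
```

Which of the two is cheaper depends on the target's multiply latency and
on how busy the multiplier is, which is the point above.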
Jeff