On 07/16/2015 01:32 PM, Dmitry Grinberg wrote:
> WUMUL x, y which will multiply 32-bit register x by 32-bit
> register y, and produce a 64-bit result, storing the high bits into
> register x and low bits into register y

You can rewrite the RTL to make this easier. You can use a parallel to do, for instance:

  [(set (reg:SI x)
        (truncate:SI
          (lshiftrt:DI
            (mult:DI (sign_extend:DI (reg:SI x))
                     (sign_extend:DI (reg:SI y)))
            (const_int 32))))
   (set (reg:SI y)
        (truncate:SI
          (mult:DI (sign_extend:DI (reg:SI x))
                   (sign_extend:DI (reg:SI y)))))]

Now you have only 32-bit regs, and you can use matching constraints to make it work. The truncate lshiftrt is the traditional way to write a mulX_highpart pattern. Some parts of the optimizer may recognize this construct and know how to handle it. For the second set, you might consider just using (mult:SI ...) if that gives the correct result, or you can use a subreg or whatever. The optimizer is unlikely to generate this pattern on its own, but you can have an expander and/or splitter that generates it.

Use zero_extend instead of sign_extend if this is an unsigned widening multiply.

You probably want to generate two SImode temporaries in the expander, and copy the input regs into the temporaries, as expanders aren't supposed to clobber input regs. If you want a 64-bit result out of this, then you would need extra instructions to combine x and y into a 64-bit output.

Another way to do this is to arbitrarily force the result into a register pair. Then you can use a subreg to match the high part or the low part of that register pair for the inputs:

  [(set (reg:DI x)
        (mult:DI (sign_extend:DI (subreg:SI (reg:DI x) 0))
                 (sign_extend:DI (subreg:SI (reg:DI x) 1))))]

The subreg numbers may be reversed if this is little word endian instead of big word endian. You might need extra setup instructions to create the register pair first. Create a DI temp for the output, move the inputs into the high/low word of the DI temp, and then you can do the multiply on the DI temp.

Jim
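
For readers who want to check the arithmetic that the first parallel describes, here is a minimal C sketch of it. This is only an illustrative model, not GCC code: the function names wumul_signed, wumul_unsigned, and wumul_selfcheck are hypothetical, and registers are modeled as raw 32-bit patterns. It also illustrates why a plain 32-bit multiply may be enough for the second set: the low 32 bits of the product are the same for the signed and unsigned variants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the sign_extend pattern. Both inputs are read
   before either register is overwritten, as in a parallel. */
static void wumul_signed(uint32_t *x, uint32_t *y)
{
    /* (mult:DI (sign_extend:DI x) (sign_extend:DI y)).
       The uint32_t -> int32_t casts assume two's-complement
       wraparound, which GCC guarantees. */
    int64_t prod = (int64_t)(int32_t)*x * (int64_t)(int32_t)*y;
    uint64_t bits = (uint64_t)prod;

    *x = (uint32_t)(bits >> 32);  /* truncate:SI (lshiftrt:DI ... 32) */
    *y = (uint32_t)bits;          /* truncate:SI of the full product  */
}

/* Hypothetical model of the zero_extend variant. */
static void wumul_unsigned(uint32_t *x, uint32_t *y)
{
    uint64_t prod = (uint64_t)*x * (uint64_t)*y;

    *x = (uint32_t)(prod >> 32);  /* high word */
    *y = (uint32_t)prod;          /* low word  */
}

/* Small self-check: the high words differ between the two variants,
   while the low word matches an ordinary 32-bit multiply. */
static void wumul_selfcheck(void)
{
    uint32_t sx = 0xFFFFFFFEu, sy = 3u;   /* -2 * 3 when signed */
    uint32_t ux = 0xFFFFFFFEu, uy = 3u;

    wumul_signed(&sx, &sy);
    wumul_unsigned(&ux, &uy);

    assert(sx == 0xFFFFFFFFu);                    /* high word of -6 */
    assert(ux == 2u);                             /* high word of 0x2FFFFFFFA */
    assert(sy == uy);                             /* low words agree */
    assert(sy == (uint32_t)(0xFFFFFFFEu * 3u));   /* same as a 32-bit multiply */
}
```

Calling wumul_selfcheck() from any test harness exercises both variants. The sketch says nothing about register allocation or constraints; it only mirrors the values the two sets compute.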