https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82418
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org

--- Comment #6 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(the 'divx' function in comment 5 does not implement division by 100)

I'd like to see GCC improve here, so I looked at how this could be fixed.

I'm afraid adjusting expand_divmod to select the cheaper alternative on x86 is
going to be too complicated. I think it may be reasonable to conceal the 32x32
mul-highpart pattern on x86 from expand_divmod, so it uses the 32x32->64
widening multiply which leads to optimal code. I also think the 32x32
mul-highpart pattern is not very useful outside of magic division by constants,
so concealing it altogether may be acceptable if no better solution is
available.

(to recap, we want 64-bit imul here rather than 32-bit widening mul with result
in edx:eax, because imul has better latency and throughput, less regalloc
constraints, and doesn't need a register to hold the immediate)

Patch I'm testing to disallow 32x32 mul-highpart on 64-bit x86:

--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1042,6 +1042,10 @@ (define_mode_iterator SWIM248 [(HI "TARGET_HIMODE_MATH")
 (define_mode_iterator DWI [(DI "!TARGET_64BIT")
                            (TI "TARGET_64BIT")])
 
+;; Widest single word integer modes.
+(define_mode_iterator SWI48W [(SI "!TARGET_64BIT")
+                              (DI "TARGET_64BIT")])
+
 ;; GET_MODE_SIZE for selected modes.  As GET_MODE_SIZE is not
 ;; compile time constant, it is faster to use <MODE_SIZE> than
 ;; GET_MODE_SIZE (<MODE>mode).  For XFmode which depends on
@@ -7792,16 +7796,16 @@ (define_insn "*<u>mulqihi3_1"
    (set_attr "mode" "QI")])
 
 (define_expand "<s>mul<mode>3_highpart"
-  [(parallel [(set (match_operand:SWI48 0 "register_operand")
-                   (truncate:SWI48
+  [(parallel [(set (match_operand:SWI48W 0 "register_operand")
+                   (truncate:SWI48W
                      (lshiftrt:<DWI>
                        (mult:<DWI>
                          (any_extend:<DWI>
-                           (match_operand:SWI48 1 "nonimmediate_operand"))
+                           (match_operand:SWI48W 1 "nonimmediate_operand"))
                          (any_extend:<DWI>
-                           (match_operand:SWI48 2 "register_operand")))
+                           (match_operand:SWI48W 2 "register_operand")))
                        (match_dup 3))))
-              (clobber (match_scratch:SWI48 4))
+              (clobber (match_scratch:SWI48W 4))
               (clobber (reg:CC FLAGS_REG))])]
   ""
   "operands[3] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode));")
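
For readers who want the two expansions side by side, here is a minimal C sketch of unsigned 32-bit division by 100, the case this report is about. The function names are hypothetical and appear neither in GCC nor in this report. The magic constant 1374389535 is ceil(2^37 / 100), which is exact for every 32-bit unsigned input. Which instructions a given GCC version emits for either form is not claimed here.

```c
#include <stdint.h>

/* Magic multiplier for unsigned division by 100: ceil(2^37 / 100). */
#define DIV100_MAGIC UINT64_C(1374389535)

/* Widening form: one 32x32->64 multiply, then a single shift by 37.
   On 64-bit x86 this can map onto a 64-bit imul by an immediate
   followed by one shift, the shape the comment calls optimal. */
static uint32_t div100_widening(uint32_t x)
{
    uint64_t product = (uint64_t)x * DIV100_MAGIC;
    return (uint32_t)(product >> 37);
}

/* Highpart form: take the upper 32 bits of the product first
   (what a 32x32 mul-highpart pattern yields), then shift the
   remaining 5 bits. Same result, different shape. */
static uint32_t div100_highpart(uint32_t x)
{
    uint32_t hi = (uint32_t)(((uint64_t)x * DIV100_MAGIC) >> 32);
    return hi >> 5;
}
```

Both functions return the same value as x / 100. They differ only in whether the high half of the product is extracted as a separate step, which is the choice expand_divmod makes based on the patterns the target exposes.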