On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up"
> patch. The revision below incorporates the majority of your feedback, but
> with one or two exceptions (required to allow the patch to bootstrap) that
> I thought I'd double check with you before pushing.
>
> Firstly, great catch that we no longer need to test
> rtx_equal_p (operands[0], operands[1]) when moving a splitter from before
> reload to after reload, as this is guaranteed by the "0" constraints.
> I've cleaned this up in all the doubleword splitters (including the
> <any_or> case that's now moved). Also, as you've suggested, this patch uses
> a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split
> post-reload) and the other for !TARGET_BMI (that's lowered rather than
> split, pre-reload/post-STV).
>
> Unfortunately, the "force_reg of tricky immediate constants" checks really
> are required for these expanders. I agree normally the predicate is
> checked/guaranteed for a define_insn, but in this case the gen_iordi3
> function and related expanders are frequently called directly by the
> middle-end or from i386-expand, which bypasses the checks made by the later
> RTL passes. When given arbitrary immediate constants, this results in ICEs
> from insns not matching their predicates soon after expand (breaking
> bootstrap). It's only "standard name" expanders that require this
> treatment; define_insn{_and_split} templates do enforce their predicates.
>
> And finally, we can't/shouldn't use <general_szext_operand> in the actual
> doubleword splitters, as the mode being iterated over is DWIH (not DWI),
> where we require the predicate for the corresponding <DWI> mode. It turns
> out that it's always appropriate to use x86_64_hilo_general_operand wherever
> we use the "r<di>" constraint, and that's used consistently in this patch.
>
> I hope these exceptions are acceptable. The attached revised patch has
> been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check,
> both with and without --target_board=unix{-m32}, with no new failures.
> Are these revisions OK for mainline?

Thanks for your explanation of the particularities of the patch!

Yes, the patch is OK.

Thanks,
Uros.

>
> 2022-06-30  Roger Sayle  <ro...@nextmovesoftware.com>
>             Uroš Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (general_szext_operand): Add TImode
>         support using x86_64_hilo_general_operand predicate.
>         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
>         (*add<dwi>3_doubleword): Improved optimization of zero addition.
>         (and<mode>3): Use SDWIM mode iterator to add support for double
>         word bit-wise AND in TImode.  Use force_reg when double word
>         immediate operand isn't x86_64_hilo_general_operand.
>         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
>         converted into a post-reload splitter.
>         (*andndi3_doubleword): Old define_insn deleted.
>         (*andn<mode>3_doubleword_bmi): New define_insn_and_split for
>         TARGET_BMI that splits post-reload.
>         (*andn<mode>3_doubleword): New define_insn_and_split for
>         !TARGET_BMI, that lowers/splits before reload.
>         (<any_or><mode>3): Use SDWIM mode iterator to add support for
>         double word bit-wise XOR and bit-wise IOR in TImode.  Use
>         force_reg when double word immediate operand isn't
>         x86_64_hilo_general_operand.
>         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
>         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
>         double word bit-wise NOT in TImode.
>         (one_cmpl<dwi>2_doubleword): Generalized from one_cmpldi2_doubleword
>         and converted into a post-reload splitter.
>
>
> Thanks again,
> Roger
> --
>
> > -----Original Message-----
> > From: Uros Bizjak <ubiz...@gmail.com>
> > Sent: 28 June 2022 16:38
> > To: Roger Sayle <ro...@nextmovesoftware.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> >
> > On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle <ro...@nextmovesoftware.com>
> > wrote:
> > >
> > >
> > > Hi Uros,
> > > As you've requested/suggested, here's a patch that tidies up and
> > > unifies doubleword handling in i386.md; converting all doubleword
> > > splitters for logic operations to post-reload form, generalizing their
> > > define_insn_and_split templates to <dwi> form (supporting TARGET_64BIT
> > > ? TImode : DImode), and where required tweaking the corresponding
> > > expanders to use SDWIM to support TImode doubleword operations. These
> > > changes incorporate your feedback from
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > > where I included many/several of these clean-ups, in a patch to add a
> > > new optimization. I agree, it's better to split these out (this
> > > patch), and I'll resubmit the (smaller) optimization patch as a
> > > follow-up.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures. Ok for mainline?
> > >
> > >
> > > 2022-06-28  Roger Sayle  <ro...@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         * config/i386/i386.md (general_szext_operand): Add TImode
> > >         support using x86_64_hilo_general_operand predicate.
> > >         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
> > >         (*add<dwi>3_doubleword): Improved optimization of zero addition.
> > >         (and<mode>3): Use SDWIM mode iterator to add support for double
> > >         word bit-wise AND in TImode.  Use force_reg when double word
> > >         immediate operand isn't x86_64_hilo_general_operand.
> > >         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
> > >         converted into a post-reload splitter.
> > >         (*andn<mode>3_doubleword): Generalized from *andndi3_doubleword.
> > >         (define_split): Generalize DImode splitters for andn to <DWI>.
> > >         One splitter for TARGET_BMI, the other for !TARGET_BMI.
> > >         (<any_or><mode>3): Use SDWIM mode iterator to add support for
> > >         double word bit-wise XOR and bit-wise IOR in TImode.  Use
> > >         force_reg when double word immediate operand isn't
> > >         x86_64_hilo_general_operand.
> > >         (*<any_or>di3_doubleword): Generalized from
> > >         <any_or>di3_doubleword.
> > >         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
> > >         double word bit-wise NOT in TImode.
> > >         (one_cmpl<dwi>2_doubleword): Generalized from
> > >         one_cmpldi2_doubleword and converted into a post-reload splitter.
> > >
> >
> >  (define_expand "and<mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -        (and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -                       (match_operand:SWIM1248x 2 "<general_szext_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +        (and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +                   (match_operand:SDWIM 2 "<general_szext_operand>")))]
> >    ""
> >  {
> >    machine_mode mode = <MODE>mode;
> >
> > -  if (<MODE>mode == DImode && !TARGET_64BIT)
> > -    ;
> > -  else if (const_int_operand (operands[2], <MODE>mode)
> > -           && register_operand (operands[0], <MODE>mode)
> > -           && !(TARGET_ZERO_EXTEND_WITH_AND
> > -                && optimize_function_for_speed_p (cfun)))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > You don't have to do that - when the predicate can't be satisfied, the
> > middle-end pushes the value to a register as a last resort by default.
> >
> > +  bool emit_insn_deleted_note_p = false;
> > +
> > +  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0],
> > +                     &operands[3]);
> >
> >    if (operands[2] == const0_rtx)
> >      emit_move_insn (operands[0], const0_rtx);
> >    else if (operands[2] == constm1_rtx)
> > -    emit_move_insn (operands[0], operands[1]);
> > +    {
> > +      if (!rtx_equal_p (operands[0], operands[1]))
> > +        emit_move_insn (operands[0], operands[1]);
> > +      else
> > +        emit_insn_deleted_note_p = true;
> > +    }
> >
> > Please note that when operands[2] is an immediate, constraints after reload
> > *guarantee* that operands[1] matches operands[0]. So, the insn should always
> > be deleted (I think that this functionality was in your <any_or> patch - it
> > is unneeded there, too).
> >
> > +(define_insn "*andn<mode>3_doubleword"
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +        (and:DWI
> > +          (not:DWI (match_operand:DWI 1 "register_operand"))
> > +          (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> > -   && ix86_pre_reload_split ()"
> > +  "ix86_pre_reload_split ()"
> >    "#")
> >
> > Please introduce two ANDN double-word insn-and-split patterns, one for BMI
> > and one for !BMI. The one for BMI should be moved to a post-reload splitter,
> > too. As we figured out, *all* double-word patterns should either be of
> > pre-reload or of post-reload type.
> >
> >  (define_split
> > -  [(set (match_operand:DI 0 "register_operand")
> > -        (and:DI
> > -          (not:DI (match_operand:DI 1 "register_operand"))
> > -          (match_operand:DI 2 "nonimmediate_operand")))
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +        (and:DWI
> > +          (not:DWI (match_operand:DWI 1 "register_operand"))
> > +          (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
> > +  "!TARGET_BMI
> >
> > Without BMI, the ANDN should be split to a double-word NOT + AND before
> > reload (and these two insns are split to single-word operations after
> > reload). This simplifies splitting logic quite a bit.
> >
> >  (define_expand "<code><mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -        (any_or:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -                          (match_operand:SWIM1248x 2 "<general_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +        (any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +                      (match_operand:SDWIM 2 "<general_operand>")))]
> >
> > Use <general_szext_operand> here ...
> >
> >    ""
> > -  "ix86_expand_binary_operator (<CODE>, <MODE>mode, operands); DONE;")
> > +{
> >
> > -(define_insn_and_split "*<code>di3_doubleword"
> > -  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
> > -        (any_or:DI
> > -         (match_operand:DI 1 "nonimmediate_operand" "0,0")
> > -         (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > ... to avoid the above fixup.
> >
> > +(define_insn_and_split "*<code><mode>3_doubleword"
> > +  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
> > +        (any_or:<DWI>
> > +         (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
> > +         (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
> >
> > <general_szext_operand> for consistency.
> >
> > Otherwise OK.
> >
> > Uros.