On Fri, May 15, 2026 at 3:46 AM Liu, Hongtao <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Roger Sayle <[email protected]>
> > Sent: Friday, May 15, 2026 5:23 AM
> > To: 'GCC Patches' <[email protected]>
> > Cc: 'Hongtao Liu' <[email protected]>; Liu, Hongtao
> > <[email protected]>; 'Uros Bizjak' <[email protected]>
> > Subject: [PATCH] Improve vector increment/decrement on x86.
> >
> > This patch improves the code generated by the i386 backend for
> > incrementing (adding one to) and decrementing (subtracting one from)
> > a vector.  With SSE materializing the vector -1 is more efficient
> > than materializing the vector +1, hence x + 1 (increment) is better
> > expressed as x - (-1), and x - 1 (decrement) is better expressed as
> > x + (-1).  Conveniently the relevant additions and subtractions are
> > specified as a single pattern, using a plusminus iterator, in the
> > machine description.
>
> Can we add pre_reload define_insn_and_split for them,
>
> (set (reg:V16QI 100 [ _2 ])
>     (minus:V16QI (reg:V16QI 107 [ x ])
>         (const_vector:V16QI [
>                 (const_int 1 [0x1]) repeated x16
>             ])))
>
> Theoretically, it should be able to capture more optimization opportunities
> (if vector +/-1 is only exposed through RTL optimization)

IMO in this case, the expander is a better solution, because the
expander generates constants that allow an optimized implementation
(vpcmpeqd). Later, the optimizers can do their magic with this
optimized insn.

+  if (<CODE> == PLUS)
+    insn = gen_sub<mode>3 (operands[0], operands[1], operands[2]);
+  else
+    insn = gen_add<mode>3 (operands[0], operands[1], operands[2]);
+  emit_insn (insn);

Please rather add a code attribute for the inverse operation (we
already have an example with counter rotate):

;; Inverse instruction base name
(define_code_attr inv_insn [(plus "sub") (minus "add")])

You then just use:

emit_insn (gen_<inv_insn><mode>3 ( ... );

+      DONE;
+    }

Please put some vertical space here ...

+  if (CONST_VECTOR_P (operands[2]))
+    operands[2] = force_reg (<MODE>mode, operands[2]);

... and here.

+  ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);
+})

Uros.
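
For illustration only, and not part of the patch: a minimal C sketch of the identity the patch description relies on, namely that x + 1 equals x - (-1) and x - 1 equals x + (-1) in each lane. It assumes the GCC/Clang vector extension; the names v16qi, all_ones, inc_via_sub, dec_via_add and rewrites_match are made up for this sketch. The x == x comparison merely stands in for a pcmpeq-style all-ones constant; which instructions a compiler actually emits for it is not something this sketch guarantees.

```c
#include <string.h>

/* Hypothetical type for this sketch: 16 unsigned 8-bit lanes,
   roughly what the backend calls V16QI.  */
typedef unsigned char v16qi __attribute__ ((vector_size (16)));

/* Comparing a vector with itself sets every lane to all-ones,
   i.e. -1 (0xff) per lane.  */
static v16qi
all_ones (v16qi x)
{
  return (v16qi) (x == x);
}

/* Increment expressed as x - (-1).  */
static v16qi
inc_via_sub (v16qi x)
{
  return x - all_ones (x);
}

/* Decrement expressed as x + (-1).  */
static v16qi
dec_via_add (v16qi x)
{
  return x + all_ones (x);
}

/* Return 1 if both rewrites agree with plain x + 1 and x - 1
   for every possible byte value, 0 otherwise.  */
static int
rewrites_match (void)
{
  for (int base = 0; base < 256; base += 16)
    {
      v16qi x = { 0 }, one = { 0 };
      for (int i = 0; i < 16; i++)
        {
          x[i] = (unsigned char) (base + i);
          one[i] = 1;
        }

      v16qi inc = x + one, dec = x - one;
      v16qi a = inc_via_sub (x), b = dec_via_add (x);

      if (memcmp (&a, &inc, sizeof a) != 0
          || memcmp (&b, &dec, sizeof b) != 0)
        return 0;
    }
  return 1;
}
```

Lane arithmetic wraps modulo 256 here, so the identities also hold at the boundary values 0 and 255.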
