On Fri, May 10, 2019 at 9:42 AM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Fri, May 10, 2019 at 9:25 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Fri, May 10, 2019 at 9:10 AM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> > >
> > > On Fri, May 10, 2019 at 12:44 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > >> Even SHIFT_COUNT_TRUNCATED is not a perfect match for what our
> > > > >> hardware does, because we always consider only the last 6 bits of
> > > > >> a shift operand.
> > > > >>
> > > > >> Despite all the warnings in the other backends, most notably
> > > > >> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> > > > >> wrote the attached tentative patch.  It's a little ad-hoc: it uses
> > > > >> the SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0
> > > > >> and, instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it
> > > > >> applies the mask returned by shift_truncation_mask.  Doing so, usage
> > > > >> of both "methods" actually reduces to a single way.
> > > > >
> > > > > The main reason it's discouraged is that some targets have insns
> > > > > where the count would be truncated and others where it would not.
> > > > > So, for example, normal shifts might truncate but vector shifts
> > > > > might not (mips), or shifts might truncate but bit tests do not
> > > > > (x86).
> > > >
> > > > Bit tests on x86 also truncate [1] if the bit base operand specifies
> > > > a register, and we don't use BT with a memory location as a bit base.
> > > > I don't know what is referred to with "(real or pretended) bit-field
> > > > operations" in the documentation for SHIFT_COUNT_TRUNCATED:
> > > >
> > > >   However, on some machines, such as the 80386 and the 680x0,
> > > >   truncation only applies to shift operations and not the (real or
> > > >   pretended) bit-field operations.
> > > > Vector shifts don't truncate on x86, so x86 probably shares the same
> > > > destiny with MIPS.  Maybe a machine mode argument could be passed to
> > > > SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
> > > > that don't.
> > >
> > > But IL semantic differences based on mode are even worse.  What happens
> > > if STV then substitutes a vector register/op but you previously
> > > optimized with the assumption that the count would be truncated, since
> > > the shift was SImode?
> >
> > I have removed support for STV shifts with a dynamic count precisely
> > because of the issue you mentioned.  It is not possible to substitute a
> > truncating shift with a non-truncating one in STV, so there is IMO no
> > problem with the proposed idea.
> >
> > BTW: We do implement the removal of useless count argument masking (for
> > shifts and bit test insns) with several define_insn_and_split patterns
> > that allow combine to create a compound insn (please grep for "Avoid
> > useless masking" in i386.md).  However, generic handling would probably
> > be more effective, since the implemented approach doesn't remove masking
> > in cases where the masked argument is used by several shift
> > instructions.
>
> But that's more a combine limitation than a reason for going with the
> "hidden" IL semantic change.  But yes, if the AND is used by non-masking
> insns then it's likely cheap enough to retain it.
>
> If the masking were always in place (combined with the shift if a
> suitable insn exists) then STV handling should be possible; it would
> just need to split the insn to do the masking and then the shift (of
> course that might not be very profitable).
Unfortunately, STV substitutes a register in one go, so if we have

    or  $cnt, $0x12345
    shl $reg, $cnt

the sequence gets converted to:

    <load $xmm_imm with $0x12345>
    por  $xmm_cnt, $xmm_imm
    psll $xmm_reg, $xmm_cnt

which is not the same; we would need to mask $xmm_cnt between the insns.

Uros.

> Richard.
>
> > Uros.
> >
> > > IMHO a recipe for disaster.
> > >
> > > Richard.
> > >
> > > > [1] https://www.felixcloutier.com/x86/bt
> > > >
> > > > Uros.