On Fri, Jun 06, 2025 at 03:52:12PM +0100, Richard Sandiford wrote: > Spencer Abson <spencer.ab...@arm.com> writes: > > @@ -8165,20 +8169,25 @@ > > ;; > > ;; For unpacked vectors, it doesn't really matter whether SEL uses the > > ;; the container size or the element size. If SEL used the container size, > > -;; it would ignore undefined bits of the predicate but would copy the > > -;; upper (undefined) bits of each container along with the defined bits. > > -;; If SEL used the element size, it would use undefined bits of the > > predicate > > -;; to select between undefined elements in each input vector. Thus the > > only > > -;; difference is whether the undefined bits in a container always come from > > -;; the same input as the defined bits, or whether the choice can vary > > +;; it would ignore bits of the predicate that can be undefined, but would > > copy > > +;; the upper (undefined) bits of each container along with the defined > > bits. > > +;; If SEL used the element size, it would use bits from the predicate that > > can > > +;; be undefined to select between undefined elements in each input vector. > > +;; Thus the only difference is whether the undefined bits in a container > > always > > +;; come from the same input as the defined bits, or whether the choice can > > vary > > It looks like the main change here is to replace "undefined bits of the > predicate" with "bits of the predicate that can be undefined". Could > you go into more detail about the distinction? It seems to be saying > that the upper bits in each predicate are sometimes defined and > sometimes not. > > As I see it, the "level of undefinedness" is really the same for the > predicates and data vectors. Within each container element, the bits > that hold the element value are defined/significant and the other bits > are undefined/insignificant/have arbitrary values. The same thing > goes for the upper bits in each predicate element: the low bit is > defined/significant and the other bits are undefined/insignificant/have > arbitrary values. They might by chance be zeros when zeros are > convenient or ones when ones are convenient, but the semantics don't > guarantee anything either way.
Yes, I agree. Sorry, my change was not very clear. What I was trying to reflect is that, for example, selecting between a pair of VNx4HF using VNx8BI is now a recognised insn. However, any bits of a VNx8BI that are not significant to a VNx4BI are don't-care wrt the result. I meant that they 'can be undefined' in that they are as good as undefined, for the purpose of SEL. Maybe a better change would have been to reword this paragraph in favour of 'don't-care' rather than 'undefined' when describing the upper bits of each predicate element? > > > ;; independently of the defined bits. > > ;; > > ;; For the other instructions, using the element size is more natural, > > ;; so we do that for SEL as well. > > +;; > > +;; The use of 'aarch64_predicate_operand' here is only to support the FP > > arithmetic/ > > +;; UNSPEC_SEL combiner patterns. As with those operations, any predicate > > bits that > > +;; are insignificant to the data mode will have no effect on the > > operation's result. > > Sorry for the formatting nit, but: long lines. The comment itself looks good > though. > > > +;; > > (define_insn "*vcond_mask_<mode><vpred>" > > [(set (match_operand:SVE_ALL 0 "register_operand") > > (unspec:SVE_ALL > > - [(match_operand:<VPRED> 3 "register_operand") > > + [(match_operand:<VPRED> 3 "aarch64_predicate_operand") > > (match_operand:SVE_ALL 1 "aarch64_sve_reg_or_dup_imm") > > (match_operand:SVE_ALL 2 "aarch64_simd_reg_or_zero")] > > UNSPEC_SEL))] > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > > index 287de0f5ae4..d38b108c5f4 100644 > > --- a/gcc/config/aarch64/aarch64.cc > > +++ b/gcc/config/aarch64/aarch64.cc > > @@ -3893,6 +3893,33 @@ aarch64_sve_fp_pred (machine_mode data_mode, rtx > > *strictness) > > return aarch64_ptrue_reg (aarch64_sve_pred_mode (data_mode)); > > } > > > > +/* If DATA_MODE is a partial vector mode, emit a sequence of insns to > > + zero-out the predicate bits of an existing natural GP, PRED, associated > > + with the undefined elements in each container > > This makes it sound unconditional, whereas it's really conditional on > flag_trapping_math. > > Also, I'd see this more as converting a container predicate to an > element predicate. So maybe: > > /* PRED is a predicate that governs an operation on DATA_MODE. If DATA_MODE > is a partial vector mode, and if exceptions must be suppressed for its > undefined elements, convert PRED from a container-level predicate to > an element-level predicate and ensure that the undefined elements > are inactive. > > But again, please suggest something else if you don't think that's > very clear. Yeah, happy to stick to the container/element size language. Thanks. > > > + > > + Return NULL_RTX if no insns were emitted. That is, if DATA_MODE is not > > + a partial vector mode, or if we don't need to prevent the operation from > > + interpreting undefined elements. Otherwise, return the new predicate. > > */ > > +rtx > > +aarch64_sve_emit_masked_fp_pred (machine_mode data_mode, rtx pred) > > +{ > > + unsigned int vec_flags = aarch64_classify_vector_mode (data_mode); > > + if (flag_trapping_math && (vec_flags & VEC_PARTIAL)) > > + { > > + /* Control the data as if the vector was fully packed. */ > > + rtx mask = aarch64_sve_packed_pred (data_mode); > > + machine_mode pmode = GET_MODE (mask); > > + > > + /* Mask the existing predicate. */ > > + rtx dst = gen_reg_rtx (pmode); > > + emit_insn (gen_and3 (pmode, dst, mask, > > + gen_lowpart (pmode, pred))); > > + return dst; > > + } > > + > > + return NULL_RTX; > > +} > > How about instead returning PRED in the fallthrough case, rather than > null? It seems like that would be more convenient for the callers > in this patch and in patch 14, and would also be more consistent > with aarch64_sve_fp_pred. Sure, that makes sense. I likely wanted to return something that easily indicates whether the function had side-effects or not. I'll give it some more thought, but happy with your suggestion too. > > As with the earlier patches, it would be good to have a token test > that SELs aren't dropped when they shouldn't be. > > BTW, since the series started with a lot of patches that were OK as-is, > or with minor changes, would you be ok with committing those and only > reposting the part of the series that might need a v2? Yeah, I'll leave some time for discussion and commit the ones with minor changes if that's OK. Thanks, Spencer > > Thanks, > Richard