> On 09/12/14 08:17, Yangfei (Felix) wrote: > >> On 28 November 2014 at 09:23, Yangfei (Felix) <felix.y...@huawei.com> > wrote: > >>> Hi, > >>> This patch converts vpmaxX & vpminX intrinsics to use builtin > >>> functions > >> instead of the previous inline assembly syntax. > >>> Regtested with aarch64-linux-gnu on QEMU. Also passed the > >>> glorious > >> testsuite of Christophe Lyon. > >>> OK for the trunk? > >> > >> Hi Felix, We know from experience that the advsimd intrinsics tend > >> to be fragile for big endian and in general it is fairly easy to > >> break the big endian case. For these advsimd improvements that you > >> are working on (that we very much appreciate) it is important to run > >> both little endian and big endian regressions. > >> > >> Thanks > >> /Marcus > > > > > > Okay. Any plan for the advsimd big-endian improvement? > > I rebased this patch over Alan Lawrance's patch: > > https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00279.html > > No regressions for aarch64_be-linux-gnu target too. OK for the thunk? > > > > > > Index: gcc/ChangeLog > > > ============================================================= > ====== > > --- gcc/ChangeLog (revision 218464) > > +++ gcc/ChangeLog (working copy) > > @@ -1,3 +1,18 @@ > > +2014-12-09 Felix Yang <felix.y...@huawei.com> > > + > > + * config/aarch64/aarch64-simd.md > (aarch64_<maxmin_uns>p<mode>): New > > + pattern. > > + * config/aarch64/aarch64-simd-builtins.def (smaxp, sminp, umaxp, > > + uminp, smax_nanp, smin_nanp): New builtins. > > + * config/aarch64/arm_neon.h (vpmax_s8, vpmax_s16, vpmax_s32, > > + vpmax_u8, vpmax_u16, vpmax_u32, vpmaxq_s8, vpmaxq_s16, > vpmaxq_s32, > > + vpmaxq_u8, vpmaxq_u16, vpmaxq_u32, vpmax_f32, vpmaxq_f32, > vpmaxq_f64, > > + vpmaxqd_f64, vpmaxs_f32, vpmaxnm_f32, vpmaxnmq_f32, > vpmaxnmq_f64, > > + vpmaxnmqd_f64, vpmaxnms_f32, vpmin_s8, vpmin_s16, vpmin_s32, > vpmin_u8, > > + vpmin_u16, vpmin_u32, vpminq_s8, vpminq_s16, vpminq_s32, > vpminq_u8, > > + vpminq_u16, vpminq_u32, vpmin_f32, vpminq_f32, vpminq_f64, > vpminqd_f64, > > + vpmins_f32, vpminnm_f32, vpminnmq_f32, vpminnmq_f64, > > + vpminnmqd_f64, > > + > > > > __extension__ static __inline float32x2_t __attribute__ > > ((__always_inline__)) > > Index: gcc/config/aarch64/aarch64-simd.md > > > ============================================================= > ====== > > --- gcc/config/aarch64/aarch64-simd.md (revision 218464) > > +++ gcc/config/aarch64/aarch64-simd.md (working copy) > > @@ -1017,6 +1017,28 @@ > > DONE; > > }) > > > > +;; Pairwise Integer Max/Min operations. > > +(define_insn "aarch64_<maxmin_uns>p<mode>" > > + [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w") > > + (unspec:VDQ_BHSI [(match_operand:VDQ_BHSI 1 > "register_operand" "w") > > + (match_operand:VDQ_BHSI 2 "register_operand" > "w")] > > + MAXMINV))] > > + "TARGET_SIMD" > > + "<maxmin_uns_op>p\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>" > > + [(set_attr "type" "neon_minmax<q>")] > > +) > > + > > Hi Felix, > > Sorry for the delay in getting back to you on this. > > If you've rolled aarch64_reduc_<maxmin_uns>_internalv2si into the above > pattern, do you still need it? For all its call points, just point them to > aarch64_<maxmin_uns>p<mode>? > > Thanks, > Tejas. >
Hello Tejas, I didn't do this yet. Currently the aarch64_reduc_<maxmin_uns>_internalv2si is only called by reduc_<maxmin_uns>_scal_<mode>. I find it kind of trouble to handle this due to the use of iterators in the caller pattern. Are you going to rework this part?