Re: [PATCH] [AArch64, NEON] Improve vpmaxX & vpminX intrinsics

Yangfei (Felix) Tue, 13 Jan 2015 23:09:51 -0800

> On 09/12/14 08:17, Yangfei (Felix) wrote:
> >> On 28 November 2014 at 09:23, Yangfei (Felix) <felix.y...@huawei.com>
> wrote:
> >>> Hi,
> >>>    This patch converts vpmaxX & vpminX intrinsics to use builtin
> >>> functions
> >> instead of the previous inline assembly syntax.
> >>>    Regtested with aarch64-linux-gnu on QEMU.  Also passed the
> >>> glorious
> >> testsuite of Christophe Lyon.
> >>>    OK for the trunk?
> >>
> >> Hi Felix,   We know from experience that the advsimd intrinsics tend
> >> to be fragile for big endian and in general it is fairly easy to
> >> break the big endian case.  For these advsimd improvements that you
> >> are working on (that we very much appreciate) it is important to run
> >> both little endian and big endian regressions.
> >>
> >> Thanks
> >> /Marcus
> >
> >
> > Okay.  Any plan for the advsimd big-endian improvement?
> > I rebased this patch over Alan Lawrance's patch:
> > https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00279.html
> > No regressions for aarch64_be-linux-gnu target too.  OK for the trunk?
> >
> >
> > Index: gcc/ChangeLog
> >
> =============================================================
> ======
> > --- gcc/ChangeLog       (revision 218464)
> > +++ gcc/ChangeLog       (working copy)
> > @@ -1,3 +1,18 @@
> > +2014-12-09  Felix Yang  <felix.y...@huawei.com>
> > +
> > +       * config/aarch64/aarch64-simd.md
> (aarch64_<maxmin_uns>p<mode>): New
> > +       pattern.
> > +       * config/aarch64/aarch64-simd-builtins.def (smaxp, sminp, umaxp,
> > +       uminp, smax_nanp, smin_nanp): New builtins.
> > +       * config/aarch64/arm_neon.h (vpmax_s8, vpmax_s16, vpmax_s32,
> > +       vpmax_u8, vpmax_u16, vpmax_u32, vpmaxq_s8, vpmaxq_s16,
> vpmaxq_s32,
> > +       vpmaxq_u8, vpmaxq_u16, vpmaxq_u32, vpmax_f32, vpmaxq_f32,
> vpmaxq_f64,
> > +       vpmaxqd_f64, vpmaxs_f32, vpmaxnm_f32, vpmaxnmq_f32,
> vpmaxnmq_f64,
> > +       vpmaxnmqd_f64, vpmaxnms_f32, vpmin_s8, vpmin_s16, vpmin_s32,
> vpmin_u8,
> > +       vpmin_u16, vpmin_u32, vpminq_s8, vpminq_s16, vpminq_s32,
> vpminq_u8,
> > +       vpminq_u16, vpminq_u32, vpmin_f32, vpminq_f32, vpminq_f64,
> vpminqd_f64,
> > +       vpmins_f32, vpminnm_f32, vpminnmq_f32, vpminnmq_f64,
> > + vpminnmqd_f64,
> > +
> 
> 
> >   __extension__ static __inline float32x2_t __attribute__
> > ((__always_inline__))
> > Index: gcc/config/aarch64/aarch64-simd.md
> >
> =============================================================
> ======
> > --- gcc/config/aarch64/aarch64-simd.md  (revision 218464)
> > +++ gcc/config/aarch64/aarch64-simd.md  (working copy)
> > @@ -1017,6 +1017,28 @@
> >     DONE;
> >   })
> >
> > +;; Pairwise Integer Max/Min operations.
> > +(define_insn "aarch64_<maxmin_uns>p<mode>"
> > + [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
> > +       (unspec:VDQ_BHSI [(match_operand:VDQ_BHSI 1
> "register_operand" "w")
> > +                        (match_operand:VDQ_BHSI 2 "register_operand"
> "w")]
> > +                       MAXMINV))]
> > + "TARGET_SIMD"
> > + "<maxmin_uns_op>p\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
> > +  [(set_attr "type" "neon_minmax<q>")]
> > +)
> > +
> 
> Hi Felix,
> 
> Sorry for the delay in getting back to you on this.
> 
> If you've rolled aarch64_reduc_<maxmin_uns>_internalv2si into the above
> pattern, do you still need it? For all its call points, just point them to
> aarch64_<maxmin_uns>p<mode>?
> 
> Thanks,
> Tejas.
>



Hello Tejas,

  I haven't done this yet.
  Currently aarch64_reduc_<maxmin_uns>_internalv2si is only called by
reduc_<maxmin_uns>_scal_<mode>.
  I found this somewhat troublesome to handle because the caller pattern uses
iterators.
  Are you going to rework this part?
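
For reference, here is a minimal scalar sketch of what the pairwise form computes for two 2-lane vectors, and of how a two-lane max reduction could be expressed by passing the same vector as both operands. The names v2si_model, max_s32, pmax_model and reduce_max_model are hypothetical and are not taken from the patch or from arm_neon.h. Only the signed 2x32-bit max case is modelled, and this is an illustration rather than the actual implementation.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 2 x 32-bit signed vector. */
typedef struct { int32_t lane[2]; } v2si_model;

/* Plain signed maximum of two scalars. */
static int32_t max_s32(int32_t x, int32_t y)
{
  return x > y ? x : y;
}

/* Model of a pairwise max: adjacent lanes of the first operand produce
   result lane 0, adjacent lanes of the second produce result lane 1.  */
static v2si_model pmax_model(v2si_model a, v2si_model b)
{
  v2si_model r;
  r.lane[0] = max_s32(a.lane[0], a.lane[1]);
  r.lane[1] = max_s32(b.lane[0], b.lane[1]);
  return r;
}

/* Assumed shape of a 2-lane reduction built on the pairwise form:
   use the same vector for both operands and read lane 0.  */
static int32_t reduce_max_model(v2si_model a)
{
  return pmax_model(a, a).lane[0];
}
```

With both operands equal, each result lane holds the maximum of the input, which is why a single pairwise pattern might in principle serve the V2SI reduction as well.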
