Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the neon intrinsics vabs intrinsics should behave as the hardware. That is to say, the pseudo-code sequence:
a = vabs_s8 (vdup_n_s8 (-128)); assert (a >= 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. We keep the standard pattern name around for the benefit of auto-vectorization. Tested on aarch64-none-elf with no issues. This will also be a bug on 4.9 (ugh), OK for trunk and gcc-4_9-branch? Thanks, James --- 2014-05-02 James Greenhalgh <james.greenha...@arm.com> * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin): Don't fold integer abs builtins. * config/aarch64/aarch64-simd-builtins.def (abs): Split by integer and floating point variants. * config/aarch64/aarch64-simd.md (aarch64_abs<mode>): New. * config/aarch64/iterators.md (unspec): Add UNSPEC_ABS.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index a301982..6d47c0b 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1153,7 +1153,7 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, switch (fcode) { - BUILTIN_VALLDI (UNOP, abs, 2) + BUILTIN_VDQF (UNOP, abs, 2) return fold_build1 (ABS_EXPR, type, args[0]); break; BUILTIN_VALLDI (BINOP, cmge, 0) diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 339e8f8..e2d1078 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -365,7 +365,8 @@ BUILTIN_VDQF (UNOP, frecpe, 0) BUILTIN_VDQF (BINOP, frecps, 0) - BUILTIN_VALLDI (UNOP, abs, 2) + BUILTIN_VDQ (UNOP, abs, 0) + BUILTIN_VDQF (UNOP, abs, 2) VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf) VAR1 (BINOP, float_truncate_hi_, 0, v4sf) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 108bc8d88931e67e6c7eeb77774a01bb391a1ced..acb75f5bd0c732d8e11d4a7b6b61f8b1e81d1960 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -390,6 +390,18 @@ (define_insn "aba<mode>_3" [(set_attr "type" "neon_arith_acc<q>")] ) +;; To mirror the behaviour of hardware, as required for arm_neon.h, we must +;; show an abundance of caution around the abs instruction. + +(define_insn "aarch64_abs<mode>" + [(set (match_operand:VDQ 0 "register_operand" "=w") + (unspec:VDQ [(match_operand:VDQ 1 "register_operand" "w")] + UNSPEC_ABS))] + "TARGET_SIMD" + "abs\t%0.<Vtype>, %1.<Vtype>" + [(set_attr "type" "neon_abs<q>")] +) + (define_insn "fabd<mode>_3" [(set (match_operand:VDQF 0 "register_operand" "=w") (abs:VDQF (minus:VDQF diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c537c3780eea95fa315c82bb36ac7f91f0f920fd..e45a1a11991a71ad37a8d5bb7c4ff81627671384 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -197,6 +197,7 @@ (define_c_enum "unspec" [ UNSPEC_ASHIFT_SIGNED ; Used in aarch-simd.md. UNSPEC_ASHIFT_UNSIGNED ; Used in aarch64-simd.md. + UNSPEC_ABS ; Used in aarch64-simd.md. UNSPEC_FMAX ; Used in aarch64-simd.md. UNSPEC_FMAXNMV ; Used in aarch64-simd.md. UNSPEC_FMAXV ; Used in aarch64-simd.md.