Hi Tim, comments inline.
================
Comment at: lib/CodeGen/CGBuiltin.cpp:3089-3092
@@ -3077,7 +3088,6 @@
return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vshll", 1);
case ARM::BI__builtin_neon_vshl_n_v:
case ARM::BI__builtin_neon_vshlq_n_v:
- Ops[1] = EmitNeonShiftVector(Ops[1], Ty, false);
- return Builder.CreateShl(Builder.CreateBitCast(Ops[0],Ty), Ops[1],
- "vshl_n");
+ Int = usgn ? Intrinsic::arm_neon_vshiftu : Intrinsic::arm_neon_vshifts;
+ return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vshl_n", 1, false);
case ARM::BI__builtin_neon_vshrn_n_v:
----------------
Tim Northover wrote:
> All the documentation I can find says that vshl_n intrinsics take an integer
> from [0,B-1], which is in range for shift.
You're right, but for consistency, I eliminated all shifts instruction on
vectors. I can change that though.
================
Comment at: lib/CodeGen/CGBuiltin.cpp:3096-3099
@@ -3085,10 +3095,6 @@
Ops, "vshrn_n", 1, true);
case ARM::BI__builtin_neon_vshr_n_v:
case ARM::BI__builtin_neon_vshrq_n_v:
- Ops[0] = Builder.CreateBitCast(Ops[0], Ty);
- Ops[1] = EmitNeonShiftVector(Ops[1], Ty, false);
- if (usgn)
- return Builder.CreateLShr(Ops[0], Ops[1], "vshr_n");
- else
- return Builder.CreateAShr(Ops[0], Ops[1], "vshr_n");
+ Int = usgn ? Intrinsic::arm_neon_vshiftu : Intrinsic::arm_neon_vshifts;
+ return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vshr_n", 1, true);
case ARM::BI__builtin_neon_vsri_n_v:
----------------
Tim Northover wrote:
> The right shifts are more problematic. But the intrinsics again seem to be
> limited to [1, B] (and the value is known at compile-time).
>
> Couldn't we detect the maximum value and emit 0 for an unsigned shift and
> "icmp lt %val, 0" for a signed shift?
That's something I considered, but then how can we deal with VSRA?
With this solution, we can match `add (vshr ...)` for VSRA. It seems awkward to
add patterns to match the icmp/0 form, and I suppose we still want a VSRA
instruction to be emitted instead of a comparison and a vadd, don't we?
Of course we can always add an intrinsic for VSRA/VRSRA (which was already
emitting vshift instead of {a,l}shr btw).
http://llvm-reviews.chandlerc.com/D1819
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits