[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
inouehrs added inline comments. Comment at: test/CodeGen/builtins-ppc-error.c:23 +void testXXPERMDI(void) { + int index = 5; + vec_xxpermdi(vsi); //expected-error {{too few arguments to function call, expected at least 3, have 1}} I am not sure we can assure that clang always do a constant propagation to resolve `index` as a compile time constant. But it seems that an existing test case above already assumes clang does it. IMO, `const unsigned index = 5;` is a little better. https://reviews.llvm.org/D33053 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
nemanjai added a comment. Other than the few minor comments, this LGTM. Comment at: lib/CodeGen/CGBuiltin.cpp:8458 +if (getTarget().isLittleEndian()) { + ElemIdx0 = (~Index & 1) + 2; + ElemIdx1 = (~Index & 2) >> 1; Minor nit: please add a comment explaining the expressions. For example: ``` // Element zero comes from the first input vector and element one comes from the // second. The element indices within each vector are numbered in big-endian // order so the shuffle mask must be adjusted for this on little endian // platforms (i.e. index is complemented and source vector reversed). ``` Comment at: lib/Sema/SemaChecking.cpp:3944 +QualType Arg21ElemTy = Arg2Ty->getAs()->getElementType(); +if (Arg1ElemTy != Arg21ElemTy) + return Diag(BuiltinLoc, diag::err_vec_builtin_incompatible_vector) It seems strange that we're comparing argument types above and element types of argument vectors here. Can we not just use `hasSameUnqualifiedType` like the Sema checks on `__builtin_shufflevector` do? Comment at: test/CodeGen/builtins-ppc-error.c:27 + vec_xxpermdi(vsi, vsi, index); //expected-error {{argument 3 to '__builtin_vsx_xxpermdi' must be a 2-bit unsigned literal (i.e. 0,1,2 or 3)}} + vec_xxpermdi(vsi, vuc, 2); //expected-error {{first two arguments to '__builtin_vsx_xxpermdi' must have the same type}} +} Add a test for non-vector arguments. Perhaps something like: `vec_xxpermdi(1, 2, 3);` https://reviews.llvm.org/D33053 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
jtony updated this revision to Diff 99292. jtony added a comment. Address all the comments from Nemanja. https://reviews.llvm.org/D33053 Files: include/clang/Basic/BuiltinsPPC.def include/clang/Basic/DiagnosticSemaKinds.td include/clang/Sema/Sema.h lib/CodeGen/CGBuiltin.cpp lib/Headers/altivec.h lib/Sema/SemaChecking.cpp test/CodeGen/builtins-ppc-error.c test/CodeGen/builtins-ppc-vsx.c Index: test/CodeGen/builtins-ppc-vsx.c === --- test/CodeGen/builtins-ppc-vsx.c +++ test/CodeGen/builtins-ppc-vsx.c @@ -1691,4 +1691,60 @@ res_vd = vec_neg(vd); // CHECK: fsub <2 x double> , {{%[0-9]+}} // CHECK-LE: fsub <2 x double> , {{%[0-9]+}} + +res_vd = vec_xxpermdi(vd, vd, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vf = vec_xxpermdi(vf, vf, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsll = vec_xxpermdi(vsll, vsll, 2); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vull = vec_xxpermdi(vull, vull, 3); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsi = vec_xxpermdi(vsi, vsi, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vui = vec_xxpermdi(vui, vui, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vss = vec_xxpermdi(vss, vss, 2); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vus = vec_xxpermdi(vus, vus, 3); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsc = vec_xxpermdi(vsc, vsc, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vuc = vec_xxpermdi(vuc, vuc, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +} + +// The return type of the call expression may be different from the return type of the shufflevector. +// Wrong implementation could crash the compiler, add this test case to check that and avoid ICE. +vector int should_not_assert(vector int a, vector int b) { + return vec_xxpermdi(a, b, 0); +// CHECK-LABEL: should_not_assert +// CHECK: bitcast <4 x i32> %{{[0-9]+}} to <2 x i64> +// CHECK-NEXT: bitcast <4 x i32> %{{[0-9]+}} to <2 x i64> +// CHECK-NEXT: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-NEXT: bitcast <2 x i64> %{{[0-9]+}} to <4 x i32> + +// CHECK-LE: bitcast <4 x i32> %{{[0-9]+}} to <2 x i64> +// CHECK-LE-NEXT: bitcast <4 x i32> %{{[0-9]+}} to <2 x i64> +// CHECK-LE-NEXT: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE-NEXT: bitcast <2 x i64> %{{[0-9]+}} to <4 x i32> } Index: test/CodeGen/builtins-ppc-error.c === --- test/CodeGen/builtins-ppc-error.c +++ test/CodeGen/builtins-ppc-error.c @@ -13,8 +13,16 @@ extern vector signed int vsi; extern vector unsigned char vuc; -void testInsertWord1(void) { +void testInsertWord(void) { int index = 5; vector unsigned char v1 = vec_insert4b(vsi, vuc, index); // expected-error {{argument to '__builtin_vsx_insertword' must be a constant integer}} vector unsigned long long v2 = vec_extract4b(vuc, index); // expected-error {{argument to '__builtin_vsx_extractuword' must be a constant integer}} } + +void testXXPERMDI(void) { + int index = 5; + vec_xxpermdi(vsi); //expected-error {{too few arguments to function call, expected at least 3, have 1}} + vec_xxpermdi(vsi, vsi, 2, 4); //expected-error {{too many arguments to function call, expected at most 3, have 4}} + vec_xxpermdi(vsi, vsi, index); //expected-error {{argument 3 to '__builtin_vsx_xxpermdi' must be a 2-bit unsigned literal (i.e. 0,1,2 or 3)}} + vec_xxpermdi(vsi, vuc, 2); //expected-error {{first two arguments to '__builtin_vsx_xxpermdi' must have the same type}} +} Index: lib/Sema/SemaChecking.cpp === ---
[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
jtony marked 6 inline comments as done. jtony added inline comments. Comment at: lib/CodeGen/CGBuiltin.cpp:8433 +if (getTarget().isLittleEndian()) { + switch (Index) { + case 0: nemanjai wrote: > The switch is overkill. You should just implement this in an obvious way > (i.e. the same way as described in the ISA). > For big endian: > `ElemIdx0 = (Index & 2;) >> 1` > `ElemIdx1 = 2 + (Index & 1)` > > For little endian: > `ElemIdx0 = (~Index & 1) + 2;` > `ElemIdx1 = ~Index & 2 >> 1;` > > (of course, please verify the expressions). Good call, fixed as suggested. https://reviews.llvm.org/D33053 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
nemanjai requested changes to this revision. nemanjai added a comment. This revision now requires changes to proceed. Add a test case like the one that currently crashes (see inline comment). Also, please do the following: - Put a note in the description (and the commit message) with a link to the PR this fixes - Put a note in the description that there is subsequent back end work to identify the correct shuffles (i.e. we currently won't emit an XXPERMDI when parameters are not vectors of doubleword elements) Comment at: include/clang/Basic/DiagnosticSemaKinds.td:7997 -def err_shufflevector_non_vector : Error< - "first two arguments to __builtin_shufflevector must be vectors">; It isn't appropriate to remove a target-independent diagnostic and replace it with a target-specific one (even if it's just in the name). I think it makes sense to update both `err_shufflevector_non_vector` and `err_shufflevector_incompatible_vector` to take an argument for the name of the builtin, but not to replace it with a target-specific one. I would recommend renaming it to something like: `err_shufflevector_non_vector` -> `err_vec_builtin_non_vector` `err_shufflevector_incompatible_vector` -> `err_vec_builtin_incompatible_vector` and add the parameter. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:8005 +def err_vsx_builtin_nonconstant_argument3 : Error< + "third argument to %0 must be a constant integer between 0-3">; We shouldn't diagnose out-of-range values. Comment at: lib/CodeGen/CGBuiltin.cpp:8407 // Create a shuffle mask of (1, 0) - Constant *ShuffleElts[2] = { ConstantInt::get(Int32Ty, 1), - ConstantInt::get(Int32Ty, 0) - }; + Constant *ShuffleElts[2] = {ConstantInt::get(Int32Ty, 1), + ConstantInt::get(Int32Ty, 0)}; This change is unrelated and inconsequential. Please revert it. Comment at: lib/CodeGen/CGBuiltin.cpp:8425 +const int64_t MaxIndex = 3; +int64_t Index = clamp(ArgCI->getSExtValue(), 0, MaxIndex); +Ops[0] = Builder.CreateBitCast(Ops[0], llvm::VectorType::get(Int64Ty, 2)); We shouldn't clamp this. The correct semantics should be to mask out everything but the low order two bits (i.e. `&0x3`). But I think even that is unnecessary since you should just use the low order two bits from the input - see below. Comment at: lib/CodeGen/CGBuiltin.cpp:8433 +if (getTarget().isLittleEndian()) { + switch (Index) { + case 0: The switch is overkill. You should just implement this in an obvious way (i.e. the same way as described in the ISA). For big endian: `ElemIdx0 = (Index & 2;) >> 1` `ElemIdx1 = 2 + (Index & 1)` For little endian: `ElemIdx0 = (~Index & 1) + 2;` `ElemIdx1 = ~Index & 2 >> 1;` (of course, please verify the expressions). Comment at: lib/CodeGen/CGBuiltin.cpp:8482 +Builder.CreateShuffleVector(Ops[0], Ops[1], ShuffleMask); +return ShuffleCall; + } This is not going to work. You can try it with this simple test case that it crashes with: ``` #include vector int test(vector int a, vector int b) { return vec_xxpermdi(a, b, 0); } ``` The problem is that the return type of the call expression may be different from the return type of the `shufflevector`. You'll probably need something like this: ``` QualType BIRetType = E->getType(); auto RetTy = ConvertType(BIRetType); return Builder.CreateBitCast(ShuffleCall, RetTy); ``` Comment at: lib/Sema/SemaChecking.cpp:3900 +// vector short vec_xxsldwi(vector short, vector short, int); +bool Sema::SemaBuiltinVSX(CallExpr *TheCall, unsigned NumArgs) { + if (TheCall->getNumArgs() < NumArgs) I assume that we won't even need this at all if we're not diagnosing the range of the third argument. Comment at: lib/Sema/SemaChecking.cpp:3915 + llvm::APSInt Value; + bool IsICE = TheCall->getArg(2)->isIntegerConstantExpr(Value, Context); + if (!(IsICE && Value >= 0 && Value <= 3)) No, we don't want to diagnose this. Values larger than 3 should just use the low-order two bits. This matches GCC behaviour. https://reviews.llvm.org/D33053 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33053: [PowerPC] Implement vec_xxpermdi builtin.
jtony created this revision. The vec_xxpermdi builtin is missing from altivec.h. This has been requested by developers working on libvpx for VP9 support for Google. Initially, I tried to define a new intrinsic to map it to the corresponding PowerPC hard instruction (XXPERMDI) directly. But there was feedback from the community that this can be done without introducing new intrinsic. This patch re-implement the vec_xxpermdi builtin by using the existing shuffleVector instruction just in the FE. https://reviews.llvm.org/D33053 Files: include/clang/Basic/BuiltinsPPC.def include/clang/Basic/DiagnosticSemaKinds.td include/clang/Sema/Sema.h lib/CodeGen/CGBuiltin.cpp lib/Headers/altivec.h lib/Sema/SemaChecking.cpp test/CodeGen/builtins-ppc-error.c test/CodeGen/builtins-ppc-vsx.c Index: test/CodeGen/builtins-ppc-vsx.c === --- test/CodeGen/builtins-ppc-vsx.c +++ test/CodeGen/builtins-ppc-vsx.c @@ -1691,4 +1691,44 @@ res_vd = vec_neg(vd); // CHECK: fsub <2 x double> , {{%[0-9]+}} // CHECK-LE: fsub <2 x double> , {{%[0-9]+}} + +res_vd = vec_xxpermdi(vd, vd, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vf = vec_xxpermdi(vf, vf, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsll = vec_xxpermdi(vsll, vsll, 2); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vull = vec_xxpermdi(vull, vull, 3); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsi = vec_xxpermdi(vsi, vsi, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vui = vec_xxpermdi(vui, vui, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vss = vec_xxpermdi(vss, vss, 2); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vus = vec_xxpermdi(vus, vus, 3); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vsc = vec_xxpermdi(vsc, vsc, 0); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> + +res_vuc = vec_xxpermdi(vuc, vuc, 1); +// CHECK: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> +// CHECK-LE: shufflevector <2 x i64> %{{[0-9]+}}, <2 x i64> %{{[0-9]+}}, <2 x i32> } Index: test/CodeGen/builtins-ppc-error.c === --- test/CodeGen/builtins-ppc-error.c +++ test/CodeGen/builtins-ppc-error.c @@ -13,8 +13,17 @@ extern vector signed int vsi; extern vector unsigned char vuc; -void testInsertWord1(void) { +void testInsertWord(void) { int index = 5; vector unsigned char v1 = vec_insert4b(vsi, vuc, index); // expected-error {{argument to '__builtin_vsx_insertword' must be a constant integer}} vector unsigned long long v2 = vec_extract4b(vuc, index); // expected-error {{argument to '__builtin_vsx_extractuword' must be a constant integer}} } + +void testXXPERMDI(void) { + int index = 5; + vec_xxpermdi(vsi); //expected-error {{too few arguments to function call, expected at least 3, have 1}} + vec_xxpermdi(vsi, vsi, 2, 4); //expected-error {{too many arguments to function call, expected at most 3, have 4}} + vec_xxpermdi(vsi, vsi, index); //expected-error {{third argument to '__builtin_vsx_xxpermdi' must be a constant integer between 0-3}} + vec_xxpermdi(vsi, vsi, -1); //expected-error {{third argument to '__builtin_vsx_xxpermdi' must be a constant integer between 0-3}} + vec_xxpermdi(vsi, vuc, 2); //expected-error {{first two arguments to '__builtin_vsx_xxpermdi' must be the same type}} +} Index: lib/Sema/SemaChecking.cpp === --- lib/Sema/SemaChecking.cpp +++ lib/Sema/SemaChecking.cpp @@ -1696,6 +1696,8 @@ case PPC::BI__builtin_tabortdci: return SemaBuiltinConstantArgRange(TheCall, 0, 0, 31) || SemaBuiltinConstantArgRange(TheCall, 2, 0, 31); + case PPC::BI__builtin_vsx_xxpermdi: +return SemaBuiltinVSX(TheCall, 3); } return SemaBuiltinConstantArgRange(TheCall, i, l,