[PATCH] D127976: [IR] Move vector.insert/vector.extract out of experimental namespace

2022-06-27 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa83aa33d1bf9: [IR] Move vector.insert/vector.extract out of 
experimental namespace (authored by bsmith).

Changed prior to commit:
  https://reviews.llvm.org/D127976?vs=438975=440156#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127976/new/

https://reviews.llvm.org/D127976

Files:
  clang/include/clang/Basic/riscv_vector.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget-vset-ice.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vset.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-vls-arith-ops.c
  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
  clang/test/CodeGen/aarch64-sve-vls-compare-ops.c
  clang/test/CodeGen/aarch64-sve-vls-shift-ops.c
  clang/test/CodeGen/aarch64-sve-vls-subscript-ops.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_dup_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_get_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_set_neonq.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  llvm/docs/LangRef.rst
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/IR/IRBuilder.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
  llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll.bc
  llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll
  llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
  llvm/test/CodeGen/AArch64/split-vector-insert.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-vector-to-predicate-store.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-extract-subvector.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector-to-predicate-load.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-no-typesize-warnings.ll
  llvm/test/CodeGen/AArch64/sve-punpklo-combine.ll
  llvm/test/CodeGen/AArch64/sve-vecreduce-fold.ll
  llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/vpload.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstSimplify/extract-vector.ll
  llvm/test/Transforms/InstSimplify/insert-vector.ll
  llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
  llvm/test/Verifier/extract-vector-mismatched-element-types.ll
  llvm/test/Verifier/insert-extract-intrinsics-invalid.ll
  llvm/test/Verifier/insert-vector-mismatched-element-types.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D127976: [IR] Move vector.insert/vector.extract out of experimental namespace

2022-06-22 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 438975.
bsmith added a comment.

- Further improve clarity on usable types in LangRef


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127976/new/

https://reviews.llvm.org/D127976

Files:
  clang/include/clang/Basic/riscv_vector.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vset.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-vls-arith-ops.c
  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
  clang/test/CodeGen/aarch64-sve-vls-compare-ops.c
  clang/test/CodeGen/aarch64-sve-vls-shift-ops.c
  clang/test/CodeGen/aarch64-sve-vls-subscript-ops.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_dup_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_get_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_set_neonq.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  llvm/docs/LangRef.rst
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/IR/IRBuilder.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
  llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll.bc
  llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll
  llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
  llvm/test/CodeGen/AArch64/split-vector-insert.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-vector-to-predicate-store.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-extract-subvector.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector-to-predicate-load.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector.ll
  llvm/test/CodeGen/AArch64/sve-no-typesize-warnings.ll
  llvm/test/CodeGen/AArch64/sve-punpklo-combine.ll
  llvm/test/CodeGen/AArch64/sve-vecreduce-fold.ll
  llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/vpload.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstSimplify/extract-vector.ll
  llvm/test/Transforms/InstSimplify/insert-vector.ll
  llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
  llvm/test/Verifier/extract-vector-mismatched-element-types.ll
  llvm/test/Verifier/insert-extract-intrinsics-invalid.ll
  llvm/test/Verifier/insert-vector-mismatched-element-types.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D127976: [IR] Move vector.insert/vector.extract out of experimental namespace

2022-06-21 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 438718.
bsmith added a comment.

- Clarify LangRef slightly to make it clearer that fixed types can be used
- Rebase on top of recent test changes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127976/new/

https://reviews.llvm.org/D127976

Files:
  clang/include/clang/Basic/riscv_vector.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vset.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-vls-arith-ops.c
  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
  clang/test/CodeGen/aarch64-sve-vls-compare-ops.c
  clang/test/CodeGen/aarch64-sve-vls-shift-ops.c
  clang/test/CodeGen/aarch64-sve-vls-subscript-ops.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_dup_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_get_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_set_neonq.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  llvm/docs/LangRef.rst
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/IR/IRBuilder.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
  llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll.bc
  llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll
  llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
  llvm/test/CodeGen/AArch64/split-vector-insert.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-vector-to-predicate-store.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-extract-subvector.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector-to-predicate-load.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector.ll
  llvm/test/CodeGen/AArch64/sve-no-typesize-warnings.ll
  llvm/test/CodeGen/AArch64/sve-punpklo-combine.ll
  llvm/test/CodeGen/AArch64/sve-vecreduce-fold.ll
  llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/vpload.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstSimplify/extract-vector.ll
  llvm/test/Transforms/InstSimplify/insert-vector.ll
  llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
  llvm/test/Verifier/extract-vector-mismatched-element-types.ll
  llvm/test/Verifier/insert-extract-intrinsics-invalid.ll
  llvm/test/Verifier/insert-vector-mismatched-element-types.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D127976: [IR] Move vector.insert/vector.extract out of experimental namespace

2022-06-16 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 437557.
bsmith added a comment.

- Add info to release notes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127976/new/

https://reviews.llvm.org/D127976

Files:
  clang/include/clang/Basic/riscv_vector.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vset.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-vls-arith-ops.c
  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
  clang/test/CodeGen/aarch64-sve-vls-compare-ops.c
  clang/test/CodeGen/aarch64-sve-vls-shift-ops.c
  clang/test/CodeGen/aarch64-sve-vls-subscript-ops.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_dup_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_get_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_set_neonq.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  llvm/docs/LangRef.rst
  llvm/docs/ReleaseNotes.rst
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/IR/IRBuilder.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
  llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll.bc
  llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll
  llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
  llvm/test/CodeGen/AArch64/split-vector-insert.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-vector-to-predicate-store.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-extract-subvector.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector-to-predicate-load.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector.ll
  llvm/test/CodeGen/AArch64/sve-no-typesize-warnings.ll
  llvm/test/CodeGen/AArch64/sve-punpklo-combine.ll
  llvm/test/CodeGen/AArch64/sve-vecreduce-fold.ll
  llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/vpload.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstSimplify/extract-vector.ll
  llvm/test/Transforms/InstSimplify/insert-vector.ll
  llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
  llvm/test/Verifier/extract-vector-mismatched-element-types.ll
  llvm/test/Verifier/insert-extract-intrinsics-invalid.ll
  llvm/test/Verifier/insert-vector-mismatched-element-types.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D127976: [IR] Move vector.insert/vector.extract out of experimental namespace

2022-06-16 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, c-rhodes, sdesmalen.
Herald added subscribers: ctetreau, frasercrmck, jdoerfert, luismarques, 
apazos, sameer.abuasal, s.egerton, Jim, jocewei, PkmX, the_o, brucehoult, 
MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, niosHD, sabuasal, 
simoncook, johnrusso, rbar, asb, hiraditya.
Herald added a project: All.
bsmith requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, pcwang-thead, MaskRay.
Herald added projects: clang, LLVM.

These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D127976

Files:
  clang/include/clang/Basic/riscv_vector.td
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vget.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vlmul.c
  clang/test/CodeGen/RISCV/rvv-intrinsics/vset.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-vls-arith-ops.c
  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
  clang/test/CodeGen/aarch64-sve-vls-compare-ops.c
  clang/test/CodeGen/aarch64-sve-vls-shift-ops.c
  clang/test/CodeGen/aarch64-sve-vls-subscript-ops.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_dup_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_get_neonq.c
  
clang/test/CodeGen/aarch64_neon_sve_bridge_intrinsics/acle_neon_sve_bridge_set_neonq.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  llvm/docs/LangRef.rst
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/IR/IRBuilder.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
  llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll
  llvm/test/Bitcode/upgrade-vector-insert-extract-intrinsics.ll.bc
  llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll
  llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
  llvm/test/CodeGen/AArch64/split-vector-insert.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-fixed-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
  llvm/test/CodeGen/AArch64/sve-extract-vector-to-predicate-store.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-extract-subvector.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector-to-predicate-load.ll
  llvm/test/CodeGen/AArch64/sve-insert-vector.ll
  llvm/test/CodeGen/AArch64/sve-no-typesize-warnings.ll
  llvm/test/CodeGen/AArch64/sve-punpklo-combine.ll
  llvm/test/CodeGen/AArch64/sve-vecreduce-fold.ll
  llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll
  llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
  llvm/test/CodeGen/RISCV/rvv/vpload.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstSimplify/extract-vector.ll
  llvm/test/Transforms/InstSimplify/insert-vector.ll
  llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
  llvm/test/Verifier/extract-vector-mismatched-element-types.ll
  llvm/test/Verifier/insert-extract-intrinsics-invalid.ll
  llvm/test/Verifier/insert-vector-mismatched-element-types.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122732: [Clang][AArch64][SVE] Allow subscript operator for SVE types

2022-04-08 Thread Bradley Smith via Phabricator via cfe-commits
bsmith accepted this revision.
bsmith added a comment.
This revision is now accepted and ready to land.

LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122732/new/

https://reviews.llvm.org/D122732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120323: [clang][SVE] Add support for arithmetic operators on SVE types

2022-03-04 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a comment.
Herald added a project: All.

This all looks reasonable to me, but I'll let @peterwaller-arm have the final 
say.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120323/new/

https://reviews.llvm.org/D120323

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-18 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a comment.

The tests in this are broken for windows, which I've fixed in 
45e102a173680fd3c90def79a7f0766ed2786ff0 
.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-18 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG26f56438e3da: [Clang][SVE] Properly enable/disable dependant 
SVE target features based upon +… (authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-implied-sve-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -910,11 +910,11 @@
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
  AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
- AArch64::AEK_SVE2 | AArch64::AEK_BF16 |
- AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_PAUTH | AArch64::AEK_MTE |
- AArch64::AEK_SSBS | AArch64::AEK_FP16FML |
- AArch64::AEK_SB,
+ AArch64::AEK_BF16 | AArch64::AEK_I8MM |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_PAUTH |
+ AArch64::AEK_MTE | AArch64::AEK_SSBS |
+ AArch64::AEK_FP16FML | AArch64::AEK_SB,
  "9-A"),
 ARMCPUTestParams("cortex-a57", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_CRC | AArch64::AEK_CRYPTO |
@@ -1030,12 +1030,12 @@
  AArch64::AEK_CRC | AArch64::AEK_FP |
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
- AArch64::AEK_RCPC | AArch64::AEK_SVE2 |
- AArch64::AEK_DOTPROD | AArch64::AEK_MTE |
- AArch64::AEK_PAUTH | AArch64::AEK_I8MM |
- AArch64::AEK_BF16 | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_SSBS | AArch64::AEK_SB |
- AArch64::AEK_FP16FML,
+ AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
+ AArch64::AEK_MTE | AArch64::AEK_PAUTH |
+ AArch64::AEK_I8MM | AArch64::AEK_BF16 |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS |
+ AArch64::AEK_SB | AArch64::AEK_FP16FML,
  "9-A"),
 ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_NONE | AArch64::AEK_CRYPTO |
Index: llvm/include/llvm/Support/AArch64TargetParser.def
===
--- llvm/include/llvm/Support/AArch64TargetParser.def
+++ llvm/include/llvm/Support/AArch64TargetParser.def
@@ -145,9 +145,10 @@
 AARCH64_CPU_NAME("cortex-a55", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_FP16 | AArch64::AEK_DOTPROD | AArch64::AEK_RCPC))
 AARCH64_CPU_NAME("cortex-a510", ARMV9A, FK_NEON_FP_ARMV8, false,
- (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
+ (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SB |
   AArch64::AEK_PAUTH | AArch64::AEK_MTE | AArch64::AEK_SSBS |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("cortex-a57", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_CRC))
 AARCH64_CPU_NAME("cortex-a65", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
@@ -188,8 +189,9 @@
   AArch64::AEK_SSBS))
 AARCH64_CPU_NAME("cortex-x2", ARMV9A, FK_NEON_FP_ARMV8, false,
  (AArch64::AEK_MTE | AArch64::AEK_BF16 | AArch64::AEK_I8MM |
-  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SVE2BITPERM |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SB |
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("neoverse-e1", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_DOTPROD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
 

[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-17 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:82-88
+if (Feature == "sve2")
+  Features.push_back("+sve");
+else if (Feature == "sve2-bitperm" || Feature == "sve2-sha3" ||
+ Feature == "sve2-aes" || Feature == "sve2-sm4") {
+  Features.push_back("+sve");
+  Features.push_back("+sve2");
+} else if (Feature == "nosve") {

sdesmalen wrote:
> ^^^ Are the above changes necessary? i.e. if only `+sve2-sha3` is set as 
> target feature, I thought LLVM automatically infers that it requires `+sve` 
> and `+sve2`. Is that not the case?
That is the case yes, however this code is to catch combinations, for example 
`+sve2-bitperm+nosve2` needs to keep `sve` enabled, if +sve was never in the 
feature set this wouldn't be the case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-17 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 387900.
bsmith added a comment.

- Use more brute force approach to ensure ordering is accounted for
  - This massively simplifies things and removes what was becoming very 
confusing logic
- Add tests for missing cases


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-implied-sve-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -903,11 +903,11 @@
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
  AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
- AArch64::AEK_SVE2 | AArch64::AEK_BF16 |
- AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_PAUTH | AArch64::AEK_MTE |
- AArch64::AEK_SSBS | AArch64::AEK_FP16FML |
- AArch64::AEK_SB,
+ AArch64::AEK_BF16 | AArch64::AEK_I8MM |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_PAUTH |
+ AArch64::AEK_MTE | AArch64::AEK_SSBS |
+ AArch64::AEK_FP16FML | AArch64::AEK_SB,
  "9-A"),
 ARMCPUTestParams("cortex-a57", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_CRC | AArch64::AEK_CRYPTO |
@@ -1012,12 +1012,12 @@
  AArch64::AEK_CRC | AArch64::AEK_FP |
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
- AArch64::AEK_RCPC | AArch64::AEK_SVE2 |
- AArch64::AEK_DOTPROD | AArch64::AEK_MTE |
- AArch64::AEK_PAUTH | AArch64::AEK_I8MM |
- AArch64::AEK_BF16 | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_SSBS | AArch64::AEK_SB |
- AArch64::AEK_FP16FML,
+ AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
+ AArch64::AEK_MTE | AArch64::AEK_PAUTH |
+ AArch64::AEK_I8MM | AArch64::AEK_BF16 |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS |
+ AArch64::AEK_SB | AArch64::AEK_FP16FML,
  "9-A"),
 ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_NONE | AArch64::AEK_CRYPTO |
Index: llvm/include/llvm/Support/AArch64TargetParser.def
===
--- llvm/include/llvm/Support/AArch64TargetParser.def
+++ llvm/include/llvm/Support/AArch64TargetParser.def
@@ -145,9 +145,10 @@
 AARCH64_CPU_NAME("cortex-a55", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_FP16 | AArch64::AEK_DOTPROD | AArch64::AEK_RCPC))
 AARCH64_CPU_NAME("cortex-a510", ARMV9A, FK_NEON_FP_ARMV8, false,
- (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
+ (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SB |
   AArch64::AEK_PAUTH | AArch64::AEK_MTE | AArch64::AEK_SSBS |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("cortex-a57", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_CRC))
 AARCH64_CPU_NAME("cortex-a65", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
@@ -184,8 +185,9 @@
   AArch64::AEK_SSBS))
 AARCH64_CPU_NAME("cortex-x2", ARMV9A, FK_NEON_FP_ARMV8, false,
  (AArch64::AEK_MTE | AArch64::AEK_BF16 | AArch64::AEK_I8MM |
-  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SVE2BITPERM |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SB |
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("neoverse-e1", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_DOTPROD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
  

[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-16 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:73
 static bool DecodeAArch64Features(const Driver , StringRef text,
   std::vector ,
   llvm::AArch64::ArchKind ArchKind) {

sdesmalen wrote:
> does the order of the features matter?
> ```+sve,+nosve => disables sve
> +nosve,+sve => enables sve
> +nosve,+sve2 => enables sve and sve2```
> 
> 
Yes it does, but I believe that is the desired behaviour.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 387212.
bsmith edited the summary of this revision.
bsmith added a comment.
Herald added a subscriber: kristof.beyls.

- Fix duplicate arch feature in unit test
- Use enum class instead of plain enum with typedef


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113776/new/

https://reviews.llvm.org/D113776

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-implied-sve-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -903,11 +903,11 @@
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
  AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
- AArch64::AEK_SVE2 | AArch64::AEK_BF16 |
- AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_PAUTH | AArch64::AEK_MTE |
- AArch64::AEK_SSBS | AArch64::AEK_FP16FML |
- AArch64::AEK_SB,
+ AArch64::AEK_BF16 | AArch64::AEK_I8MM |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_PAUTH |
+ AArch64::AEK_MTE | AArch64::AEK_SSBS |
+ AArch64::AEK_FP16FML | AArch64::AEK_SB,
  "9-A"),
 ARMCPUTestParams("cortex-a57", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_CRC | AArch64::AEK_CRYPTO |
@@ -1012,12 +1012,12 @@
  AArch64::AEK_CRC | AArch64::AEK_FP |
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
- AArch64::AEK_RCPC | AArch64::AEK_SVE2 |
- AArch64::AEK_DOTPROD | AArch64::AEK_MTE |
- AArch64::AEK_PAUTH | AArch64::AEK_I8MM |
- AArch64::AEK_BF16 | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_SSBS | AArch64::AEK_SB |
- AArch64::AEK_FP16FML,
+ AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
+ AArch64::AEK_MTE | AArch64::AEK_PAUTH |
+ AArch64::AEK_I8MM | AArch64::AEK_BF16 |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS |
+ AArch64::AEK_SB | AArch64::AEK_FP16FML,
  "9-A"),
 ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_NONE | AArch64::AEK_CRYPTO |
Index: llvm/include/llvm/Support/AArch64TargetParser.def
===
--- llvm/include/llvm/Support/AArch64TargetParser.def
+++ llvm/include/llvm/Support/AArch64TargetParser.def
@@ -145,9 +145,10 @@
 AARCH64_CPU_NAME("cortex-a55", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_FP16 | AArch64::AEK_DOTPROD | AArch64::AEK_RCPC))
 AARCH64_CPU_NAME("cortex-a510", ARMV9A, FK_NEON_FP_ARMV8, false,
- (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
+ (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SB |
   AArch64::AEK_PAUTH | AArch64::AEK_MTE | AArch64::AEK_SSBS |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("cortex-a57", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_CRC))
 AARCH64_CPU_NAME("cortex-a65", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
@@ -184,8 +185,9 @@
   AArch64::AEK_SSBS))
 AARCH64_CPU_NAME("cortex-x2", ARMV9A, FK_NEON_FP_ARMV8, false,
  (AArch64::AEK_MTE | AArch64::AEK_BF16 | AArch64::AEK_I8MM |
-  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SVE2BITPERM |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SB |
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("neoverse-e1", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_DOTPROD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
 

[PATCH] D113776: [Clang][SVE] Properly enable/disable dependant SVE target features based upon +(no)sve.* options

2021-11-12 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, sdesmalen.
Herald added subscribers: psnobl, tschuett.
Herald added a reviewer: efriedma.
bsmith requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D113776

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-implied-sve-features.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -904,7 +904,8 @@
  AArch64::AEK_LSE | AArch64::AEK_RDM |
  AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
  AArch64::AEK_SVE2 | AArch64::AEK_BF16 |
- AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
+ AArch64::AEK_I8MM | AArch64::AEK_SVE |
+ AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
  AArch64::AEK_PAUTH | AArch64::AEK_MTE |
  AArch64::AEK_SSBS | AArch64::AEK_FP16FML |
  AArch64::AEK_SB,
@@ -1012,12 +1013,12 @@
  AArch64::AEK_CRC | AArch64::AEK_FP |
  AArch64::AEK_SIMD | AArch64::AEK_RAS |
  AArch64::AEK_LSE | AArch64::AEK_RDM |
- AArch64::AEK_RCPC | AArch64::AEK_SVE2 |
- AArch64::AEK_DOTPROD | AArch64::AEK_MTE |
- AArch64::AEK_PAUTH | AArch64::AEK_I8MM |
- AArch64::AEK_BF16 | AArch64::AEK_SVE2BITPERM |
- AArch64::AEK_SSBS | AArch64::AEK_SB |
- AArch64::AEK_FP16FML,
+ AArch64::AEK_RCPC | AArch64::AEK_DOTPROD |
+ AArch64::AEK_MTE | AArch64::AEK_PAUTH |
+ AArch64::AEK_I8MM | AArch64::AEK_BF16 |
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS |
+ AArch64::AEK_SB | AArch64::AEK_FP16FML,
  "9-A"),
 ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8",
  AArch64::AEK_NONE | AArch64::AEK_CRYPTO |
Index: llvm/include/llvm/Support/AArch64TargetParser.def
===
--- llvm/include/llvm/Support/AArch64TargetParser.def
+++ llvm/include/llvm/Support/AArch64TargetParser.def
@@ -145,9 +145,10 @@
 AARCH64_CPU_NAME("cortex-a55", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_FP16 | AArch64::AEK_DOTPROD | AArch64::AEK_RCPC))
 AARCH64_CPU_NAME("cortex-a510", ARMV9A, FK_NEON_FP_ARMV8, false,
- (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SVE2BITPERM |
+ (AArch64::AEK_BF16 | AArch64::AEK_I8MM | AArch64::AEK_SB |
   AArch64::AEK_PAUTH | AArch64::AEK_MTE | AArch64::AEK_SSBS |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("cortex-a57", ARMV8A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_CRC))
 AARCH64_CPU_NAME("cortex-a65", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
@@ -184,8 +185,9 @@
   AArch64::AEK_SSBS))
 AARCH64_CPU_NAME("cortex-x2", ARMV9A, FK_NEON_FP_ARMV8, false,
  (AArch64::AEK_MTE | AArch64::AEK_BF16 | AArch64::AEK_I8MM |
-  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SVE2BITPERM |
-  AArch64::AEK_SB | AArch64::AEK_FP16FML))
+  AArch64::AEK_PAUTH | AArch64::AEK_SSBS | AArch64::AEK_SB |
+  AArch64::AEK_SVE | AArch64::AEK_SVE2 | AArch64::AEK_SVE2BITPERM |
+  AArch64::AEK_FP16FML))
 AARCH64_CPU_NAME("neoverse-e1", ARMV8_2A, FK_CRYPTO_NEON_FP_ARMV8, false,
  (AArch64::AEK_DOTPROD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
   AArch64::AEK_RCPC | AArch64::AEK_SSBS))
Index: clang/test/Driver/aarch64-implied-sve-features.c
===
--- /dev/null
+++ clang/test/Driver/aarch64-implied-sve-features.c
@@ -0,0 +1,65 @@
+// RUN: %clang -target aarch64-linux-gnu -march=armv8-a+sve %s -### |& FileCheck %s --check-prefix=SVE-ONLY
+// SVE-ONLY: "-target-feature" "+sve"
+
+// RUN: %clang -target aarch64-linux-gnu -march=armv8-a+nosve %s 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-25 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0ce46a1d43c6: [AArch64][Driver][SVE] Allow 
-msve-vector-bits=n+ syntax to mean no maximum… (authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-22 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 381511.
bsmith added a comment.

- Don't define SVE target bits macros when vscale min != max
- Add tests for above change
- Use correct (unsigned) version of getAsInteger


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-18 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 380330.
bsmith added a comment.

- Avoid side-effects in assertions


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=1024 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 380015.
bsmith added a comment.

- Update sema checking for sve_vector_bits attribute to emit an error when the 
vscale min != max, i.e. when -mvse-vector-bits=+ is used
- Add test to cover the above case


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/Sema/SemaType.cpp:7916
   // The attribute vector size must match -msve-vector-bits.
-  if (VecSize != S.getLangOpts().ArmSveVectorBits) {
+  if (VecSize != S.getLangOpts().VScaleMin * 128) {
 S.Diag(Attr.getLoc(), diag::err_attribute_bad_sve_vector_size)

bsmith wrote:
> paulwalker-arm wrote:
> > I'm thinking there should be a check for `S.getLangOpts().VScaleMin == 
> > S.getLangOpts().VScaleMax` somewhere, but I guess the current diagnostic 
> > probably doesn't account for two values, nor should it really.  What about 
> > just duplicating this block but using `VScaleMax` in place of `VScaleMin`?
> I think I would rather have a separate check for min == max, and output a 
> diagnostic stating that an exact -msve-vector-bits value must be used, does 
> that sound sensible?
> 
> This could then perhaps be combined with the previous comment to produce 
> sensible output, i.e.:
> 
> !vscale_min -> not supported
> vscale_min != vscale_max -> must be fixed value
Actually, now that I re-read the error message, and the fact that is directly 
refers to the valid values, I think instead there should just be an initial 
check of `min == 0 || min != max`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/Sema/SemaType.cpp:7916
   // The attribute vector size must match -msve-vector-bits.
-  if (VecSize != S.getLangOpts().ArmSveVectorBits) {
+  if (VecSize != S.getLangOpts().VScaleMin * 128) {
 S.Diag(Attr.getLoc(), diag::err_attribute_bad_sve_vector_size)

paulwalker-arm wrote:
> I'm thinking there should be a check for `S.getLangOpts().VScaleMin == 
> S.getLangOpts().VScaleMax` somewhere, but I guess the current diagnostic 
> probably doesn't account for two values, nor should it really.  What about 
> just duplicating this block but using `VScaleMax` in place of `VScaleMin`?
I think I would rather have a separate check for min == max, and output a 
diagnostic stating that an exact -msve-vector-bits value must be used, does 
that sound sensible?

This could then perhaps be combined with the previous comment to produce 
sensible output, i.e.:

!vscale_min -> not supported
vscale_min != vscale_max -> must be fixed value


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-14 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 379701.
bsmith added a comment.

- Remove mention of 128-bit chunks from help texts
- Allow any positive integer value for -mvscale-{min,max}, not just powers of 2


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=1024 -flax-vector-conversions=none 

[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-14 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a comment.

In D111790#3063698 , @paulwalker-arm 
wrote:

> Are the references to "128-bit chunks" for the vscale flags necessary?  
> That's really a nuisance of SVE that LLVM IR should not need to worry about.  
> Can we speak exclusively in terms of vscale or is the "multiples of 128" 
> required somewhere?  Perhaps we're missing a target specific convert function 
> from vscale+elt to bytes or something.  Also there's nothing stoping vscale 
> from being 3 (essentially any positive number) but you look to be restricting 
> it to a power of two.

It does feel like we should have something somewhere that specifies the vscale 
chunk as I've had to hardcode 128 in a lot of places (thought they are SVE 
specific places, so I'll remove mention of it from the helptext). As for vscale 
values allowed, if we don't validate the values used in the option itself, 
where would we do it? I guess we could just blindly propagate the value down to 
the IR attribute and let the backends figure out what they want. The issue then 
is providing a nice error, though perhaps we don't need one given they are cc1 
options?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111790/new/

https://reviews.llvm.org/D111790

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111790: [AArch64][Driver][SVE] Allow -msve-vector-bits=+ syntax to mean no maximum vscale

2021-10-14 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, sdesmalen.
Herald added subscribers: ctetreau, dexonsmith, dang, psnobl, kristof.beyls, 
tschuett.
Herald added a reviewer: efriedma.
bsmith requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch splits the existing SveVectorBits LangOpt into VScaleMin and
VScaleMax LangOpts such that we can represent such an option. The cc1
option has also been split into -mvscale-{min,max}= options so that the
cc1 arguments better reflect the vscale_range IR attribute.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D111790

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-types.c
  clang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
  clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
  clang/test/Driver/aarch64-sve-vector-bits.c
  clang/test/Sema/aarch64-sve-explicit-casts-fixed-size.c
  clang/test/Sema/aarch64-sve-lax-vector-conversions.c
  clang/test/Sema/attr-arm-sve-vector-bits.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -msve-vector-bits=512 -fallow-half-arguments-and-returns -Wconversion %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -ffreestanding -fsyntax-only -verify -std=c++11 -mvscale-min=4 -mvscale-max=4 -fallow-half-arguments-and-returns -Wconversion %s
 // expected-no-diagnostics
 
 #include 
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-none %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -mvscale-min=4 -mvscale-max=4 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
 #include 
 
Index: clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
===
--- clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
+++ clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
@@ -1,8 +1,8 @@
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=128 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=256 -flax-vector-conversions=none -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only 

[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-08-04 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe57e1e4e0026: [clang][AArch64][SVE] Avoid going through 
memory for fixed/scalable predicate… (authored by bsmith).

Changed prior to commit:
  https://reviews.llvm.org/D106860?vs=363093=364153#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -49,20 +49,16 @@
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 2
-// CHECK-128-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 2, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i8> @llvm.experimental.vector.extract.v2i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-128-NEXT:store <2 x i8> [[CASTFIXEDSVE]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 8
-// CHECK-512-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 8, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i8> @llvm.experimental.vector.extract.v8i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-512-NEXT:store <8 x i8> [[CASTFIXEDSVE]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -101,20 +97,16 @@
 
 // CHECK-128-LABEL: @read_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <2 x i8>, align 2
 // CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* [[SAVED_VALUE]], align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to *
-// CHECK-128-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v2i8( undef, <2 x i8> [[TMP0]], i64 0)
+// CHECK-128-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-128-NEXT:ret  [[TMP1]]
 //
 // CHECK-512-LABEL: @read_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
 // CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to *
-// CHECK-512-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v8i8( undef, <8 x i8> [[TMP0]], i64 0)
+// CHECK-512-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-512-NEXT:ret  [[TMP1]]
 //
 svbool_t read_global_bool() { return global_bool; }
Index: clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
@@ -18,19 +18,15 @@
 // CHECK-NEXT:[[PRED_ADDR:%.*]] = alloca , align 2
 // CHECK-NEXT:[[VEC_ADDR:%.*]] = alloca , align 16
 // CHECK-NEXT:[[PG:%.*]] = alloca , align 2
-// CHECK-NEXT:[[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
-// CHECK-NEXT:[[SAVED_VALUE1:%.*]] = alloca <8 x i8>, align 8
 // 

[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-30 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 363093.
bsmith marked an inline comment as done.
bsmith added a comment.

- Update comment to reflect changes
- Add new test for lax casting via memory


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -49,20 +49,16 @@
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 16
-// CHECK-128-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i8> @llvm.experimental.vector.extract.v2i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-128-NEXT:store <2 x i8> [[CASTFIXEDSVE]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 16
-// CHECK-512-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i8> @llvm.experimental.vector.extract.v8i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-512-NEXT:store <8 x i8> [[CASTFIXEDSVE]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -101,20 +97,16 @@
 
 // CHECK-128-LABEL: @read_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <2 x i8>, align 16
 // CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to *
-// CHECK-128-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v2i8( undef, <2 x i8> [[TMP0]], i64 0)
+// CHECK-128-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-128-NEXT:ret  [[TMP1]]
 //
 // CHECK-512-LABEL: @read_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 16
 // CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to *
-// CHECK-512-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v8i8( undef, <8 x i8> [[TMP0]], i64 0)
+// CHECK-512-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-512-NEXT:ret  [[TMP1]]
 //
 svbool_t read_global_bool() { return global_bool; }
Index: clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
@@ -18,19 +18,15 @@
 // CHECK-NEXT:[[PRED_ADDR:%.*]] = alloca , align 2
 // CHECK-NEXT:[[VEC_ADDR:%.*]] = alloca , align 16
 // CHECK-NEXT:[[PG:%.*]] = alloca , align 2
-// CHECK-NEXT:[[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
-// CHECK-NEXT:[[SAVED_VALUE1:%.*]] = alloca <8 x i8>, align 8
 // CHECK-NEXT:store  [[PRED:%.*]], * [[PRED_ADDR]], align 2
 // CHECK-NEXT:store  [[VEC:%.*]], * [[VEC_ADDR]], align 16
 // CHECK-NEXT:

[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-29 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2102
+  Src = Builder.CreateBitCast(Src, SrcTy);
+}
 if (ScalableSrc->getElementType() == FixedDst->getElementType()) {

junparser wrote:
> I think this may also works for casting between vectors with different 
> element types.
A similar argument applies here as the other related ticket, in principal we 
could, however it's not clear that there is a good use case for writing code 
that would make use of this. So for now it's probably best to just deal with 
predicates which are definitely a problem and other cases as they arise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-28 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2110-2129
 // Perform VLAT <-> VLST bitcast through memory.
 // TODO: since the llvm.experimental.vector.{insert,extract} intrinsics
 //   require the element types of the vectors to be the same, we
 //   need to keep this around for casting between predicates, or more
 //   generally for bitcasts between VLAT <-> VLST where the element
 //   types of the vectors are not the same, until we figure out a 
better
 //   way of doing these casts.

c-rhodes wrote:
> With the predicate casting now using the intrinsics I don't think this is 
> needed any longer. Perhaps we should add an unreachable above if the element 
> type doesn't match?
Don't we still need this for casting between vectors with different element 
types, or are these guaranteed to not hit this code path?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-27 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, eli.friedman, 
junparser.
Herald added subscribers: psnobl, kristof.beyls, tschuett.
Herald added a reviewer: efriedma.
bsmith requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106860

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -49,20 +49,16 @@
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 16
-// CHECK-128-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i8> @llvm.experimental.vector.extract.v2i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-128-NEXT:store <2 x i8> [[CASTFIXEDSVE]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca , align 16
-// CHECK-512-NEXT:store  [[V:%.*]], * [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast * [[SAVED_VALUE]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast  [[V:%.*]] to 
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i8> @llvm.experimental.vector.extract.v8i8.nxv2i8( [[TMP0]], i64 0)
+// CHECK-512-NEXT:store <8 x i8> [[CASTFIXEDSVE]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -101,20 +97,16 @@
 
 // CHECK-128-LABEL: @read_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <2 x i8>, align 16
 // CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA6]]
-// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to *
-// CHECK-128-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v2i8( undef, <2 x i8> [[TMP0]], i64 0)
+// CHECK-128-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-128-NEXT:ret  [[TMP1]]
 //
 // CHECK-512-LABEL: @read_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 16
 // CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA6]]
-// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to *
-// CHECK-512-NEXT:[[TMP1:%.*]] = load , * [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call  @llvm.experimental.vector.insert.nxv2i8.v8i8( undef, <8 x i8> [[TMP0]], i64 0)
+// CHECK-512-NEXT:[[TMP1:%.*]] = bitcast  [[CASTFIXEDSVE]] to 
 // CHECK-512-NEXT:ret  [[TMP1]]
 //
 svbool_t read_global_bool() { return global_bool; }
Index: clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
@@ -18,19 +18,15 @@
 // CHECK-NEXT:[[PRED_ADDR:%.*]] = 

[PATCH] D106277: [SVE] Remove the interface for getMaxVScale in favour of the IR attributes

2021-07-19 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:505-506
+  } else if (getContext().getTargetInfo().hasFeature("sve")) {
+CurFn->addFnAttr(
+llvm::Attribute::getWithVScaleRangeArgs(getLLVMContext(), 0, 16));
   }

paulwalker-arm wrote:
> bsmith wrote:
> > Is this really what we want? Won't this enable fixed length codegen all of 
> > the time?
> Fixed length codegen is tied to the minimum `vscale` value, so by using `0` 
> here means nothing is known about the minimum `vscale` and thus fixed length 
> codegen will be restricted to 128bit as is the case when no attribute is 
> specified.
Ah right ok, I'd missed that detail. Ignore me then!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106277/new/

https://reviews.llvm.org/D106277

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106277: [SVE] Remove the interface for getMaxVScale in favour of the IR attributes

2021-07-19 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a reviewer: paulwalker-arm.
bsmith added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:505-506
+  } else if (getContext().getTargetInfo().hasFeature("sve")) {
+CurFn->addFnAttr(
+llvm::Attribute::getWithVScaleRangeArgs(getLLVMContext(), 0, 16));
   }

Is this really what we want? Won't this enable fixed length codegen all of the 
time?



Comment at: llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:119-131
-Optional RISCVTTIImpl::getMaxVScale() const {
-  // There is no assumption of the maximum vector length in V specification.
-  // We use the value specified by users as the maximum vector length.
-  // This function will use the assumed maximum vector length to get the
-  // maximum vscale for LoopVectorizer.
-  // If users do not specify the maximum vector length, we have no way to
-  // know whether the LoopVectorizer is safe to do or not.

I'm not sure that RISCV have made a commitment to use the vscale_range 
attribute yet have they? In either case I think they should be involved in a 
change like this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106277/new/

https://reviews.llvm.org/D106277

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-21 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a comment.

Yup, just committed a fix in ed31ff9c7a9e538ead1fa4feecf09987998621b4 
, sorry 
for the noise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104643: [AArch64][SVE] Add missing target require to test

2021-06-21 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGed31ff9c7a9e: [AArch64][SVE] Add missing target require to 
test (authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104643/new/

https://reviews.llvm.org/D104643

Files:
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c


Index: clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
@@ -2,6 +2,7 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=512  | 
FileCheck %s --check-prefixes=CHECK,CHECK512
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=1024 | 
FileCheck %s --check-prefixes=CHECK,CHECK1024
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=2048 | 
FileCheck %s --check-prefixes=CHECK,CHECK2048
+// REQUIRES: aarch64-registered-target
 
 #include 
 


Index: clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
@@ -2,6 +2,7 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=512  | FileCheck %s --check-prefixes=CHECK,CHECK512
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=1024 | FileCheck %s --check-prefixes=CHECK,CHECK1024
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=2048 | FileCheck %s --check-prefixes=CHECK,CHECK2048
+// REQUIRES: aarch64-registered-target
 
 #include 
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104643: [AArch64][SVE] Add missing target require to test

2021-06-21 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added a reviewer: peterwaller-arm.
Herald added subscribers: psnobl, kristof.beyls, tschuett.
Herald added a reviewer: efriedma.
bsmith requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D104643

Files:
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c


Index: clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
@@ -2,6 +2,7 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=512  | 
FileCheck %s --check-prefixes=CHECK,CHECK512
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=1024 | 
FileCheck %s --check-prefixes=CHECK,CHECK1024
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=2048 | 
FileCheck %s --check-prefixes=CHECK,CHECK2048
+// REQUIRES: aarch64-registered-target
 
 #include 
 


Index: clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
@@ -2,6 +2,7 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=512  | FileCheck %s --check-prefixes=CHECK,CHECK512
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=1024 | FileCheck %s --check-prefixes=CHECK,CHECK1024
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -O2 -S -o - %s -msve-vector-bits=2048 | FileCheck %s --check-prefixes=CHECK,CHECK2048
+// REQUIRES: aarch64-registered-target
 
 #include 
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-21 Thread Bradley Smith via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9e7329e37ede: [AArch64][SVE] Wire up vscale_range attribute 
to SVE min/max vector queries (authored by bsmith).

Changed prior to commit:
  https://reviews.llvm.org/D103702?vs=352101=353345#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

Files:
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
  llvm/test/CodeGen/AArch64/sve-vscale-attr.ll

Index: llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s | FileCheck %s --check-prefixes=CHECK,CHECK-NOARG
+; RUN: llc -aarch64-sve-vector-bits-min=512 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-ARG
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define void @func_vscale_none(<16 x i32>* %a, <16 x i32>* %b) #0 {
+; CHECK-NOARG-LABEL: func_vscale_none:
+; CHECK-NOARG:   // %bb.0:
+; CHECK-NOARG-NEXT:ldp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ldp q2, q3, [x1]
+; CHECK-NOARG-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NOARG-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NOARG-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NOARG-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NOARG-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NOARG-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NOARG-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NOARG-NEXT:stp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ret
+;
+; CHECK-ARG-LABEL: func_vscale_none:
+; CHECK-ARG:   // %bb.0:
+; CHECK-ARG-NEXT:ptrue p0.s, vl16
+; CHECK-ARG-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-ARG-NEXT:ld1w { z1.s }, p0/z, [x1]
+; CHECK-ARG-NEXT:add z0.s, p0/m, z0.s, z1.s
+; CHECK-ARG-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-ARG-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #0 = { "target-features"="+sve" }
+
+define void @func_vscale1_1(<16 x i32>* %a, <16 x i32>* %b) #1 {
+; CHECK-LABEL: func_vscale1_1:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ldp q0, q1, [x0]
+; CHECK-NEXT:ldp q2, q3, [x1]
+; CHECK-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NEXT:stp q0, q1, [x0]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #1 = { "target-features"="+sve" vscale_range(1,1) }
+
+define void @func_vscale2_2(<16 x i32>* %a, <16 x i32>* %b) #2 {
+; CHECK-LABEL: func_vscale2_2:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #2 = { "target-features"="+sve" vscale_range(2,2) }
+
+define void @func_vscale2_4(<16 x i32>* %a, <16 x i32>* %b) #3 {
+; CHECK-LABEL: func_vscale2_4:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #3 = { "target-features"="+sve" vscale_range(2,4) }
+
+define void @func_vscale4_4(<16 x i32>* %a, <16 x i32>* %b) #4 {
+; CHECK-LABEL: func_vscale4_4:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, 

[PATCH] D104539: [Sema][SVE] Properly match builtin ID when using aux target

2021-06-21 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG325b6707942d: [Sema][SVE] Properly match builtin ID when 
using aux target (authored by bsmith).

Changed prior to commit:
  https://reviews.llvm.org/D104539?vs=353002=353340#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104539/new/

https://reviews.llvm.org/D104539

Files:
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/Sema/aarch64-sve-alias-attribute.c


Index: clang/test/Sema/aarch64-sve-alias-attribute.c
===
--- /dev/null
+++ clang/test/Sema/aarch64-sve-alias-attribute.c
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -aux-triple 
aarch64-none-unknown-eabi -target-feature +sve -fopenmp-is-device -fopenmp 
-verify -fsyntax-only %s
+
+static __inline__ 
__attribute__((__clang_arm_builtin_alias(__builtin_sve_svabd_n_f64_m))) // 
expected-no-diagnostics
+void
+nop(void);
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -5163,7 +5163,10 @@
   return ArmBuiltinAliasValid(BuiltinID, AliasName, Map, IntrinNames);
 }
 
-static bool ArmSveAliasValid(unsigned BuiltinID, StringRef AliasName) {
+static bool ArmSveAliasValid(ASTContext , unsigned BuiltinID,
+ StringRef AliasName) {
+  if (Context.BuiltinInfo.isAuxBuiltinID(BuiltinID))
+BuiltinID = Context.BuiltinInfo.getAuxBuiltinID(BuiltinID);
   return BuiltinID >= AArch64::FirstSVEBuiltin &&
  BuiltinID <= AArch64::LastSVEBuiltin;
 }
@@ -5180,7 +5183,7 @@
   StringRef AliasName = cast(D)->getIdentifier()->getName();
 
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (!IsAArch64 && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName))) {
 S.Diag(AL.getLoc(), diag::err_attribute_arm_builtin_alias);
@@ -5210,7 +5213,7 @@
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
   bool IsARM = S.Context.getTargetInfo().getTriple().isARM();
   bool IsRISCV = S.Context.getTargetInfo().getTriple().isRISCV();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (IsARM && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName)) ||
   (IsRISCV && !RISCVAliasValid(BuiltinID, AliasName)) ||


Index: clang/test/Sema/aarch64-sve-alias-attribute.c
===
--- /dev/null
+++ clang/test/Sema/aarch64-sve-alias-attribute.c
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -aux-triple aarch64-none-unknown-eabi -target-feature +sve -fopenmp-is-device -fopenmp -verify -fsyntax-only %s
+
+static __inline__ __attribute__((__clang_arm_builtin_alias(__builtin_sve_svabd_n_f64_m))) // expected-no-diagnostics
+void
+nop(void);
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -5163,7 +5163,10 @@
   return ArmBuiltinAliasValid(BuiltinID, AliasName, Map, IntrinNames);
 }
 
-static bool ArmSveAliasValid(unsigned BuiltinID, StringRef AliasName) {
+static bool ArmSveAliasValid(ASTContext , unsigned BuiltinID,
+ StringRef AliasName) {
+  if (Context.BuiltinInfo.isAuxBuiltinID(BuiltinID))
+BuiltinID = Context.BuiltinInfo.getAuxBuiltinID(BuiltinID);
   return BuiltinID >= AArch64::FirstSVEBuiltin &&
  BuiltinID <= AArch64::LastSVEBuiltin;
 }
@@ -5180,7 +5183,7 @@
   StringRef AliasName = cast(D)->getIdentifier()->getName();
 
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (!IsAArch64 && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName))) {
 S.Diag(AL.getLoc(), diag::err_attribute_arm_builtin_alias);
@@ -5210,7 +5213,7 @@
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
   bool IsARM = S.Context.getTargetInfo().getTriple().isARM();
   bool IsRISCV = S.Context.getTargetInfo().getTriple().isRISCV();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (IsARM && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName)) ||
   (IsRISCV && !RISCVAliasValid(BuiltinID, AliasName)) ||

[PATCH] D104539: [Sema][SVE] Properly match builtin ID when using aux target

2021-06-18 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, joechrisellis, 
sdesmalen.
Herald added subscribers: psnobl, tschuett.
Herald added a reviewer: efriedma.
Herald added a reviewer: aaron.ballman.
bsmith requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D104539

Files:
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/Sema/aarch64-sve-alias-attribute.c


Index: clang/test/Sema/aarch64-sve-alias-attribute.c
===
--- /dev/null
+++ clang/test/Sema/aarch64-sve-alias-attribute.c
@@ -0,0 +1,4 @@
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -aux-triple 
aarch64-none-unknown-eabi -target-feature +sve -fopenmp-is-device -fopenmp 
-verify -fsyntax-only %s
+
+static __inline__ 
__attribute__((__clang_arm_builtin_alias(__builtin_sve_svabd_n_f64_m))) // 
expected-no-diagnostics
+void nop(void);
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -5163,7 +5163,9 @@
   return ArmBuiltinAliasValid(BuiltinID, AliasName, Map, IntrinNames);
 }
 
-static bool ArmSveAliasValid(unsigned BuiltinID, StringRef AliasName) {
+static bool ArmSveAliasValid(ASTContext , unsigned BuiltinID, 
StringRef AliasName) {
+  if (Context.BuiltinInfo.isAuxBuiltinID(BuiltinID))
+BuiltinID = Context.BuiltinInfo.getAuxBuiltinID(BuiltinID);
   return BuiltinID >= AArch64::FirstSVEBuiltin &&
  BuiltinID <= AArch64::LastSVEBuiltin;
 }
@@ -5180,7 +5182,7 @@
   StringRef AliasName = cast(D)->getIdentifier()->getName();
 
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (!IsAArch64 && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName))) {
 S.Diag(AL.getLoc(), diag::err_attribute_arm_builtin_alias);
@@ -5210,7 +5212,7 @@
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
   bool IsARM = S.Context.getTargetInfo().getTriple().isARM();
   bool IsRISCV = S.Context.getTargetInfo().getTriple().isRISCV();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (IsARM && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName)) ||
   (IsRISCV && !RISCVAliasValid(BuiltinID, AliasName)) ||


Index: clang/test/Sema/aarch64-sve-alias-attribute.c
===
--- /dev/null
+++ clang/test/Sema/aarch64-sve-alias-attribute.c
@@ -0,0 +1,4 @@
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -aux-triple aarch64-none-unknown-eabi -target-feature +sve -fopenmp-is-device -fopenmp -verify -fsyntax-only %s
+
+static __inline__ __attribute__((__clang_arm_builtin_alias(__builtin_sve_svabd_n_f64_m))) // expected-no-diagnostics
+void nop(void);
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -5163,7 +5163,9 @@
   return ArmBuiltinAliasValid(BuiltinID, AliasName, Map, IntrinNames);
 }
 
-static bool ArmSveAliasValid(unsigned BuiltinID, StringRef AliasName) {
+static bool ArmSveAliasValid(ASTContext , unsigned BuiltinID, StringRef AliasName) {
+  if (Context.BuiltinInfo.isAuxBuiltinID(BuiltinID))
+BuiltinID = Context.BuiltinInfo.getAuxBuiltinID(BuiltinID);
   return BuiltinID >= AArch64::FirstSVEBuiltin &&
  BuiltinID <= AArch64::LastSVEBuiltin;
 }
@@ -5180,7 +5182,7 @@
   StringRef AliasName = cast(D)->getIdentifier()->getName();
 
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (!IsAArch64 && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName))) {
 S.Diag(AL.getLoc(), diag::err_attribute_arm_builtin_alias);
@@ -5210,7 +5212,7 @@
   bool IsAArch64 = S.Context.getTargetInfo().getTriple().isAArch64();
   bool IsARM = S.Context.getTargetInfo().getTriple().isARM();
   bool IsRISCV = S.Context.getTargetInfo().getTriple().isRISCV();
-  if ((IsAArch64 && !ArmSveAliasValid(BuiltinID, AliasName)) ||
+  if ((IsAArch64 && !ArmSveAliasValid(S.Context, BuiltinID, AliasName)) ||
   (IsARM && !ArmMveAliasValid(BuiltinID, AliasName) &&
!ArmCdeAliasValid(BuiltinID, AliasName)) ||
   (IsRISCV && !RISCVAliasValid(BuiltinID, AliasName)) ||
___
cfe-commits mailing list

[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64Subtarget.h:298-299
+   bool LittleEndian,
+   unsigned MinSVEVectorSizeInBitsOverride = 0,
+   unsigned MaxSVEVectorSizeInBitsOverride = 0);
 

paulwalker-arm wrote:
> Out of interest are these defaults ever relied upon?
Only in the case of manual construction of a subtarget, for example a unit test 
etc. For normal cases this will always be passed in based on the attribute or 
command line options.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 352101.
bsmith marked an inline comment as done.
bsmith added a comment.

- Ensure user input is sanitized for when asserts are not enabled
- Fix clang format issues


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

Files:
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
  llvm/test/CodeGen/AArch64/sve-vscale-attr.ll

Index: llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s | FileCheck %s --check-prefixes=CHECK,CHECK-NOARG
+; RUN: llc -aarch64-sve-vector-bits-min=512 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-ARG
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define void @func_vscale_none(<16 x i32>* %a, <16 x i32>* %b) #0 {
+; CHECK-NOARG-LABEL: func_vscale_none:
+; CHECK-NOARG:   // %bb.0:
+; CHECK-NOARG-NEXT:ldp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ldp q2, q3, [x1]
+; CHECK-NOARG-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NOARG-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NOARG-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NOARG-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NOARG-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NOARG-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NOARG-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NOARG-NEXT:stp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ret
+;
+; CHECK-ARG-LABEL: func_vscale_none:
+; CHECK-ARG:   // %bb.0:
+; CHECK-ARG-NEXT:ptrue p0.s, vl16
+; CHECK-ARG-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-ARG-NEXT:ld1w { z1.s }, p0/z, [x1]
+; CHECK-ARG-NEXT:add z0.s, p0/m, z0.s, z1.s
+; CHECK-ARG-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-ARG-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #0 = { "target-features"="+sve" }
+
+define void @func_vscale1_1(<16 x i32>* %a, <16 x i32>* %b) #1 {
+; CHECK-LABEL: func_vscale1_1:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ldp q0, q1, [x0]
+; CHECK-NEXT:ldp q2, q3, [x1]
+; CHECK-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NEXT:stp q0, q1, [x0]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #1 = { "target-features"="+sve" vscale_range(1,1) }
+
+define void @func_vscale2_2(<16 x i32>* %a, <16 x i32>* %b) #2 {
+; CHECK-LABEL: func_vscale2_2:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #2 = { "target-features"="+sve" vscale_range(2,2) }
+
+define void @func_vscale2_4(<16 x i32>* %a, <16 x i32>* %b) #3 {
+; CHECK-LABEL: func_vscale2_4:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #3 = { "target-features"="+sve" vscale_range(2,4) }
+
+define void @func_vscale4_4(<16 x i32>* %a, <16 x i32>* %b) #4 {
+; CHECK-LABEL: func_vscale4_4:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl16
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, 

[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-14 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64Subtarget.cpp:226
+ : SVEVectorBitsMaxOpt),
   TargetTriple(TT), FrameLowering(),
   InstrInfo(initializeSubtargetDependencies(FS, CPU)), TSInfo(),

sdesmalen wrote:
> nit: Does this need an assert that Min|MaxSVEVectorSizeInBits is zero when 
> SVE is not enabled in the feature string?
In principal you could, however I'm not sure it adds any value, as the 
accessors already assert that SVE is enabled. (And in principal this is a 
generic attribute, not an SVE one).



Comment at: llvm/lib/Target/AArch64/AArch64TargetMachine.cpp:357
+  Attribute VScaleRangeAttr = F.getFnAttribute(Attribute::VScaleRange);
+  if (VScaleRangeAttr.isValid())
+std::tie(MinSVEVectorSize, MaxSVEVectorSize) =

paulwalker-arm wrote:
> I don't know if this is possible but I feel we need a `HasSVE` like check 
> here?
I'm not sure this is really doable here without picking apart the feature 
string, I think it makes more sense to just set the values and assert when 
using the accessors without SVE enabled.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103702: [AArch64][SVE] Wire up vscale_range attribute to SVE min/max vector queries

2021-06-14 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 351910.
bsmith marked 7 inline comments as done.
bsmith added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

- Move attribute/command line logic into AArch64TargetMachine.
- Fix issue with subtarget key appending integers.
- Add end-to-end test.
- Make new subtarget options optional.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103702/new/

https://reviews.llvm.org/D103702

Files:
  clang/test/CodeGen/aarch64-sve-vector-bits-codegen.c
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
  llvm/test/CodeGen/AArch64/sve-vscale-attr.ll

Index: llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-vscale-attr.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s | FileCheck %s --check-prefixes=CHECK,CHECK-NOARG
+; RUN: llc -aarch64-sve-vector-bits-min=512 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-ARG
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define void @func_vscale_none(<16 x i32>* %a, <16 x i32>* %b) #0 {
+; CHECK-NOARG-LABEL: func_vscale_none:
+; CHECK-NOARG:   // %bb.0:
+; CHECK-NOARG-NEXT:ldp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ldp q2, q3, [x1]
+; CHECK-NOARG-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NOARG-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NOARG-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NOARG-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NOARG-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NOARG-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NOARG-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NOARG-NEXT:stp q0, q1, [x0]
+; CHECK-NOARG-NEXT:ret
+;
+; CHECK-ARG-LABEL: func_vscale_none:
+; CHECK-ARG:   // %bb.0:
+; CHECK-ARG-NEXT:ptrue p0.s, vl16
+; CHECK-ARG-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-ARG-NEXT:ld1w { z1.s }, p0/z, [x1]
+; CHECK-ARG-NEXT:add z0.s, p0/m, z0.s, z1.s
+; CHECK-ARG-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-ARG-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #0 = { "target-features"="+sve" }
+
+define void @func_vscale1_1(<16 x i32>* %a, <16 x i32>* %b) #1 {
+; CHECK-LABEL: func_vscale1_1:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ldp q0, q1, [x0]
+; CHECK-NEXT:ldp q2, q3, [x1]
+; CHECK-NEXT:ldp q4, q5, [x0, #32]
+; CHECK-NEXT:ldp q7, q6, [x1, #32]
+; CHECK-NEXT:add v1.4s, v1.4s, v3.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NEXT:add v2.4s, v5.4s, v6.4s
+; CHECK-NEXT:add v3.4s, v4.4s, v7.4s
+; CHECK-NEXT:stp q3, q2, [x0, #32]
+; CHECK-NEXT:stp q0, q1, [x0]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #1 = { "target-features"="+sve" vscale_range(1,1) }
+
+define void @func_vscale2_2(<16 x i32>* %a, <16 x i32>* %b) #2 {
+; CHECK-LABEL: func_vscale2_2:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #2 = { "target-features"="+sve" vscale_range(2,2) }
+
+define void @func_vscale2_4(<16 x i32>* %a, <16 x i32>* %b) #3 {
+; CHECK-LABEL: func_vscale2_4:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ptrue p0.s, vl8
+; CHECK-NEXT:add x8, x0, #32 // =32
+; CHECK-NEXT:add x9, x1, #32 // =32
+; CHECK-NEXT:ld1w { z0.s }, p0/z, [x0]
+; CHECK-NEXT:ld1w { z1.s }, p0/z, [x8]
+; CHECK-NEXT:ld1w { z2.s }, p0/z, [x1]
+; CHECK-NEXT:ld1w { z3.s }, p0/z, [x9]
+; CHECK-NEXT:add z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT:add z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT:st1w { z0.s }, p0, [x0]
+; CHECK-NEXT:st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:ret
+  %op1 = load <16 x i32>, <16 x i32>* %a
+  %op2 = load <16 x i32>, <16 x i32>* %b
+  %res = add <16 x i32> %op1, %op2
+  store <16 x i32> %res, <16 x i32>* %a
+  ret void
+}
+
+attributes #3 = { "target-features"="+sve" vscale_range(2,4) }
+
+define void @func_vscale4_4(<16 x i32>* %a, <16 x i32>* %b) #4 {
+; CHECK-LABEL: 

[PATCH] D103082: [AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics

2021-06-07 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG60c9b5f35cae: [AArch64][SVE] Improve codegen for dupq SVE 
ACLE intrinsics (authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103082/new/

https://reviews.llvm.org/D103082

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq_const.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
@@ -0,0 +1,397 @@
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+; DUPQ b8
+
+define  @dupq_b_0() #0 {
+; CHECK-LABEL: @dupq_b_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_d() #0 {
+; CHECK-LABEL: @dupq_b_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_w() #0 {
+; CHECK-LABEL: @dupq_b_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_h() #0 {
+; CHECK-LABEL: @dupq_b_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv8i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_b() #0 {
+; CHECK-LABEL: @dupq_b_b(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+; CHECK-NEXT: ret  %1
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+; DUPQ b16
+
+define  @dupq_h_0() #0 {
+; CHECK-LABEL: @dupq_h_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_d() #0 {
+; CHECK-LABEL: @dupq_h_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_w() 

[PATCH] D103082: [AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics

2021-06-04 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 349827.
bsmith marked an inline comment as done.
bsmith added a comment.

- Remove unnecessary complexity when zero-extending dupq operands into a vector.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103082/new/

https://reviews.llvm.org/D103082

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq_const.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
@@ -0,0 +1,397 @@
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+; DUPQ b8
+
+define  @dupq_b_0() #0 {
+; CHECK-LABEL: @dupq_b_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_d() #0 {
+; CHECK-LABEL: @dupq_b_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_w() #0 {
+; CHECK-LABEL: @dupq_b_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_h() #0 {
+; CHECK-LABEL: @dupq_b_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv8i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_b() #0 {
+; CHECK-LABEL: @dupq_b_b(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+; CHECK-NEXT: ret  %1
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+; DUPQ b16
+
+define  @dupq_h_0() #0 {
+; CHECK-LABEL: @dupq_h_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_d() #0 {
+; CHECK-LABEL: @dupq_h_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_w() #0 {
+; CHECK-LABEL: @dupq_h_w(
+; CHECK: %1 = call  

[PATCH] D103082: [AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics

2021-06-03 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 349525.
bsmith added a comment.

- Use !isZero() in place of getZExtValue() != 0
- Add end to end tests for ptrue transformation


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103082/new/

https://reviews.llvm.org/D103082

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq_const.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
@@ -0,0 +1,397 @@
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+; DUPQ b8
+
+define  @dupq_b_0() #0 {
+; CHECK-LABEL: @dupq_b_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_d() #0 {
+; CHECK-LABEL: @dupq_b_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_w() #0 {
+; CHECK-LABEL: @dupq_b_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_h() #0 {
+; CHECK-LABEL: @dupq_b_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv8i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_b() #0 {
+; CHECK-LABEL: @dupq_b_b(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+; CHECK-NEXT: ret  %1
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+; DUPQ b16
+
+define  @dupq_h_0() #0 {
+; CHECK-LABEL: @dupq_h_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_d() #0 {
+; CHECK-LABEL: @dupq_h_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_w() #0 {
+; CHECK-LABEL: @dupq_h_w(
+; CHECK: %1 = call  

[PATCH] D103082: [AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics

2021-06-02 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 349241.
bsmith retitled this revision from "[AArch64][SVE] Optimize svbool dupq ACLE 
intrinsic to fixed predicate patterns" to "[AArch64][SVE] Improve codegen for 
dupq SVE ACLE intrinsics".
bsmith edited the summary of this revision.
bsmith added a comment.

- Rework approach to use llvm.experimental.vector.insert instead of introducing 
a new LLVM intrinsic
- Also apply changes to all non-svbool variants.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103082/new/

https://reviews.llvm.org/D103082

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
@@ -0,0 +1,397 @@
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+; DUPQ b8
+
+define  @dupq_b_0() #0 {
+; CHECK-LABEL: @dupq_b_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_d() #0 {
+; CHECK-LABEL: @dupq_b_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_w() #0 {
+; CHECK-LABEL: @dupq_b_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_h() #0 {
+; CHECK-LABEL: @dupq_b_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv8i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_b_b() #0 {
+; CHECK-LABEL: @dupq_b_b(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+; CHECK-NEXT: ret  %1
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv16i8.v16i8( undef,
+<16 x i8> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv16i8( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv16i8( %1,  %3,  %4)
+  ret  %5
+}
+
+; DUPQ b16
+
+define  @dupq_h_0() #0 {
+; CHECK-LABEL: @dupq_h_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
+  %5 = tail call  @llvm.aarch64.sve.cmpne.wide.nxv8i16( %1,  %3,  %4)
+  ret  %5
+}
+
+define  @dupq_h_d() #0 {
+; CHECK-LABEL: @dupq_h_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+  %2 = tail call  @llvm.experimental.vector.insert.nxv8i16.v8i16( undef,
+<8 x i16> , i64 0)
+  %3 = tail call  @llvm.aarch64.sve.dupq.lane.nxv8i16( %2 , i64 0)
+  %4 = tail call  

[PATCH] D103082: [AArch64][SVE] Optimize svbool dupq ACLE intrinsic to fixed predicate patterns

2021-05-25 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, joechrisellis, 
sdesmalen.
Herald added subscribers: psnobl, hiraditya, kristof.beyls, tschuett.
Herald added a reviewer: efriedma.
bsmith requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This patch introduces a new dupq LLVM intrinsic which is emitted upon
encountering the svbool dupq ACLE intrinsics, instead of expanding them
directly.

This allows us to defer the expansion of said intrinsic until much
later when we can reasonably optimize to fixed predicates using ptrue.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D103082

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/CodeGen/AArch64/sve-intrinsics-dupq.ll
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-dupq.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-dupq.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-dupq.ll
@@ -0,0 +1,195 @@
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+; DUPQ b8
+
+define  @dupq_b_0() #0 {
+; CHECK-LABEL: @dupq_b_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.dupq.b8(i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false,
+  i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false)
+  ret  %1
+}
+
+define  @dupq_b_d() #0 {
+; CHECK-LABEL: @dupq_b_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.dupq.b8(i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false,
+  i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false)
+  ret  %1
+}
+
+define  @dupq_b_w() #0 {
+; CHECK-LABEL: @dupq_b_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.dupq.b8(i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false,
+  i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false)
+  ret  %1
+}
+
+define  @dupq_b_h() #0 {
+; CHECK-LABEL: @dupq_b_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv8i1( %1)
+; CHECK-NEXT: ret  %2
+  %1 = tail call  @llvm.aarch64.sve.dupq.b8(i1 true, i1 false, i1 true, i1 false, i1 true, i1 false, i1 true, i1 false,
+  i1 true, i1 false, i1 true, i1 false, i1 true, i1 false, i1 true, i1 false)
+  ret  %1
+}
+
+define  @dupq_b_b() #0 {
+; CHECK-LABEL: @dupq_b_b(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+; CHECK-NEXT: ret  %1
+  %1 = tail call  @llvm.aarch64.sve.dupq.b8(i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true,
+  i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true)
+  ret  %1
+}
+
+; DUPQ b16
+
+define  @dupq_h_0() #0 {
+; CHECK-LABEL: @dupq_h_0(
+; CHECK: ret  zeroinitializer
+  %1 = tail call  @llvm.aarch64.sve.dupq.b16(i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false)
+  ret  %1
+}
+
+define  @dupq_h_d() #0 {
+; CHECK-LABEL: @dupq_h_d(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv2i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.dupq.b16(i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false)
+  ret  %1
+}
+
+define  @dupq_h_w() #0 {
+; CHECK-LABEL: @dupq_h_w(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
+; CHECK-NEXT: %2 = call  @llvm.aarch64.sve.convert.to.svbool.nxv4i1( %1)
+; CHECK-NEXT: %3 = call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( %2)
+; CHECK-NEXT: ret  %3
+  %1 = tail call  @llvm.aarch64.sve.dupq.b16(i1 true, i1 false, i1 true, i1 false, i1 true, i1 false, i1 true, i1 false)
+  ret  %1
+}
+
+define  @dupq_h_h() #0 {
+; CHECK-LABEL: @dupq_h_h(
+; CHECK: %1 = call  @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
+; CHECK-NEXT: ret  

[PATCH] D102623: [CodeGen][AArch64][SVE] Canonicalize intrinsic rdffr{ => _z}

2021-05-19 Thread Bradley Smith via Phabricator via cfe-commits
bsmith accepted this revision.
bsmith added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102623/new/

https://reviews.llvm.org/D102623

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98030: [IR] Add vscale_range IR function attribute

2021-03-22 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG48f5a392cb73: [IR] Add vscale_range IR function attribute 
(authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98030/new/

https://reviews.llvm.org/D98030

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  llvm/docs/BitCodeFormat.rst
  llvm/docs/LangRef.rst
  llvm/include/llvm/Bitcode/LLVMBitCodes.h
  llvm/include/llvm/IR/Attributes.h
  llvm/include/llvm/IR/Attributes.td
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/AsmParser/LLParser.h
  llvm/lib/AsmParser/LLToken.h
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AttributeImpl.h
  llvm/lib/IR/Attributes.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
  llvm/lib/Transforms/Utils/CodeExtractor.cpp
  llvm/test/Bitcode/attributes.ll
  llvm/test/Verifier/vscale_range.ll

Index: llvm/test/Verifier/vscale_range.ll
===
--- /dev/null
+++ llvm/test/Verifier/vscale_range.ll
@@ -0,0 +1,4 @@
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+
+; CHECK: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @b(i32*) vscale_range(8, 1)
Index: llvm/test/Bitcode/attributes.ll
===
--- llvm/test/Bitcode/attributes.ll
+++ llvm/test/Bitcode/attributes.ll
@@ -422,6 +422,31 @@
   ret void
 }
 
+; CHECK: define void @f72() #45
+define void @f72() vscale_range(8)
+{
+  ret void
+}
+
+; CHECK: define void @f73() #46
+define void @f73() vscale_range(1,8)
+{
+  ret void
+}
+
+; CHECK: define void @f74() #47
+define void @f74() vscale_range(1,0)
+{
+  ret void
+}
+
+; CHECK: define void @f75()
+; CHECK-NOT: define void @f75() #
+define void @f75() vscale_range(0,0)
+{
+  ret void
+}
+
 ; CHECK: attributes #0 = { noreturn }
 ; CHECK: attributes #1 = { nounwind }
 ; CHECK: attributes #2 = { readnone }
@@ -467,4 +492,7 @@
 ; CHECK: attributes #42 = { nocallback }
 ; CHECK: attributes #43 = { cold }
 ; CHECK: attributes #44 = { hot }
+; CHECK: attributes #45 = { vscale_range(8,8) }
+; CHECK: attributes #46 = { vscale_range(1,8) }
+; CHECK: attributes #47 = { vscale_range(1,0) }
 ; CHECK: attributes #[[NOBUILTIN]] = { nobuiltin }
Index: llvm/lib/Transforms/Utils/CodeExtractor.cpp
===
--- llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -970,6 +970,7 @@
   case Attribute::StackProtectStrong:
   case Attribute::StrictFP:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NoCfCheck:
   case Attribute::MustProgress:
   case Attribute::NoProfile:
Index: llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
===
--- llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
+++ llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
@@ -73,6 +73,7 @@
   .Case("sspstrong", Attribute::StackProtectStrong)
   .Case("strictfp", Attribute::StrictFP)
   .Case("uwtable", Attribute::UWTable)
+  .Case("vscale_range", Attribute::VScaleRange)
   .Default(Attribute::None);
 }
 
Index: llvm/lib/IR/Verifier.cpp
===
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -1629,6 +1629,7 @@
   case Attribute::InlineHint:
   case Attribute::StackAlignment:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NonLazyBind:
   case Attribute::ReturnsTwice:
   case Attribute::SanitizeAddress:
@@ -1987,6 +1988,14 @@
   return;
   }
 
+  if (Attrs.hasFnAttribute(Attribute::VScaleRange)) {
+std::pair Args =
+Attrs.getVScaleRangeArgs(AttributeList::FunctionIndex);
+
+if (Args.first > Args.second && Args.second != 0)
+  CheckFailed("'vscale_range' minimum cannot be greater than maximum", V);
+  }
+
   if (Attrs.hasFnAttribute("frame-pointer")) {
 StringRef FP = Attrs.getAttribute(AttributeList::FunctionIndex,
   "frame-pointer").getValueAsString();
Index: llvm/lib/IR/Attributes.cpp
===
--- llvm/lib/IR/Attributes.cpp
+++ llvm/lib/IR/Attributes.cpp
@@ -78,6 +78,17 @@
   return std::make_pair(ElemSizeArg, NumElemsArg);
 }
 
+static uint64_t packVScaleRangeArgs(unsigned MinValue, unsigned MaxValue) {
+  return uint64_t(MinValue) << 32 | MaxValue;
+}
+
+static std::pair unpackVScaleRangeArgs(uint64_t Value) {
+  unsigned MaxValue = Value & std::numeric_limits::max();
+  unsigned MinValue = Value >> 32;
+
+  return std::make_pair(MinValue, MaxValue);
+}
+
 Attribute 

[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-17 Thread Bradley Smith via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGcf0da91ba5e1: [AArch64][SVE/NEON] Add support for FROUNDEVEN 
for both NEON and fixed length… (authored by bsmith).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98487/new/

https://reviews.llvm.org/D98487

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-neon-intrinsics.c
  clang/test/CodeGen/aarch64-neon-misc.c
  clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c
  clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c
  clang/test/CodeGen/arm-neon-directed-rounding.c
  clang/test/CodeGen/arm64-vrnd.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/Target/TargetSelectionDAG.td
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/test/CodeGen/AArch64/arm64-vcvt.ll
  llvm/test/CodeGen/AArch64/arm64-vfloatintrinsics.ll
  llvm/test/CodeGen/AArch64/f16-instructions.ll
  llvm/test/CodeGen/AArch64/fp-intrinsics.ll
  llvm/test/CodeGen/AArch64/frintn.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
  llvm/test/CodeGen/AArch64/vec-libcalls.ll

Index: llvm/test/CodeGen/AArch64/vec-libcalls.ll
===
--- llvm/test/CodeGen/AArch64/vec-libcalls.ll
+++ llvm/test/CodeGen/AArch64/vec-libcalls.ll
@@ -29,6 +29,7 @@
 declare <3 x float> @llvm.nearbyint.v3f32(<3 x float>)
 declare <3 x float> @llvm.rint.v3f32(<3 x float>)
 declare <3 x float> @llvm.round.v3f32(<3 x float>)
+declare <3 x float> @llvm.roundeven.v3f32(<3 x float>)
 declare <3 x float> @llvm.sqrt.v3f32(<3 x float>)
 declare <3 x float> @llvm.trunc.v3f32(<3 x float>)
 
@@ -478,6 +479,15 @@
   ret <3 x float> %r
 }
 
+define <3 x float> @roundeven_v3f32(<3 x float> %x) nounwind {
+; CHECK-LABEL: roundeven_v3f32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:frintn v0.4s, v0.4s
+; CHECK-NEXT:ret
+  %r = call <3 x float> @llvm.roundeven.v3f32(<3 x float> %x)
+  ret <3 x float> %r
+}
+
 define <3 x float> @sqrt_v3f32(<3 x float> %x) nounwind {
 ; CHECK-LABEL: sqrt_v3f32:
 ; CHECK:   // %bb.0:
Index: llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
===
--- llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
+++ llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
@@ -1255,6 +1255,253 @@
   ret void
 }
 
+;
+; ROUNDEVEN -> FRINTN
+;
+
+; Don't use SVE for 64-bit vectors.
+define <4 x half> @frintn_v4f16(<4 x half> %op) #0 {
+; CHECK-LABEL: frintn_v4f16:
+; CHECK: frintn v0.4h, v0.4h
+; CHECK-NEXT: ret
+  %res = call <4 x half> @llvm.roundeven.v4f16(<4 x half> %op)
+  ret <4 x half> %res
+}
+
+; Don't use SVE for 128-bit vectors.
+define <8 x half> @frintn_v8f16(<8 x half> %op) #0 {
+; CHECK-LABEL: frintn_v8f16:
+; CHECK: frintn v0.8h, v0.8h
+; CHECK-NEXT: ret
+  %res = call <8 x half> @llvm.roundeven.v8f16(<8 x half> %op)
+  ret <8 x half> %res
+}
+
+define void @frintn_v16f16(<16 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v16f16:
+; CHECK: ptrue [[PG:p[0-9]+]].h, vl16
+; CHECK-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; CHECK-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; CHECK-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; CHECK-NEXT: ret
+  %op = load <16 x half>, <16 x half>* %a
+  %res = call <16 x half> @llvm.roundeven.v16f16(<16 x half> %op)
+  store <16 x half> %res, <16 x half>* %a
+  ret void
+}
+
+define void @frintn_v32f16(<32 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v32f16:
+; VBITS_GE_512: ptrue [[PG:p[0-9]+]].h, vl32
+; VBITS_GE_512-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_512-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_512-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_512-NEXT: ret
+
+; Ensure sensible type legalisation.
+; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+; VBITS_EQ_256-DAG: add x[[A_HI:[0-9]+]], x0, #32
+; VBITS_EQ_256-DAG: ld1h { [[OP_LO:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_EQ_256-DAG: ld1h { [[OP_HI:z[0-9]+]].h }, [[PG]]/z, [x[[A_HI]]]
+; VBITS_EQ_256-DAG: frintn [[RES_LO:z[0-9]+]].h, [[PG]]/m, [[OP_LO]].h
+; VBITS_EQ_256-DAG: frintn [[RES_HI:z[0-9]+]].h, [[PG]]/m, [[OP_HI]].h
+; VBITS_EQ_256-DAG: st1h { [[RES_LO]].h }, [[PG]], [x0]
+; VBITS_EQ_256-DAG: st1h { [[RES_HI]].h }, [[PG]], [x[[A_HI]]
+; VBITS_EQ_256-NEXT: ret
+  %op = load <32 x half>, <32 x half>* %a
+  %res = call <32 x half> @llvm.roundeven.v32f16(<32 x half> %op)
+  store <32 x half> %res, <32 x half>* %a
+  ret void
+}
+
+define void @frintn_v64f16(<64 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v64f16:
+; VBITS_GE_1024: ptrue [[PG:p[0-9]+]].h, vl64
+; VBITS_GE_1024-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_1024-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_1024-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_1024-NEXT: 

[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-16 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 330951.
bsmith added a comment.

- Remove `SDTFPRoundEvenOp` as it's not a correct mirror of `SDTFPRoundOp` 
since that is not for `ISD::FROUND`.
- Fix comments in `include/llvm/Target/TargetSelectionDAG.td` for 
`SDTFPRoundOp` and `SDTFPExtendOp`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98487/new/

https://reviews.llvm.org/D98487

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-neon-intrinsics.c
  clang/test/CodeGen/aarch64-neon-misc.c
  clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c
  clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c
  clang/test/CodeGen/arm-neon-directed-rounding.c
  clang/test/CodeGen/arm64-vrnd.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/Target/TargetSelectionDAG.td
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/test/CodeGen/AArch64/arm64-vcvt.ll
  llvm/test/CodeGen/AArch64/arm64-vfloatintrinsics.ll
  llvm/test/CodeGen/AArch64/f16-instructions.ll
  llvm/test/CodeGen/AArch64/fp-intrinsics.ll
  llvm/test/CodeGen/AArch64/frintn.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
  llvm/test/CodeGen/AArch64/vec-libcalls.ll

Index: llvm/test/CodeGen/AArch64/vec-libcalls.ll
===
--- llvm/test/CodeGen/AArch64/vec-libcalls.ll
+++ llvm/test/CodeGen/AArch64/vec-libcalls.ll
@@ -29,6 +29,7 @@
 declare <3 x float> @llvm.nearbyint.v3f32(<3 x float>)
 declare <3 x float> @llvm.rint.v3f32(<3 x float>)
 declare <3 x float> @llvm.round.v3f32(<3 x float>)
+declare <3 x float> @llvm.roundeven.v3f32(<3 x float>)
 declare <3 x float> @llvm.sqrt.v3f32(<3 x float>)
 declare <3 x float> @llvm.trunc.v3f32(<3 x float>)
 
@@ -478,6 +479,15 @@
   ret <3 x float> %r
 }
 
+define <3 x float> @roundeven_v3f32(<3 x float> %x) nounwind {
+; CHECK-LABEL: roundeven_v3f32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:frintn v0.4s, v0.4s
+; CHECK-NEXT:ret
+  %r = call <3 x float> @llvm.roundeven.v3f32(<3 x float> %x)
+  ret <3 x float> %r
+}
+
 define <3 x float> @sqrt_v3f32(<3 x float> %x) nounwind {
 ; CHECK-LABEL: sqrt_v3f32:
 ; CHECK:   // %bb.0:
Index: llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
===
--- llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
+++ llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
@@ -1255,6 +1255,253 @@
   ret void
 }
 
+;
+; ROUNDEVEN -> FRINTN
+;
+
+; Don't use SVE for 64-bit vectors.
+define <4 x half> @frintn_v4f16(<4 x half> %op) #0 {
+; CHECK-LABEL: frintn_v4f16:
+; CHECK: frintn v0.4h, v0.4h
+; CHECK-NEXT: ret
+  %res = call <4 x half> @llvm.roundeven.v4f16(<4 x half> %op)
+  ret <4 x half> %res
+}
+
+; Don't use SVE for 128-bit vectors.
+define <8 x half> @frintn_v8f16(<8 x half> %op) #0 {
+; CHECK-LABEL: frintn_v8f16:
+; CHECK: frintn v0.8h, v0.8h
+; CHECK-NEXT: ret
+  %res = call <8 x half> @llvm.roundeven.v8f16(<8 x half> %op)
+  ret <8 x half> %res
+}
+
+define void @frintn_v16f16(<16 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v16f16:
+; CHECK: ptrue [[PG:p[0-9]+]].h, vl16
+; CHECK-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; CHECK-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; CHECK-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; CHECK-NEXT: ret
+  %op = load <16 x half>, <16 x half>* %a
+  %res = call <16 x half> @llvm.roundeven.v16f16(<16 x half> %op)
+  store <16 x half> %res, <16 x half>* %a
+  ret void
+}
+
+define void @frintn_v32f16(<32 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v32f16:
+; VBITS_GE_512: ptrue [[PG:p[0-9]+]].h, vl32
+; VBITS_GE_512-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_512-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_512-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_512-NEXT: ret
+
+; Ensure sensible type legalisation.
+; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+; VBITS_EQ_256-DAG: add x[[A_HI:[0-9]+]], x0, #32
+; VBITS_EQ_256-DAG: ld1h { [[OP_LO:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_EQ_256-DAG: ld1h { [[OP_HI:z[0-9]+]].h }, [[PG]]/z, [x[[A_HI]]]
+; VBITS_EQ_256-DAG: frintn [[RES_LO:z[0-9]+]].h, [[PG]]/m, [[OP_LO]].h
+; VBITS_EQ_256-DAG: frintn [[RES_HI:z[0-9]+]].h, [[PG]]/m, [[OP_HI]].h
+; VBITS_EQ_256-DAG: st1h { [[RES_LO]].h }, [[PG]], [x0]
+; VBITS_EQ_256-DAG: st1h { [[RES_HI]].h }, [[PG]], [x[[A_HI]]
+; VBITS_EQ_256-NEXT: ret
+  %op = load <32 x half>, <32 x half>* %a
+  %res = call <32 x half> @llvm.roundeven.v32f16(<32 x half> %op)
+  store <32 x half> %res, <32 x half>* %a
+  ret void
+}
+
+define void @frintn_v64f16(<64 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v64f16:
+; VBITS_GE_1024: ptrue [[PG:p[0-9]+]].h, vl64
+; VBITS_GE_1024-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_1024-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_1024-NEXT: st1h { [[RES]].h }, [[PG]], [x0]

[PATCH] D98030: [IR] Add vscale_range IR function attribute

2021-03-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 330691.
bsmith marked 3 inline comments as done.
bsmith added a comment.

- Prevent vscale_range(0,0) from crashing and instead don't add the attribute
- Improve CHECK lines in arm-sve-vector-bits-vscale-range.c test
- Test vscale_range(0,0) case and move some tests around


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98030/new/

https://reviews.llvm.org/D98030

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  llvm/docs/BitCodeFormat.rst
  llvm/docs/LangRef.rst
  llvm/include/llvm/Bitcode/LLVMBitCodes.h
  llvm/include/llvm/IR/Attributes.h
  llvm/include/llvm/IR/Attributes.td
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/AsmParser/LLParser.h
  llvm/lib/AsmParser/LLToken.h
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AttributeImpl.h
  llvm/lib/IR/Attributes.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
  llvm/lib/Transforms/Utils/CodeExtractor.cpp
  llvm/test/Bitcode/attributes.ll
  llvm/test/Verifier/vscale_range.ll

Index: llvm/test/Verifier/vscale_range.ll
===
--- /dev/null
+++ llvm/test/Verifier/vscale_range.ll
@@ -0,0 +1,4 @@
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+
+; CHECK: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @b(i32*) vscale_range(8, 1)
Index: llvm/test/Bitcode/attributes.ll
===
--- llvm/test/Bitcode/attributes.ll
+++ llvm/test/Bitcode/attributes.ll
@@ -422,6 +422,31 @@
   ret void
 }
 
+; CHECK: define void @f72() #45
+define void @f72() vscale_range(8)
+{
+  ret void
+}
+
+; CHECK: define void @f73() #46
+define void @f73() vscale_range(1,8)
+{
+  ret void
+}
+
+; CHECK: define void @f74() #47
+define void @f74() vscale_range(1,0)
+{
+  ret void
+}
+
+; CHECK: define void @f75()
+; CHECK-NOT: define void @f75() #
+define void @f75() vscale_range(0,0)
+{
+  ret void
+}
+
 ; CHECK: attributes #0 = { noreturn }
 ; CHECK: attributes #1 = { nounwind }
 ; CHECK: attributes #2 = { readnone }
@@ -467,4 +492,7 @@
 ; CHECK: attributes #42 = { nocallback }
 ; CHECK: attributes #43 = { cold }
 ; CHECK: attributes #44 = { hot }
+; CHECK: attributes #45 = { vscale_range(8,8) }
+; CHECK: attributes #46 = { vscale_range(1,8) }
+; CHECK: attributes #47 = { vscale_range(1,0) }
 ; CHECK: attributes #[[NOBUILTIN]] = { nobuiltin }
Index: llvm/lib/Transforms/Utils/CodeExtractor.cpp
===
--- llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -970,6 +970,7 @@
   case Attribute::StackProtectStrong:
   case Attribute::StrictFP:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NoCfCheck:
   case Attribute::MustProgress:
   case Attribute::NoProfile:
Index: llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
===
--- llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
+++ llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
@@ -73,6 +73,7 @@
   .Case("sspstrong", Attribute::StackProtectStrong)
   .Case("strictfp", Attribute::StrictFP)
   .Case("uwtable", Attribute::UWTable)
+  .Case("vscale_range", Attribute::VScaleRange)
   .Default(Attribute::None);
 }
 
Index: llvm/lib/IR/Verifier.cpp
===
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -1629,6 +1629,7 @@
   case Attribute::InlineHint:
   case Attribute::StackAlignment:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NonLazyBind:
   case Attribute::ReturnsTwice:
   case Attribute::SanitizeAddress:
@@ -1987,6 +1988,14 @@
   return;
   }
 
+  if (Attrs.hasFnAttribute(Attribute::VScaleRange)) {
+std::pair Args =
+Attrs.getVScaleRangeArgs(AttributeList::FunctionIndex);
+
+if (Args.first > Args.second && Args.second != 0)
+  CheckFailed("'vscale_range' minimum cannot be greater than maximum", V);
+  }
+
   if (Attrs.hasFnAttribute("frame-pointer")) {
 StringRef FP = Attrs.getAttribute(AttributeList::FunctionIndex,
   "frame-pointer").getValueAsString();
Index: llvm/lib/IR/Attributes.cpp
===
--- llvm/lib/IR/Attributes.cpp
+++ llvm/lib/IR/Attributes.cpp
@@ -78,6 +78,17 @@
   return std::make_pair(ElemSizeArg, NumElemsArg);
 }
 
+static uint64_t packVScaleRangeArgs(unsigned MinValue, unsigned MaxValue) {
+  return uint64_t(MinValue) << 32 | MaxValue;
+}
+
+static std::pair unpackVScaleRangeArgs(uint64_t Value) {
+  unsigned MaxValue = Value & std::numeric_limits::max();
+  unsigned MinValue 

[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: llvm/include/llvm/Target/TargetSelectionDAG.td:158
 ]>;
+def SDTFPRoundEvenOp  : SDTypeProfile<1, 1, [   // froundeven
+  SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<0, 1>, 
SDTCisSameNumEltsAs<0, 1>

dmgreen wrote:
> Is this used? The one above should maybe say `// fpround`?
No it's not, I added it for consistency, but perhaps I shouldn't? I think 
fround is correct for the one above, or at least is consistent with the others 
in this file, for example fextend below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98487/new/

https://reviews.llvm.org/D98487

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 330629.
bsmith added a comment.
Herald added a subscriber: dexonsmith.

- Add AutoUpgrade code to convert aarch64.neon.frintn to roundeven
- Add test for above AutoUpgrade


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98487/new/

https://reviews.llvm.org/D98487

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-neon-intrinsics.c
  clang/test/CodeGen/aarch64-neon-misc.c
  clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c
  clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c
  clang/test/CodeGen/arm-neon-directed-rounding.c
  clang/test/CodeGen/arm64-vrnd.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/Target/TargetSelectionDAG.td
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/test/CodeGen/AArch64/arm64-vcvt.ll
  llvm/test/CodeGen/AArch64/arm64-vfloatintrinsics.ll
  llvm/test/CodeGen/AArch64/f16-instructions.ll
  llvm/test/CodeGen/AArch64/fp-intrinsics.ll
  llvm/test/CodeGen/AArch64/frintn.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
  llvm/test/CodeGen/AArch64/vec-libcalls.ll

Index: llvm/test/CodeGen/AArch64/vec-libcalls.ll
===
--- llvm/test/CodeGen/AArch64/vec-libcalls.ll
+++ llvm/test/CodeGen/AArch64/vec-libcalls.ll
@@ -29,6 +29,7 @@
 declare <3 x float> @llvm.nearbyint.v3f32(<3 x float>)
 declare <3 x float> @llvm.rint.v3f32(<3 x float>)
 declare <3 x float> @llvm.round.v3f32(<3 x float>)
+declare <3 x float> @llvm.roundeven.v3f32(<3 x float>)
 declare <3 x float> @llvm.sqrt.v3f32(<3 x float>)
 declare <3 x float> @llvm.trunc.v3f32(<3 x float>)
 
@@ -478,6 +479,15 @@
   ret <3 x float> %r
 }
 
+define <3 x float> @roundeven_v3f32(<3 x float> %x) nounwind {
+; CHECK-LABEL: roundeven_v3f32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:frintn v0.4s, v0.4s
+; CHECK-NEXT:ret
+  %r = call <3 x float> @llvm.roundeven.v3f32(<3 x float> %x)
+  ret <3 x float> %r
+}
+
 define <3 x float> @sqrt_v3f32(<3 x float> %x) nounwind {
 ; CHECK-LABEL: sqrt_v3f32:
 ; CHECK:   // %bb.0:
Index: llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
===
--- llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
+++ llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
@@ -1255,6 +1255,253 @@
   ret void
 }
 
+;
+; ROUNDEVEN -> FRINTN
+;
+
+; Don't use SVE for 64-bit vectors.
+define <4 x half> @frintn_v4f16(<4 x half> %op) #0 {
+; CHECK-LABEL: frintn_v4f16:
+; CHECK: frintn v0.4h, v0.4h
+; CHECK-NEXT: ret
+  %res = call <4 x half> @llvm.roundeven.v4f16(<4 x half> %op)
+  ret <4 x half> %res
+}
+
+; Don't use SVE for 128-bit vectors.
+define <8 x half> @frintn_v8f16(<8 x half> %op) #0 {
+; CHECK-LABEL: frintn_v8f16:
+; CHECK: frintn v0.8h, v0.8h
+; CHECK-NEXT: ret
+  %res = call <8 x half> @llvm.roundeven.v8f16(<8 x half> %op)
+  ret <8 x half> %res
+}
+
+define void @frintn_v16f16(<16 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v16f16:
+; CHECK: ptrue [[PG:p[0-9]+]].h, vl16
+; CHECK-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; CHECK-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; CHECK-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; CHECK-NEXT: ret
+  %op = load <16 x half>, <16 x half>* %a
+  %res = call <16 x half> @llvm.roundeven.v16f16(<16 x half> %op)
+  store <16 x half> %res, <16 x half>* %a
+  ret void
+}
+
+define void @frintn_v32f16(<32 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v32f16:
+; VBITS_GE_512: ptrue [[PG:p[0-9]+]].h, vl32
+; VBITS_GE_512-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_512-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_512-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_512-NEXT: ret
+
+; Ensure sensible type legalisation.
+; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+; VBITS_EQ_256-DAG: add x[[A_HI:[0-9]+]], x0, #32
+; VBITS_EQ_256-DAG: ld1h { [[OP_LO:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_EQ_256-DAG: ld1h { [[OP_HI:z[0-9]+]].h }, [[PG]]/z, [x[[A_HI]]]
+; VBITS_EQ_256-DAG: frintn [[RES_LO:z[0-9]+]].h, [[PG]]/m, [[OP_LO]].h
+; VBITS_EQ_256-DAG: frintn [[RES_HI:z[0-9]+]].h, [[PG]]/m, [[OP_HI]].h
+; VBITS_EQ_256-DAG: st1h { [[RES_LO]].h }, [[PG]], [x0]
+; VBITS_EQ_256-DAG: st1h { [[RES_HI]].h }, [[PG]], [x[[A_HI]]
+; VBITS_EQ_256-NEXT: ret
+  %op = load <32 x half>, <32 x half>* %a
+  %res = call <32 x half> @llvm.roundeven.v32f16(<32 x half> %op)
+  store <32 x half> %res, <32 x half>* %a
+  ret void
+}
+
+define void @frintn_v64f16(<64 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v64f16:
+; VBITS_GE_1024: ptrue [[PG:p[0-9]+]].h, vl64
+; VBITS_GE_1024-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_1024-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_1024-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_1024-NEXT: ret
+  %op = load <64 x half>, <64 x half>* %a
+  %res 

[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added a comment.

> Why is this patch only changing int_aarch64_neon_frintn and not 
> int_aarch64_sve_frintn?
> Is there a particular reason to do so?

Things are done slightly differently for SVE in this regard, in principal yes, 
we could emit roundeven instead of frintn from the ACLE intrinsic, however all 
of the other ACLE intrinsics also emit SVE specific LLVM intrinsics rather than 
the arch-indep nodes. This patch doesn't change that in order to stay 
consistent, if we did want to change that it should be done as a separate patch 
that changes all of them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98487/new/

https://reviews.llvm.org/D98487

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98030: [IR] Add vscale_range IR function attribute

2021-03-12 Thread Bradley Smith via Phabricator via cfe-commits
bsmith added inline comments.



Comment at: llvm/lib/IR/Attributes.cpp:570
+Result += utostr(MinValue);
+Result += ',';
+Result += utostr(MaxValue);

peterwaller-arm wrote:
> Nit: The only other precedent I can see for this is `allocsize`. Grepping the 
> code I found this is always written in tests as `allocsize(x, y)`, however, 
> it prints as `allocsize(x,y)` (no space) when done with `-emit-llvm`, 
> regardless of how it was formatted as input. I figure that is an oversight 
> and this should have the space.
I'm reluctant to change `allocsize` and given that is the only other example of 
multiple parameters to an attribute like this, I'm inclined to stay consistent 
with it. I also think there is an argument to be made about not intruding extra 
spaces in a potentially already cluttered attributes section.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98030/new/

https://reviews.llvm.org/D98030

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D98030: [IR] Add vscale_range IR function attribute

2021-03-12 Thread Bradley Smith via Phabricator via cfe-commits
bsmith updated this revision to Diff 330205.
bsmith marked 3 inline comments as done.
bsmith added a comment.

- State what lack of vscale_range attribute means in LangRef
- Minor formatting change


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98030/new/

https://reviews.llvm.org/D98030

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  llvm/docs/BitCodeFormat.rst
  llvm/docs/LangRef.rst
  llvm/include/llvm/Bitcode/LLVMBitCodes.h
  llvm/include/llvm/IR/Attributes.h
  llvm/include/llvm/IR/Attributes.td
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/AsmParser/LLParser.h
  llvm/lib/AsmParser/LLToken.h
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AttributeImpl.h
  llvm/lib/IR/Attributes.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
  llvm/lib/Transforms/Utils/CodeExtractor.cpp
  llvm/test/Bitcode/attributes.ll
  llvm/test/Verifier/vscale_range.ll

Index: llvm/test/Verifier/vscale_range.ll
===
--- /dev/null
+++ llvm/test/Verifier/vscale_range.ll
@@ -0,0 +1,7 @@
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+
+; CHECK-NOT: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @a(i32) vscale_range(1, 0)
+
+; CHECK: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @b(i32*) vscale_range(8, 1)
Index: llvm/test/Bitcode/attributes.ll
===
--- llvm/test/Bitcode/attributes.ll
+++ llvm/test/Bitcode/attributes.ll
@@ -422,6 +422,18 @@
   ret void
 }
 
+; CHECK: define void @f72() #45
+define void @f72() vscale_range(8)
+{
+  ret void
+}
+
+; CHECK: define void @f73() #46
+define void @f73() vscale_range(1,8)
+{
+  ret void
+}
+
 ; CHECK: attributes #0 = { noreturn }
 ; CHECK: attributes #1 = { nounwind }
 ; CHECK: attributes #2 = { readnone }
@@ -467,4 +479,6 @@
 ; CHECK: attributes #42 = { nocallback }
 ; CHECK: attributes #43 = { cold }
 ; CHECK: attributes #44 = { hot }
+; CHECK: attributes #45 = { vscale_range(8,8) }
+; CHECK: attributes #46 = { vscale_range(1,8) }
 ; CHECK: attributes #[[NOBUILTIN]] = { nobuiltin }
Index: llvm/lib/Transforms/Utils/CodeExtractor.cpp
===
--- llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -970,6 +970,7 @@
   case Attribute::StackProtectStrong:
   case Attribute::StrictFP:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NoCfCheck:
   case Attribute::MustProgress:
   case Attribute::NoProfile:
Index: llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
===
--- llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
+++ llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
@@ -73,6 +73,7 @@
   .Case("sspstrong", Attribute::StackProtectStrong)
   .Case("strictfp", Attribute::StrictFP)
   .Case("uwtable", Attribute::UWTable)
+  .Case("vscale_range", Attribute::VScaleRange)
   .Default(Attribute::None);
 }
 
Index: llvm/lib/IR/Verifier.cpp
===
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -1629,6 +1629,7 @@
   case Attribute::InlineHint:
   case Attribute::StackAlignment:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NonLazyBind:
   case Attribute::ReturnsTwice:
   case Attribute::SanitizeAddress:
@@ -1987,6 +1988,14 @@
   return;
   }
 
+  if (Attrs.hasFnAttribute(Attribute::VScaleRange)) {
+std::pair Args =
+Attrs.getVScaleRangeArgs(AttributeList::FunctionIndex);
+
+if (Args.first > Args.second && Args.second != 0)
+  CheckFailed("'vscale_range' minimum cannot be greater than maximum", V);
+  }
+
   if (Attrs.hasFnAttribute("frame-pointer")) {
 StringRef FP = Attrs.getAttribute(AttributeList::FunctionIndex,
   "frame-pointer").getValueAsString();
Index: llvm/lib/IR/Attributes.cpp
===
--- llvm/lib/IR/Attributes.cpp
+++ llvm/lib/IR/Attributes.cpp
@@ -78,6 +78,17 @@
   return std::make_pair(ElemSizeArg, NumElemsArg);
 }
 
+static uint64_t packVScaleRangeArgs(unsigned MinValue, unsigned MaxValue) {
+  return uint64_t(MinValue) << 32 | MaxValue;
+}
+
+static std::pair unpackVScaleRangeArgs(uint64_t Value) {
+  unsigned MaxValue = Value & std::numeric_limits::max();
+  unsigned MinValue = Value >> 32;
+
+  return std::make_pair(MinValue, MaxValue);
+}
+
 Attribute Attribute::get(LLVMContext , Attribute::AttrKind Kind,
  uint64_t Val) {
   LLVMContextImpl *pImpl = Context.pImpl;
@@ -192,6 +203,12 @@
   return get(Context, 

[PATCH] D98487: [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE

2021-03-12 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, peterwaller-arm, joechrisellis, 
CarolineConcatto, dmgreen.
Herald added subscribers: hiraditya, kristof.beyls, tschuett.
bsmith requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D98487

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-neon-intrinsics.c
  clang/test/CodeGen/aarch64-neon-misc.c
  clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c
  clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c
  clang/test/CodeGen/arm-neon-directed-rounding.c
  clang/test/CodeGen/arm64-vrnd.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/Target/TargetSelectionDAG.td
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/test/CodeGen/AArch64/arm64-vcvt.ll
  llvm/test/CodeGen/AArch64/arm64-vfloatintrinsics.ll
  llvm/test/CodeGen/AArch64/f16-instructions.ll
  llvm/test/CodeGen/AArch64/fp-intrinsics.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
  llvm/test/CodeGen/AArch64/vec-libcalls.ll

Index: llvm/test/CodeGen/AArch64/vec-libcalls.ll
===
--- llvm/test/CodeGen/AArch64/vec-libcalls.ll
+++ llvm/test/CodeGen/AArch64/vec-libcalls.ll
@@ -29,6 +29,7 @@
 declare <3 x float> @llvm.nearbyint.v3f32(<3 x float>)
 declare <3 x float> @llvm.rint.v3f32(<3 x float>)
 declare <3 x float> @llvm.round.v3f32(<3 x float>)
+declare <3 x float> @llvm.roundeven.v3f32(<3 x float>)
 declare <3 x float> @llvm.sqrt.v3f32(<3 x float>)
 declare <3 x float> @llvm.trunc.v3f32(<3 x float>)
 
@@ -478,6 +479,15 @@
   ret <3 x float> %r
 }
 
+define <3 x float> @roundeven_v3f32(<3 x float> %x) nounwind {
+; CHECK-LABEL: roundeven_v3f32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:frintn v0.4s, v0.4s
+; CHECK-NEXT:ret
+  %r = call <3 x float> @llvm.roundeven.v3f32(<3 x float> %x)
+  ret <3 x float> %r
+}
+
 define <3 x float> @sqrt_v3f32(<3 x float> %x) nounwind {
 ; CHECK-LABEL: sqrt_v3f32:
 ; CHECK:   // %bb.0:
Index: llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
===
--- llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
+++ llvm/test/CodeGen/AArch64/sve-fixed-length-fp-rounding.ll
@@ -1255,6 +1255,253 @@
   ret void
 }
 
+;
+; ROUNDEVEN -> FRINTN
+;
+
+; Don't use SVE for 64-bit vectors.
+define <4 x half> @frintn_v4f16(<4 x half> %op) #0 {
+; CHECK-LABEL: frintn_v4f16:
+; CHECK: frintn v0.4h, v0.4h
+; CHECK-NEXT: ret
+  %res = call <4 x half> @llvm.roundeven.v4f16(<4 x half> %op)
+  ret <4 x half> %res
+}
+
+; Don't use SVE for 128-bit vectors.
+define <8 x half> @frintn_v8f16(<8 x half> %op) #0 {
+; CHECK-LABEL: frintn_v8f16:
+; CHECK: frintn v0.8h, v0.8h
+; CHECK-NEXT: ret
+  %res = call <8 x half> @llvm.roundeven.v8f16(<8 x half> %op)
+  ret <8 x half> %res
+}
+
+define void @frintn_v16f16(<16 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v16f16:
+; CHECK: ptrue [[PG:p[0-9]+]].h, vl16
+; CHECK-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; CHECK-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; CHECK-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; CHECK-NEXT: ret
+  %op = load <16 x half>, <16 x half>* %a
+  %res = call <16 x half> @llvm.roundeven.v16f16(<16 x half> %op)
+  store <16 x half> %res, <16 x half>* %a
+  ret void
+}
+
+define void @frintn_v32f16(<32 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v32f16:
+; VBITS_GE_512: ptrue [[PG:p[0-9]+]].h, vl32
+; VBITS_GE_512-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_GE_512-NEXT: frintn [[RES:z[0-9]+]].h, [[PG]]/m, [[OP]].h
+; VBITS_GE_512-NEXT: st1h { [[RES]].h }, [[PG]], [x0]
+; VBITS_GE_512-NEXT: ret
+
+; Ensure sensible type legalisation.
+; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+; VBITS_EQ_256-DAG: add x[[A_HI:[0-9]+]], x0, #32
+; VBITS_EQ_256-DAG: ld1h { [[OP_LO:z[0-9]+]].h }, [[PG]]/z, [x0]
+; VBITS_EQ_256-DAG: ld1h { [[OP_HI:z[0-9]+]].h }, [[PG]]/z, [x[[A_HI]]]
+; VBITS_EQ_256-DAG: frintn [[RES_LO:z[0-9]+]].h, [[PG]]/m, [[OP_LO]].h
+; VBITS_EQ_256-DAG: frintn [[RES_HI:z[0-9]+]].h, [[PG]]/m, [[OP_HI]].h
+; VBITS_EQ_256-DAG: st1h { [[RES_LO]].h }, [[PG]], [x0]
+; VBITS_EQ_256-DAG: st1h { [[RES_HI]].h }, [[PG]], [x[[A_HI]]
+; VBITS_EQ_256-NEXT: ret
+  %op = load <32 x half>, <32 x half>* %a
+  %res = call <32 x half> @llvm.roundeven.v32f16(<32 x half> %op)
+  store <32 x half> %res, <32 x half>* %a
+  ret void
+}
+
+define void @frintn_v64f16(<64 x half>* %a) #0 {
+; CHECK-LABEL: frintn_v64f16:
+; VBITS_GE_1024: ptrue [[PG:p[0-9]+]].h, vl64
+; VBITS_GE_1024-DAG: ld1h { [[OP:z[0-9]+]].h }, [[PG]]/z, [x0]
+; 

[PATCH] D98030: [IR] Add vscale_range IR function attribute

2021-03-05 Thread Bradley Smith via Phabricator via cfe-commits
bsmith created this revision.
bsmith added reviewers: paulwalker-arm, joechrisellis, peterwaller-arm.
Herald added subscribers: dexonsmith, jdoerfert, steven_wu, hiraditya.
bsmith requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.

Additionally hook up the -msve-vector-bits= clang option to emit this
attribute.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D98030

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGen/arm-sve-vector-bits-vscale-range.c
  llvm/docs/BitCodeFormat.rst
  llvm/docs/LangRef.rst
  llvm/include/llvm/Bitcode/LLVMBitCodes.h
  llvm/include/llvm/IR/Attributes.h
  llvm/include/llvm/IR/Attributes.td
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/AsmParser/LLParser.h
  llvm/lib/AsmParser/LLToken.h
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AttributeImpl.h
  llvm/lib/IR/Attributes.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
  llvm/lib/Transforms/Utils/CodeExtractor.cpp
  llvm/test/Bitcode/attributes.ll
  llvm/test/Verifier/vscale_range.ll

Index: llvm/test/Verifier/vscale_range.ll
===
--- /dev/null
+++ llvm/test/Verifier/vscale_range.ll
@@ -0,0 +1,7 @@
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+
+; CHECK-NOT: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @a(i32) vscale_range(1, 0)
+
+; CHECK: 'vscale_range' minimum cannot be greater than maximum
+declare i8* @b(i32*) vscale_range(8, 1)
Index: llvm/test/Bitcode/attributes.ll
===
--- llvm/test/Bitcode/attributes.ll
+++ llvm/test/Bitcode/attributes.ll
@@ -422,6 +422,18 @@
   ret void
 }
 
+; CHECK: define void @f72() #45
+define void @f72() vscale_range(8)
+{
+  ret void
+}
+
+; CHECK: define void @f73() #46
+define void @f73() vscale_range(1,8)
+{
+  ret void
+}
+
 ; CHECK: attributes #0 = { noreturn }
 ; CHECK: attributes #1 = { nounwind }
 ; CHECK: attributes #2 = { readnone }
@@ -467,4 +479,6 @@
 ; CHECK: attributes #42 = { nocallback }
 ; CHECK: attributes #43 = { cold }
 ; CHECK: attributes #44 = { hot }
+; CHECK: attributes #45 = { vscale_range(8,8) }
+; CHECK: attributes #46 = { vscale_range(1,8) }
 ; CHECK: attributes #[[NOBUILTIN]] = { nobuiltin }
Index: llvm/lib/Transforms/Utils/CodeExtractor.cpp
===
--- llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -970,6 +970,7 @@
   case Attribute::StackProtectStrong:
   case Attribute::StrictFP:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NoCfCheck:
   case Attribute::MustProgress:
   case Attribute::NoProfile:
Index: llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
===
--- llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
+++ llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
@@ -73,6 +73,7 @@
   .Case("sspstrong", Attribute::StackProtectStrong)
   .Case("strictfp", Attribute::StrictFP)
   .Case("uwtable", Attribute::UWTable)
+  .Case("vscale_range", Attribute::VScaleRange)
   .Default(Attribute::None);
 }
 
Index: llvm/lib/IR/Verifier.cpp
===
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -1622,6 +1622,7 @@
   case Attribute::InlineHint:
   case Attribute::StackAlignment:
   case Attribute::UWTable:
+  case Attribute::VScaleRange:
   case Attribute::NonLazyBind:
   case Attribute::ReturnsTwice:
   case Attribute::SanitizeAddress:
@@ -1980,6 +1981,14 @@
   return;
   }
 
+  if (Attrs.hasFnAttribute(Attribute::VScaleRange)) {
+std::pair Args =
+Attrs.getVScaleRangeArgs(AttributeList::FunctionIndex);
+
+if (Args.first > Args.second && Args.second != 0)
+  CheckFailed("'vscale_range' minimum cannot be greater than maximum", V);
+  }
+
   if (Attrs.hasFnAttribute("frame-pointer")) {
 StringRef FP = Attrs.getAttribute(AttributeList::FunctionIndex,
   "frame-pointer").getValueAsString();
Index: llvm/lib/IR/Attributes.cpp
===
--- llvm/lib/IR/Attributes.cpp
+++ llvm/lib/IR/Attributes.cpp
@@ -78,6 +78,17 @@
   return std::make_pair(ElemSizeArg, NumElemsArg);
 }
 
+static uint64_t packVScaleRangeArgs(unsigned MinValue, unsigned MaxValue) {
+  return uint64_t(MinValue) << 32 | MaxValue;
+}
+
+static std::pair unpackVScaleRangeArgs(uint64_t Value) {
+  

[PATCH] D1810: [ARM] Fix AArch32 and pre-v8 poly types to be unsigned

2020-12-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith abandoned this revision.
bsmith added a comment.
Herald added subscribers: danielkiss, kristof.beyls.

This change is very old and almost certainly out of date, therefore abandoning.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D1810/new/

https://reviews.llvm.org/D1810

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93206: [AArch64][NEON] Remove undocumented vceqz{,q}_p16, vml{a,s}q_n_f64 intrinsics

2020-12-15 Thread Bradley Smith via Phabricator via cfe-commits
bsmith accepted this revision.
bsmith added a comment.
This revision is now accepted and ready to land.

Changes look good to me, also can confirm these are in fact not part of the 
ACLE specification.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93206/new/

https://reviews.llvm.org/D93206

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits