sdesmalen added inline comments.
================ Comment at: clang/include/clang/Basic/arm_sve.td:494 +let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in { + def SVBFDOT : SInst<"svbfdot[_{0}]", "dd$$", "f", MergeNone, "aarch64_sve_bfdot">; + def SVBFMLALB : SInst<"svbfmlalb[_{0}]", "dd$$", "f", MergeNone, "aarch64_sve_bfmlalb">; ---------------- The types for these intrinsics are always `svfloat32_t` and `svbfloat16_t`, which given their semantics is unlikely to ever be extended to other types, so it's easier to make the LLVM IR non-overloaded (i.e. hardcoding `llvm_nxv4f32_ty` and `llvm_nxv8bf16_ty`) and using the `IsOverloadNone` flag for these builtins. Then you can express this builtin as: ```def SVBFDOT: SInst<"svbfdot[_{0}]", "MMdd", "b", MergeNone, "aarch64_sve_bfdot">;``` and drop the need for the `$` modifier. ================ Comment at: clang/include/clang/Basic/arm_sve.td:498 + def SVBFMMLA : SInst<"svbfmmla[_{0}]", "dd$$", "f", MergeNone, "aarch64_sve_bfmmla">; + def SVBFDOT_N : SInst<"svbfdot[_n_{0}]", "dd$~", "f", MergeNone, "aarch64_sve_bfdot">; + def SVBFMLAL_N : SInst<"svbfmlalb[_n_{0}]", "dd$~", "f", MergeNone, "aarch64_sve_bfmlalb">; ---------------- Similar to the suggestion above to use `"MMdd"` for SVBFDOT, this could use `"MMda"` and you don't need the `~` modifier. nit: add whitespace above this line. nit: the rest of this file tries to align the columns, that makes this file a bit easier to read. ================ Comment at: clang/include/clang/Basic/arm_sve.td:1032 + defm SVCVT_BF16_F32 : SInstCvtMXZ<"svcvt_bf16[_f32]", "ddPM", "dPM", "b", "aarch64_sve_cvt_bf16f32">; + // svcvtnt_bf16_f32 + defm SVCVTNT_BF16_F32 : SInstCvtMX<"svcvtnt_bf16[_f32]", "ddPM", "dPM", "b", "aarch64_sve_cvtnt_bf16f32">; ---------------- nit: redundant comment (same for above) ================ Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c:27 + +svfloat32_t test_bfdot_lane_1_f32(svfloat32_t x, svbfloat16_t y, svbfloat16_t z) { + // CHECK-LABEL: @test_bfdot_lane_1_f32( ---------------- Testing the edge cases 0 and 3 should be sufficient. (same for all other cases in this patch) ================ Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1343 +class SVE_bfloat + : Intrinsic<[llvm_anyvector_ty], ---------------- nit: `SVE_bfloat` is not very descriptive, maybe use `SVE_4Vec_BF16` and `SVE_4Vec_BF16_Indexed`? ================ Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1811 +def int_aarch64_sve_cvt_bf16f32 : Builtin_SVCVT<"svcvt_bf16_f32_m", llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>; +def int_aarch64_sve_cvtnt_bf16f32 : Builtin_SVCVT<"svcvtnt_bf16_f32_m", llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>; ---------------- nit: use `fcvtbf` instead of `cvt` => `int_aarch64_sve_fcvtbf_bf16f32` ? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82141/new/ https://reviews.llvm.org/D82141 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits