| Issue | 114987 |
| Summary | AArch64 target unconditionally generates SVE instructions for Armv9-A despite them being optional in the architecture |
| Labels | new issue |
| Assignees | |
| Reporter | willdeacon |
Hi,
As of [this commit](https://github.com/llvm/llvm-project/commit/3550e242fad672696da361f7ddadf53a41114dfd), specifying an Armv9-A architecture will cause Clang to generate SVE instructions unconditionally. However, these instructions are `OPTIONAL` from version v8.2 of the architecture, as called out in the [Arm ARM](https://developer.arm.com/documentation/ddi0487/ka/?lang=en):
> // ARM DDI 0487K.a, A2-105
> FEAT_SVE is OPTIONAL from Armv8.2.
This is particularly problematic when running in a KVM guest environment, as SVE is disabled by default regardless of the underlying hardware capabilities and must be explicitly enabled by the VMM as an [opt-in](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virt/kvm/api.rst#n3513) vCPU feature. Consequently, host binaries compiled with `-march=armv9-a` cannot execute in guest context on a v9 CPU unless the VMM enables SVE. Of course, these binaries would also fail to execute on a v9 CPU that chose not to implement SVE at all, but the KVM case is what we have run into in Android.
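To make the guest-versus-host difference concrete, here is a minimal sketch of how a process could check at run time whether SVE is actually usable where it is running. It assumes Linux on AArch64, where the kernel reports SVE through the `AT_HWCAP` auxiliary vector entry; the constant `EXAMPLE_HWCAP_SVE` and both function names are made up for illustration, and the bit value is meant to mirror `HWCAP_SVE` from the kernel's `<asm/hwcap.h>`. On any other platform the sketch simply reports that SVE is unavailable.

```c
#include <stdbool.h>

#if defined(__linux__)
#include <sys/auxv.h> /* getauxval(), AT_HWCAP */
#endif

/* Assumption: mirrors HWCAP_SVE (bit 22) from arm64 Linux <asm/hwcap.h>. */
#define EXAMPLE_HWCAP_SVE (1UL << 22)

/* Hypothetical helper: does this AT_HWCAP value advertise SVE? */
static bool hwcap_reports_sve(unsigned long hwcap)
{
    return (hwcap & EXAMPLE_HWCAP_SVE) != 0;
}

/* Hypothetical helper: is SVE usable in the current execution context? */
static bool sve_usable_here(void)
{
#if defined(__linux__) && defined(__aarch64__)
    /* The kernel only sets the bit if it can see and use SVE itself. */
    return hwcap_reports_sve(getauxval(AT_HWCAP));
#else
    /* Not arm64 Linux: this sketch conservatively reports no SVE. */
    return false;
#endif
}
```

In a guest whose VMM has not opted in to SVE, I would expect `sve_usable_here()` to return false even when the same check on the host returns true. A check like this only helps code that dispatches explicitly, though; it cannot protect a binary whose ordinary compiler-generated code already contains SVE instructions.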
In addition to the above, there is a misleading "note" in the Arm ARM about SVE2 (which implies SVE) specifically:
> // ARM DDI 0487K.a, A1-59
> Note:
> All Armv8-A systems that support standard operating systems with rich application environments provide hardware
> support for Advanced SIMD and floating-point instructions. **All Armv9-A systems that support standard operating
> systems with rich application environments also provide hardware support for SVE2 instructions.** It is a requirement
> of the ARM Procedure Call Standard for AArch64, see Procedure Call Standard for the Arm 64-bit Architecture.
It's all very fluffy (who knows what a "rich application environment" really means), but the final sentence gives the wrong impression that the [PCS](https://github.com/ARM-software/abi-aa/blob/a82eef0433556b30539c0d4463768d9feb8cfd0b/aapcs64/aapcs64.rst) requires support for SVE2. Although the PCS does require hardware support for fpsimd (see [this footnote](https://github.com/ARM-software/abi-aa/blob/a82eef0433556b30539c0d4463768d9feb8cfd0b/aapcs64/aapcs64.rst#aapcs64-f1)), SVE is still correctly referred to as an [optional extension](https://github.com/ARM-software/abi-aa/blob/a82eef0433556b30539c0d4463768d9feb8cfd0b/aapcs64/aapcs64.rst#12appendix-support-for-scalable-vectors).
Looking back at an older version of the Arm ARM:
> // ARM DDI 0487E.a, A1-51
> Note:
> All systems that support standard operating systems with rich application environments provide hardware
> support for Advanced SIMD and floating-point. It is a requirement of the ARM Procedure Call Standard for
> AArch64, see Procedure Call Standard for the Arm 64-bit Architecture.
It seems plausible that the SVE2 text was shoe-horned in a little clumsily and that the implication for the PCS was accidental.
Anyway, the tl;dr is that I don't think specifying an Armv9-A target architecture should assume the presence of SVE as this is not guaranteed by the CPU architecture and doesn't match the default behaviour of KVM. Instead, I think SVE should be specified explicitly as e.g. `armv9-a+sve` on the assumption that the user knows that they are generating non-portable binaries.
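One way to see which behaviour a given compiler and `-march` setting produce is to look at the ACLE feature macro `__ARM_FEATURE_SVE`, which is defined when the compiler is allowed to emit SVE instructions. The snippet below is only a sketch for checking this; the function name `sve_codegen_status` is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: report whether this translation unit was built
 * with SVE code generation enabled, based on the ACLE feature macro. */
static const char *sve_codegen_status(void)
{
#if defined(__ARM_FEATURE_SVE)
    return "SVE codegen enabled";
#else
    return "SVE codegen not enabled";
#endif
}
```

With the change proposed here, I would expect a build with plain `-march=armv9-a` to take the second branch and a build with `-march=armv9-a+sve` to take the first.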
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs