On 18/12/2025 17:51, Alice Carlotti wrote:
On Thu, Dec 18, 2025 at 05:14:50PM +0000, Claudio Bantaloukas wrote:
This patch series completes support for the SME2 and SME2p1 intrinsics that
operate on modal 8-bit floating-point types.
- The first patch in the series adds tests for using luti intrinsics with
mf8. These have worked since the intrinsics were introduced, and their use is
now documented in ACLE.
- The second patch extends the definitions of existing non-interpreting sve2/sme
intrinsics to support mfloat8 types.
- The third and fourth patches add widening and narrowing sme2 fp8 conversions
respectively (svcvt).
- The fifth patch adds multi-vector floating-point adjust exponent intrinsics
(svscale).
- The sixth patch adds support for the sme-f8f16 and sme-f8f32 arch features
and related defines.
- Patch 7 adds multi-vector 8-bit floating-point multiply-add long intrinsics.
- Patch 8 adds 8-bit floating-point sum of outer products and accumulate
intrinsics.
- Patch 9 adds 8-bit floating-point dot product intrinsics.
This is going to be awkward to implement, but I think we also need to make the
existing FEAT_SME_F16F16 add/sub intrinsics available when +sme-f8f16 is
enabled (without +sme-f16f16). That is, the feature requirements need updating
for:
DEF_SME_ZA_FUNCTION_GS (svadd, unary_za_slice, za_d_float, vg1x24, none)
DEF_SME_ZA_FUNCTION_GS (svsub, unary_za_slice, za_d_float, vg1x24, none)
I think I see what you mean. The instruction description makes both the
"Two ZA single-vectors of half-precision elements" and the "Four ZA
single-vectors of half-precision elements" encodings available under
(FEAT_SME_F16F16 || FEAT_SME_F8F16), and that condition does not match the
feature requirements given in ACLE.
For reference:
https://developer.arm.com/documentation/ddi0602/2025-12/SME-Instructions/FADD--Multi-vector-floating-point-accumulate-to-ZA-array-vectors-?lang=en
And the relevant ACLE line:
https://github.com/ARM-software/acle/blob/1622440cc8930729c2f014555b996e8c073553e9/main/acle.md?plain=1#L11274
For now I think we should commit the change as is, drive an ACLE change, and
then update GCC to match.
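
To make the mismatch concrete, here is a minimal sketch of the two gating
rules as plain C predicates. The function names are hypothetical and this is
not GCC code. It assumes that the half-precision svadd/svsub ZA intrinsics are
currently gated on +sme-f16f16 alone, and that the instruction description
allows either feature.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the gating assumed to be in place today, where the
   half-precision ZA add/sub intrinsics require +sme-f16f16.  */
static bool
intrinsic_rule_today (bool sme_f16f16, bool sme_f8f16)
{
  (void) sme_f8f16;	/* Not consulted by the current rule.  */
  return sme_f16f16;
}

/* Hypothetical helper: the gating given for the instruction encodings,
   (FEAT_SME_F16F16 || FEAT_SME_F8F16).  */
static bool
instruction_rule (bool sme_f16f16, bool sme_f8f16)
{
  return sme_f16f16 || sme_f8f16;
}

/* Return true when the two rules disagree for a feature combination.  */
static bool
rules_disagree (bool sme_f16f16, bool sme_f8f16)
{
  return intrinsic_rule_today (sme_f16f16, sme_f8f16)
	 != instruction_rule (sme_f16f16, sme_f8f16);
}

/* Self-check: walk all four feature combinations and confirm that the
   rules differ only for +sme-f8f16 without +sme-f16f16.  */
static void
check_only_f8f16_alone_differs (void)
{
  for (int f16f16 = 0; f16f16 <= 1; f16f16++)
    for (int f8f16 = 0; f8f16 <= 1; f8f16++)
      assert (rules_disagree (f16f16, f8f16) == (!f16f16 && f8f16));
}
```

Under these assumptions the only affected configuration is +sme-f8f16
without +sme-f16f16, which is the case raised above.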
Alice
Thanks for reviewing!
Cheers,
Claudio