On 18/12/2025 17:51, Alice Carlotti wrote:
On Thu, Dec 18, 2025 at 05:14:50PM +0000, Claudio Bantaloukas wrote:

This patch series completes support for the SME2 and SME2p1 intrinsics that
operate on modal 8-bit floating-point types.

- The first patch in the series introduces tests for using luti intrinsics with
   mf8, which has worked since their introduction, now that their use is
   documented in ACLE.
- The second patch extends the definitions of existing non-interpreting sve2/sme
   intrinsics to support mfloat8 types.
- The third and fourth patches add widening and narrowing sme2 fp8 conversions
   respectively (svcvt).
- The fifth patch adds multi-vector floating-point adjust exponent intrinsics
   (svscale).
- The sixth patch adds support for the sme-f8f16 and sme-f8f32 arch features
   and related defines.
- Patch 7 adds multi-vector 8-bit floating-point multiply-add long intrinsics.
- Patch 8 adds 8-bit floating-point sum of outer products and accumulate
   intrinsics.
- Patch 9 adds 8-bit floating-point dot product intrinsics.

This is going to be awkward to implement, but I think we also need to make the
existing FEAT_SME_F16F16 add/sub intrinsics available when +sme-f8f16 is
enabled (without +sme-f16f16).  That is, the feature requirements need updating
for:

DEF_SME_ZA_FUNCTION_GS (svadd, unary_za_slice, za_d_float, vg1x24, none)
DEF_SME_ZA_FUNCTION_GS (svsub, unary_za_slice, za_d_float, vg1x24, none)

I think I see what you mean. The conditions guarding opcode availability for
the "Two ZA single-vectors of half-precision elements" and "Four ZA
single-vectors of half-precision elements" variants (both
FEAT_SME_F16F16 || FEAT_SME_F8F16) do not match the definitions in ACLE.
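
To make the mismatch concrete, here is a minimal C sketch of the two
conditions. It is only a model: the struct and function names are hypothetical
and are not GCC or ACLE identifiers, and the "ACLE" predicate assumes the
current text requires FEAT_SME_F16F16 alone, which is how this thread reads it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the enabled architecture features; these field
   names are illustrative and are not GCC or ACLE identifiers.  */
struct sme_features
{
  bool sme_f16f16; /* models FEAT_SME_F16F16 (+sme-f16f16) */
  bool sme_f8f16;  /* models FEAT_SME_F8F16 (+sme-f8f16) */
};

/* Instruction availability as quoted in this thread:
   FEAT_SME_F16F16 || FEAT_SME_F8F16.  */
bool
isa_allows_f16_za_addsub (struct sme_features f)
{
  return f.sme_f16f16 || f.sme_f8f16;
}

/* Intrinsic availability as this thread reads the current ACLE text:
   FEAT_SME_F16F16 only.  This is an assumption; check the cited line.  */
bool
acle_allows_f16_za_addsub (struct sme_features f)
{
  return f.sme_f16f16;
}

/* True when the instruction exists but the intrinsic is not exposed.  */
bool
requirements_mismatch (struct sme_features f)
{
  return isa_allows_f16_za_addsub (f) && !acle_allows_f16_za_addsub (f);
}

/* Small self-check: only the "+sme-f8f16 without +sme-f16f16" case differs.  */
void
self_check (void)
{
  struct sme_features only_f8f16 = { false, true };
  struct sme_features only_f16f16 = { true, false };
  assert (requirements_mismatch (only_f8f16));
  assert (!requirements_mismatch (only_f16f16));
}
```

Under these assumptions the only configuration where the two predicates
disagree is +sme-f8f16 without +sme-f16f16, which is the case raised above.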

For reference:
https://developer.arm.com/documentation/ddi0602/2025-12/SME-Instructions/FADD--Multi-vector-floating-point-accumulate-to-ZA-array-vectors-?lang=en

And the relevant ACLE line:
https://github.com/ARM-software/acle/blob/1622440cc8930729c2f014555b996e8c073553e9/main/acle.md?plain=1#L11274

For now I think we should add the change as is, drive an ACLE change, and
subsequently update GCC.

Alice

Thanks for reviewing!
Cheers,
Claudio
