https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111867
Bug ID: 111867
Summary: aarch64: Wrong code for bf16 literal load when the arch support +fp16
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: iains at gcc dot gnu.org
Target Milestone: ---

Analysing some target fails for aarch64-darwin.

The base arch for Apple M1 is (as far as I can determine) armv8.4-a+fp16+sb+ssbs

So it has fp16 and fp16fml, but not bf16 (M2 does have bf16).

===
int main ()
{
  __bf16 a = 1.0bf16;
  return (int) (a + a);
}
===

With the arch flags above, this produces:

_main:
<snip>
	fmov	h31, 1.0e+0
	str	h31, [x29, 46]
	ldr	h15, [x29, 46]
	mov	v0.h[0], v15.h[0]
	bl	___extendbfsf2
	fmov	s14, s0
<snip>

This seems to be loading an __fp16 value into h31 (not a __bf16 value); not surprisingly, this fails.

I checked the instruction bit pattern with objdump, and it is 0x1eee101f, which is clearly an fp16 load.

====
With pruned arch flags (armv8.4-a):

_main:
<snip>
	adrp	x0, lC0@PAGE
	ldr	h31, [x0, #lC0@PAGEOFF]
	str	h31, [x29, 30]
	ldr	h0, [x29, 30]
	bl	___extendbfsf2
<snip>
lC0:
	.hword	16256

which looks correct (and produces the expected answer).

----

So support for fp16 seems to be breaking soft __bf16.
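To make the mismatch concrete, here is a minimal sketch in portable C of what each bit pattern means when read as a bfloat16. It assumes that `fmov h31, 1.0e+0` leaves the IEEE half-precision encoding of 1.0 (0x3C00) in the register, and it models bf16 as the top 16 bits of an IEEE binary32. The helper names (`bf16_bits_to_float`, `float_to_bf16_bits`, `add_to_self_as_int`) are hypothetical and are not the libgcc implementation of `__extendbfsf2`.

```c
#include <stdint.h>
#include <string.h>

/* Widen a bfloat16 bit pattern to float. Assumption: bf16 is modelled
   as the upper 16 bits of an IEEE binary32 value. */
static float bf16_bits_to_float(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float f;
    memcpy(&f, &wide, sizeof f);   /* type-pun without aliasing issues */
    return f;
}

/* Truncate a float to a bf16 bit pattern (no rounding; exact for 1.0f). */
static uint16_t float_to_bf16_bits(float f)
{
    uint32_t wide;
    memcpy(&wide, &f, sizeof wide);
    return (uint16_t)(wide >> 16);
}

/* Model of the test case: treat `bits` as the __bf16 value `a`
   and return (int) (a + a). */
static int add_to_self_as_int(uint16_t bits)
{
    float a = bf16_bits_to_float(bits);
    return (int)(a + a);
}
```

Under these assumptions, `float_to_bf16_bits(1.0f)` gives 0x3F80, which is the `.hword 16256` seen in the good output, and `add_to_self_as_int(0x3F80)` gives 2. The fp16 pattern 0x3C00 read as bf16 is 2^-7 (0.0078125), so `add_to_self_as_int(0x3C00)` gives 0, which would be consistent with the test failing.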