[Bug target/111867] New: aarch64: Wrong code for bf16 literal load when the arch support +fp16

iains at gcc dot gnu.org via Gcc-bugs Wed, 18 Oct 2023 11:16:22 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111867


            Bug ID: 111867
           Summary: aarch64: Wrong code for bf16 literal load when the
                    arch support +fp16
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iains at gcc dot gnu.org
  Target Milestone: ---

Analysing some target test failures for aarch64-darwin.

The base arch for Apple M1 is (as far as I can determine)
armv8.4-a+fp16+sb+ssbs
So it has fp16 and fp16fml, but not bf16.

(M2 does have bf16).

===

int main ()
{
  __bf16 a = 1.0bf16;
  return (int) (a + a);
}

=== with the arch flags above produces:
_main:
 <snip>
        fmov    h31, 1.0e+0
        str     h31, [x29, 46]
        ldr     h15, [x29, 46]
        mov     v0.h[0], v15.h[0]
        bl      ___extendbfsf2
        fmov    s14, s0
<snip>

This seems to be loading an __fp16 value into h31 (not a __bf16 value), so,
not surprisingly, the test fails.


I checked the instruction bit pattern with objdump, and it is 0x1eee101f, which
is clearly an fp16 immediate load (fmov).


==== with pruned arch flags armv8.4-a

_main:
 <snip>
        adrp    x0, lC0@PAGE
        ldr     h31, [x0, #lC0@PAGEOFF]
        str     h31, [x29, 30]
        ldr     h0, [x29, 30]
        bl      ___extendbfsf2
<snip>


lC0:
        .hword  16256

which looks correct (and produces the expected answer).
----
So support for fp16 seems to be breaking soft __bf16.
