https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102018

Torbjorn SVENSSON <azoff at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |azoff at gcc dot gnu.org

--- Comment #3 from Torbjorn SVENSSON <azoff at gcc dot gnu.org> ---
AFAICT, the failure is caused by vcmpe.f64 being used at -O1, -O2 and -O3
(-Ofast does the same, but there is no test with it), whereas vcmp.f64 is
used at -O0 and -Os.

Assembly for -O1:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcmpe.f64       d0, #0  @ 43    [c=4 l=4]  *cmpdf_trap_vfp/1
        vmrs    APSR_nzcv, FPSCR        @ 44    [c=4 l=4]  *movcc_vfp
        bls     .L5             @ 11    [c=16 l=2]  arm_cond_branch
        vmov.f64        d7, #1.0e+0     @ 45    [c=4 l=4]  *thumb2_movdf_vfp/2
        vcmp.f64        d0, d7  @ 41    [c=4 l=4]  *cmpdf_vfp/0
        vmrs    APSR_nzcv, FPSCR        @ 42    [c=4 l=4]  *movcc_vfp
        bgt     .L5             @ 18    [c=16 l=2]  arm_cond_branch
        vmul.f64        d0, d0, d0      @ 26    [c=24 l=4]  *muldf3_vfp
        bx      lr      @ 49    [c=8 l=4]  *thumb2_return
.L5:
        vadd.f64        d0, d0, d0      @ 21    [c=16 l=4]  *adddf3_vfp
        bx      lr      @ 39    [c=8 l=4]  *thumb2_return



Assembly for -Os:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcmp.f64        d0, #0  @ 64    [c=4 l=4]  *cmpdf_vfp/1
        vmrs    APSR_nzcv, FPSCR        @ 65    [c=4 l=4]  *movcc_vfp
        bls     .L2             @ 8     [c=16 l=2]  arm_cond_branch
        vmov.f64        d7, #1.0e+0     @ 13    [c=4 l=4]  *thumb2_movdf_vfp/2
        vcmp.f64        d0, d7  @ 62    [c=4 l=4]  *cmpdf_vfp/0
        vmrs    APSR_nzcv, FPSCR        @ 63    [c=4 l=4]  *movcc_vfp
        ble     .L4             @ 15    [c=16 l=2]  arm_cond_branch
.L2:
        vadd.f64        d0, d0, d0      @ 18    [c=4 l=4]  *adddf3_vfp
        bx      lr      @ 60    [c=8 l=4]  *thumb2_return
.L4:
        vmul.f64        d0, d0, d0      @ 23    [c=4 l=4]  *muldf3_vfp
        bx      lr      @ 69    [c=8 l=4]  *thumb2_return

The listings above were generated with (changing the -O level as needed):
arm-none-eabi-gcc pr82692.c -mthumb -march=armv7e-m+fp.dp -mtune=cortex-m7
-mfloat-abi=hard -mfpu=auto -S -o - -Os
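
The branch structure of the two listings can be compared with a small C
model. This is an illustrative sketch only, not GCC or testsuite code, and
all names in it (Flags, vfp_compare, path_o1, path_os, ...) are made up for
the example. It assumes the usual Arm flag encoding after a VFP compare
(N/Z/C/V = 0011 for unordered operands) and that vcmpe raises Invalid
Operation for any NaN operand while vcmp does so only for signaling NaNs,
which are not modelled here.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Model of the NZCV flags copied from FPSCR by "vmrs APSR_nzcv, FPSCR". */
typedef struct { bool n, z, c, v; } Flags;

/* Flags after comparing a with b; identical for vcmp and vcmpe. */
Flags vfp_compare(double a, double b)
{
    if (isnan(a) || isnan(b))
        return (Flags){ false, false, true, true };   /* unordered: 0011 */
    if (a == b)
        return (Flags){ false, true, true, false };   /* equal:     0110 */
    if (a < b)
        return (Flags){ true, false, false, false };  /* less:      1000 */
    return (Flags){ false, false, true, false };      /* greater:   0010 */
}

/* Arm condition codes used by the branches in the listings. */
bool cond_ls(Flags f) { return !f.c || f.z; }
bool cond_gt(Flags f) { return !f.z && f.n == f.v; }
bool cond_le(Flags f) { return f.z || f.n != f.v; }

typedef enum { PATH_ADD, PATH_MUL } Path;

/* -O1 listing: bls .L5 (vadd), bgt .L5 (vadd), fall through to vmul. */
Path path_o1(double x)
{
    if (cond_ls(vfp_compare(x, 0.0))) return PATH_ADD;
    if (cond_gt(vfp_compare(x, 1.0))) return PATH_ADD;
    return PATH_MUL;
}

/* -Os listing: bls .L2 (vadd), ble .L4 (vmul), fall through to vadd. */
Path path_os(double x)
{
    if (cond_ls(vfp_compare(x, 0.0))) return PATH_ADD;
    if (cond_le(vfp_compare(x, 1.0))) return PATH_MUL;
    return PATH_ADD;
}

/* Assumption: with quiet NaNs only, vcmpe raises Invalid Operation and
   vcmp does not.  Signaling NaNs are deliberately left out of the model. */
bool raises_invalid_on_qnan(bool is_vcmpe, double a, double b)
{
    return is_vcmpe && (isnan(a) || isnan(b));
}

/* Both code sequences should select the same path for any input. */
void check_same_path(double x)
{
    assert(path_o1(x) == path_os(x));
}
```

Under these assumptions, calling check_same_path(NAN) passes and both
sequences end up on the vmul path for a quiet NaN. The returned value is
therefore the same at -O1 and -Os, and the only modelled difference is the
Invalid Operation exception from the vcmpe.f64 against #0.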

The bug only appears when -mtune=cortex-m7 or -mcpu=cortex-m7 is used. I
suppose it is related to the cost model for Cortex-M7, since otherwise
Cortex-M4 would likely be affected too.
