https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932

--- Comment #18 from Jim Wilson <wilson at gcc dot gnu.org> ---
Ultimately, I believe that this is an ARM backend bug.  PROMOTE_MODE and
TARGET_PROMOTE_FUNCTION_MODE should not behave differently.  It would help if
the PROMOTE_MODE macro was merged with the TARGET_PROMOTE_FUNCTION_MODE hook,
to avoid accidents like this.

I've tested my patch to modify PROMOTE_MODE so that it no longer sets UNSIGNEDP
for char and short.

For the SPEC CPU2000 benchmarks, individual benchmarks are within 1%, which is
within the noise range, and the full benchmark results show almost identical
performance.

I get 3 additional failures in the gcc testsuite, for
gcc.target/arm/wmul-[123].c.  These are testcases to verify generation of the
smulbb and smlabb instructions.  However, they only work currently because of
the extra sign-extends emitted by PROMOTE_MODE.  We currently emit an unsigned
short load, a sign-extend, and a multiply.  The sign-extend gets merged into
the multiply.  But with the patch, we emit a signed short load and a multiply,
and hence can't form smulbb.  Unpatched, for wmul-1.c we get
        ldrh    r1, [r4, #2]!
        ldrh    r6, [r0, #2]!
        smlabb  r5, r1, r6, r5
        smlabb  r2, r1, r1, r2
and patched we get
        ldrsh   r1, [r4, #2]!
        ldrsh   r6, [r0, #2]!
        mla     r5, r1, r6, r5
        mla     r2, r1, r1, r2
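
As a rough illustration of why the two sequences compute the same products,
here is a small C model.  The function names are made up for this sketch and
are not GCC or testsuite code.  It assumes only that smulbb multiplies the
signed bottom halfwords of its two source registers, that ldrh zero-extends,
and that ldrsh sign-extends.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SMULBB: signed 16x16->32 multiply of the bottom halfwords.
   The narrowing conversion to int16_t relies on the usual modulo
   behaviour (as GCC defines it) for values above INT16_MAX.  */
static int32_t smulbb_model(uint32_t rn, uint32_t rm)
{
    return (int32_t)(int16_t)(rn & 0xffffu) * (int32_t)(int16_t)(rm & 0xffffu);
}

/* Unpatched shape: ldrh (zero-extending load) followed by smulbb.  */
static int32_t unpatched_mul(const int16_t *a, const int16_t *b)
{
    uint32_t ra = (uint16_t)*a;   /* ldrh */
    uint32_t rb = (uint16_t)*b;   /* ldrh */
    return smulbb_model(ra, rb);
}

/* Patched shape: ldrsh (sign-extending load) followed by a 32-bit multiply.  */
static int32_t patched_mul(const int16_t *a, const int16_t *b)
{
    int32_t ra = *a;              /* ldrsh */
    int32_t rb = *b;              /* ldrsh */
    return ra * rb;               /* mul / mla */
}

/* Compare both shapes over a few boundary values.  */
static void check_equivalence(void)
{
    static const int16_t samples[] = { 0, 1, -1, 2, -300, 12345,
                                       INT16_MAX, INT16_MIN };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(unpatched_mul(&samples[i], &samples[j]) ==
                   patched_mul(&samples[i], &samples[j]));
}
```

The model only shows that the arithmetic results agree.  It says nothing about
instruction selection or timing.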
wmul-2.c is similar.  There is a bigger difference with wmul-3.c.  Unpatched is
        ldrh    r1, [r5, #2]!
        ldrh    r4, [r0, #2]!
        smulbb  r4, r1, r4
        subs    r6, r6, r4
        smulbb  r1, r1, r1
        subs    r2, r2, r1
whereas patched is
        ldrsh   r1, [r4, #2]!
        ldrsh   r6, [r0, #2]!
        mls     r5, r1, r6, r5
        mls     r2, r1, r1, r2
The patched code is better than or equivalent to the unpatched code in all cases,
but these testcases no longer serve their purpose.  I can fix wmul-1.c by
changing types to int and casting to signed short.  This doesn't work for
wmul-2.c because the scalar sign-extend is moved out of the loop, and is no
longer available to merge with the multiply.  This also doesn't work for wmul-3.c, but
only because it is cse'd differently.  I get unpatched
        smulbb  r4, r1, r4
        subs    r6, r6, r4
and patched
        sxth    r4, r4
        mls     r6, r1, r4, r6
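
To make the "change types to int and cast to signed short" idea concrete, here
is a hypothetical multiply-accumulate loop in both forms.  This is not the
actual wmul-1.c source; the function names and signatures are invented for
illustration, and no claim is made about which instructions any particular
GCC version emits for it.

```c
/* Original style: short elements, promoted implicitly before the multiply.  */
int mac_short(const short *a, const short *b, int n, int *sqr)
{
    int dotp = 0;
    for (int i = 0; i < n; i++) {
        dotp += b[i] * a[i];
        *sqr += b[i] * b[i];
    }
    return dotp;
}

/* Rewritten style: int elements with an explicit cast to short, so the
   sign-extend is written in the source rather than implied by the
   element type.  */
int mac_int_cast(const int *a, const int *b, int n, int *sqr)
{
    int dotp = 0;
    for (int i = 0; i < n; i++) {
        dotp += (short)b[i] * (short)a[i];
        *sqr += (short)b[i] * (short)b[i];
    }
    return dotp;
}
```

For inputs that fit in a short, both functions return the same values; only
the place where the sign-extension is expressed differs.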

At the moment the only option I have to make wmul-2.c and wmul-3.c work is to
remove them.
