https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
--- Comment #18 from Jim Wilson <wilson at gcc dot gnu.org> --- Ultimately, I believe that this is an ARM backend bug. PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should not behave differently. It would help if the PROMOTE_MODE macro was merged with the TARGET_PROMOTE_FUNCTION_MODE hook, to avoid accidents like this. I've tested my patch to modify PROMOTE_MODE so that it no longer sets UNSIGNEDP for char and short. For the SPEC CPU2000 benchmarks, individual benchmarks are within 1% which is within the noise range, and the full benchmark results have almost identical performance. I get 3 additional failures in the gcc testsuite, for gcc.target/arm/wmul-[123].c. These are testcases to verify generation of the smulbb and smlabb instruction. However, they only work currently because of the extra sign-extends emitted by PROMOTE_MODE. We currently emit an unsigned short load, a sign-extend, and a multiply. The sign-extend gets merged into the multiply. But with the patch, we emit a signed short load and a multiply, and hence can't form smulbb. Unpatched, for wmul-1.c we get ldrh r1, [r4, #2]! ldrh r6, [r0, #2]! smlabb r5, r1, r6, r5 smlabb r2, r1, r1, r2 and patched we get ldrsh r1, [r4, #2]! ldrsh r6, [r0, #2]! mla r5, r1, r6, r5 mla r2, r1, r1, r2 wmul-2.c is similar. There is a bigger difference with wmul-3.c. Unpatched is ldrh r1, [r5, #2]! ldrh r4, [r0, #2]! smulbb r4, r1, r4 subs r6, r6, r4 smulbb r1, r1, r1 subs r2, r2, r1 whereas patched is ldrsh r1, [r4, #2]! ldrsh r6, [r0, #2]! mls r5, r1, r6, r5 mls r2, r1, r1, r2 The patched code is better or equivalent to the unpatched code in all cases, but these testcases no longer serve their purpose. I can fix wmul-1.c by changing types to int and casting to signed short. This doesn't work for wmul-2.c because the scalar sign-extend is moved out of the loop, and no longer available to merge with the multiple. This also doesn't work for wmul-3.c, but only because it is cse'd differently. I get unpatched smulbb r4, r1, r4 subs r6, r6, r4 and patched sxth r4, r4 mls r6, r1, r4, r6 At the moment the only option I have to make wmul-2.c and wmul-3.c work is to remove them