https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122871

--- Comment #12 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
(In reply to Torbjorn SVENSSON from comment #11)

> The new test case fails for Cortex-M0 and Cortex-M23. Is this a thumb2-only
> improvement?

In principle the optimization is valid for thumb1 cores, but since we lack
shift+add patterns there, we should end up with something like

lsls     r2, r0, #1
adds     r1, r1, r2
bx       lr

That's still much better than the sequence using adc(s), though not quite as
simple as a single shift+add instruction.

It also won't match the expected output in the current testcase.
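
To make the register usage concrete, here is a minimal sketch of one C
function for which the two-instruction sequence above would be correct. It is
not the testcase from this PR; the function names are made up, and it assumes
a little-endian AAPCS target where a 64-bit argument arrives with its low word
in r0 and its high word in r1. Adding x << 33 leaves the low word untouched,
so only the high word needs an add and no carry is involved.

```c
#include <stdint.h>

/* Hypothetical source form (not the PR's testcase): the addend x << 33
   has an all-zero low word, so the low word of the sum is just the low
   word of x and no carry can propagate into the high word.  */
uint64_t add_wide(uint64_t x)
{
    return x + (x << 33);
}

/* Word-by-word model of the sequence quoted above, assuming the low
   word of x is in r0 and the high word is in r1.  */
uint64_t add_split(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* r0 */
    uint32_t hi = (uint32_t)(x >> 32);  /* r1 */
    uint32_t t = lo << 1;               /* lsls r2, r0, #1 */
    hi += t;                            /* adds r1, r1, r2 */
    return ((uint64_t)hi << 32) | lo;   /* result in r0 (low), r1 (high) */
}
```

Both functions return the same value for every input, which is why the
adds/adcs pair is unnecessary for this shape of expression.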

The current test should probably add
/* { dg-require-effective-target arm32 } */

We should then create a separate test for thumb1 targets.
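
A thumb1-specific test might look roughly like the following. This is only a
sketch: the function body is a hypothetical stand-in for the real testcase,
the options are assumed, and the scan pattern would need adjusting to
whatever the backend actually emits.

```c
/* Sketch of a possible thumb1 test; not the committed testcase.  */
/* { dg-do compile } */
/* { dg-require-effective-target arm_thumb1_ok } */
/* { dg-options "-O2 -mthumb" } */

/* Hypothetical stand-in for the function under test.  */
unsigned long long f(unsigned long long x)
{
    return x + (x << 33);
}

/* Assumed check: the add-with-carry form should no longer appear.  */
/* { dg-final { scan-assembler-not {\tadcs?\t} } } */
```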

If the thumb1 code generator isn't generating something like the above
sequence, we should create a new PR for that, as it's likely a costing issue
in the backend.
