On 06/03/16 17:22, Evandro Menezes wrote:
On 06/03/16 05:51, Wilco Dijkstra wrote:
It looks like almost all AArch64 cores agree on an alignment of 16 for functions and 8 for loops and branches, so we should change -mcpu=generic as well if there is no disagreement. Feedback welcome.

I'll see which sets of values Exynos M1 would be most comfortable with, but I also wonder whether -falign-labels should be a parameter in tune_params as well.

Thoughts?


FWIW, here are the alignment values (functions-branches-loops, in bytes) that fare best on Exynos M1 with -mcpu=generic, in order of preference:

1. 4-4-4
2. 16-4-16
3. 8-4-4
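
For anyone wanting to try these combinations by hand, each triple can be spelled out as GCC command-line options. The Python sketch below is only an illustration: the helper name alignment_flags is hypothetical, and it assumes that "branches" corresponds to -falign-jumps, which this thread does not state explicitly.

```python
def alignment_flags(triple):
    """Turn a 'functions-branches-loops' triple into GCC -falign-* options.

    Assumption: the "branches" value maps to -falign-jumps.
    """
    functions, branches, loops = (int(n) for n in triple.split("-"))
    return [
        f"-falign-functions={functions}",
        f"-falign-jumps={branches}",
        f"-falign-loops={loops}",
    ]


# The three candidates listed above, in order of preference.
for candidate in ("4-4-4", "16-4-16", "8-4-4"):
    print(candidate, "->", " ".join(alignment_flags(candidate)))
```

The resulting options can be appended to a build that already uses -mcpu=generic to compare the candidates.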

I also kept an eye on code size: whenever the branch alignment was 8 or 16 bytes, code size grew quickly, with no proportional improvement in performance on Exynos M1.
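
A rough way to see why larger branch alignment costs code size: AArch64 instructions are 4 bytes wide, so a branch target is always 4-byte aligned already, and any stricter alignment may need padding in front of it. The sketch below computes an upper bound on that padding per aligned target. It assumes the compiler always pads up to the requested boundary, which is a simplification and not a measurement from this thread; the function names are hypothetical.

```python
INSN_BYTES = 4  # AArch64 instructions are fixed-width, 4 bytes each


def padding(address, alignment):
    """Bytes needed to advance address to the next multiple of alignment."""
    return -address % alignment


def worst_case_padding(alignment):
    """Largest padding over all 4-byte-aligned starting addresses."""
    return max(padding(a, alignment) for a in range(0, alignment, INSN_BYTES))


for alignment in (4, 8, 16):
    print(alignment, worst_case_padding(alignment))
```

Under these assumptions, a 4-byte alignment never needs padding, while 8 and 16 bytes can cost up to 4 and 12 bytes per aligned branch target respectively.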

HTH

--
Evandro Menezes
