Re: [PATCH 2/2] Aarch64: Add branch diluter pass

Andrea Corallo Wed, 22 Jul 2020 12:45:37 -0700

Segher Boessenkool <seg...@kernel.crashing.org> writes:

> Hi!
>
> On Wed, Jul 22, 2020 at 03:53:34PM +0200, Andrea Corallo wrote:
>> Andrew Pinski <pins...@gmail.com> writes:
>> > Can you give a simple example of what this patch does?
>>
>> Sure, this pass simply moves a sliding window over the insns trying to
>> make sure that we never have more than 'max_branch' branches for every
>> 'granule_size' insns.
>>
>> If too many branches are detected, nops are added where considered less
>> harmful to correct that.
>
> Should that actually be a sliding window, or should there actually just
> not be more than N branches per aligned block of machine code?  Like,
> per fetch group.
>
> Can you not use ASM_OUTPUT_ALIGN_WITH_NOP (or ASM_OUTPUT_MAX_SKIP_ALIGN
> even) then?  GCC has infrastructure for that, already.


Correct, it's a sliding window only because the real load address is not
known to the compiler, so the algorithm is conservative.  I believe we
could use ASM_OUTPUT_ALIGN_WITH_NOP if we align each function to (at
least) the granule size; then we should be able to insert 'nop aligned
labels' precisely.

My main fear is that, given new cores tend to have big granules, code
size would blow up.  One advantage of the implemented algorithm is that,
even if slightly conservative, it impacts code size only where a high
branch density shows up.
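
For illustration, the sliding-window check can be sketched roughly as
below.  This is a toy model, not the code in the patch: the function
name `dilute_nops` and the `is_branch` array are made up for the
example, and it pads greedily right before the offending branch,
whereas the pass tries to place the nops where they are less harmful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch.  Given a flag per insn
   (non-zero means "branch"), return how many nops a greedy diluter
   would add so that no window of GRANULE_SIZE consecutive insns holds
   more than MAX_BRANCH branches.  */
size_t
dilute_nops (const unsigned char *is_branch, size_t n,
             size_t granule_size, size_t max_branch)
{
  assert (granule_size > 0 && max_branch > 0);

  /* Ring buffer with the output positions of the last MAX_BRANCH
     branches emitted so far.  */
  size_t *recent = malloc (max_branch * sizeof *recent);
  if (recent == NULL)
    abort ();

  size_t seen = 0;  /* Branches emitted so far.  */
  size_t pos = 0;   /* Position of the next insn in the output.  */
  size_t nops = 0;  /* Padding added so far.  */

  for (size_t i = 0; i < n; i++)
    {
      if (is_branch[i])
        {
          if (seen >= max_branch)
            {
              /* This slot holds the MAX_BRANCH-th most recent branch.
                 If it still shares a window with POS, pad until it
                 no longer does.  */
              size_t oldest = recent[seen % max_branch];
              if (pos < oldest + granule_size)
                {
                  size_t pad = oldest + granule_size - pos;
                  nops += pad;
                  pos += pad;
                }
            }
          recent[seen % max_branch] = pos;
          seen++;
        }
      pos++;
    }

  free (recent);
  return nops;
}
```

Because the check is relative to the previous branches rather than to
an absolute address, the limit holds wherever the code ends up being
loaded, and stretches with few branches get no padding at all.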

  Andrea
