https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

            Bug ID: 102952
           Summary: New code-gen options for retpolines and straight line
                    speculation
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andrew.cooper3 at citrix dot com
  Target Milestone: ---

Hello

[FYI, this is being cross-requested of Clang too]

Linux and other kernel-level software make use of
-mindirect-branch=thunk-extern to be able to alter the handling of indirect
branches at boot.  It turns out to be advantageous to inline the thunks when
retpoline is not in use.
https://lore.kernel.org/lkml/20211026120132.613201...@infradead.org/ is some
infrastructure to make this work.

In some cases, we want to be able to inline an `lfence; jmp *%reg` thunk in
place of the 5-byte `call __x86_indirect_thunk_*`.  This is fine for the low 8
registers, where the replacement is also 5 bytes, but not for %r{8..15}, where
the REX prefix pushes the replacement size to 6 bytes.

It would be very useful to have a code-gen option to write out `call
%cs:__x86_indirect_thunk_r{8..15}` where the redundant %cs prefix will increase
the instruction length to 6, allowing the non-retpoline form to be inlined.
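
To make the byte counts concrete, here is a rough Python sketch that builds the
encodings involved and compares their lengths.  It is illustrative only: the
helper names (`jmp_reg`, `call_rel32`, `fits_inline`) are made up for this
example and are not part of GCC, Clang or the kernel; the opcode bytes are the
standard x86-64 encodings.

```python
import struct

# lfence is always 3 bytes: 0F AE E8
LFENCE = bytes([0x0F, 0xAE, 0xE8])


def jmp_reg(reg):
    """Encode `jmp *%reg` (FF /4) for 64-bit register number 0..15.

    Registers 8..15 (%r8..%r15) need a REX.B prefix (0x41).
    """
    rex = bytes([0x41]) if reg >= 8 else b""
    return rex + bytes([0xFF, 0xE0 | (reg & 7)])


def call_rel32(disp=0, cs_prefix=False):
    """Encode `call rel32` (E8 + 4-byte displacement).

    With cs_prefix=True a redundant %cs segment override (0x2E) is prepended.
    """
    prefix = bytes([0x2E]) if cs_prefix else b""
    return prefix + bytes([0xE8]) + struct.pack("<i", disp)


def fits_inline(reg, cs_prefix):
    """True if `lfence; jmp *%reg` fits in the space taken by the call."""
    return len(LFENCE + jmp_reg(reg)) <= len(call_rel32(cs_prefix=cs_prefix))


if __name__ == "__main__":
    for reg in (0, 7, 8, 15):
        print(reg, fits_inline(reg, cs_prefix=False), fits_inline(reg, cs_prefix=True))
```

Under these assumptions, the plain 5-byte call only has room for the low 8
registers, while the %cs-prefixed 6-byte call has room for all 16.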


Relatedly, x86 straight line speculation has been discussed before, but without
any action taken.  It would be helpful to have a code-gen option which would
emit `int3` following any `ret` instruction, and any indirect jump, as neither
of these two cases has architectural execution following it.

The reason these two are related is that if both options are in use, we want an
extra byte of replacement space to be able to inline `lfence; jmp *%reg; int3`.
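
As a quick check of the "extra byte" arithmetic, the following Python sketch
computes the size of the inlined sequence with and without the trailing `int3`.
The function names are hypothetical and exist only for this illustration; the
opcode bytes are the standard x86-64 encodings.

```python
LFENCE = bytes([0x0F, 0xAE, 0xE8])  # lfence
INT3 = bytes([0xCC])                # int3


def jmp_reg(reg):
    """Encode `jmp *%reg` for register number 0..15 (REX.B for 8..15)."""
    rex = bytes([0x41]) if reg >= 8 else b""
    return rex + bytes([0xFF, 0xE0 | (reg & 7)])


def thunk_len(reg, sls):
    """Length of `lfence; jmp *%reg`, plus `int3` when sls is True."""
    seq = LFENCE + jmp_reg(reg)
    if sls:
        seq += INT3
    return len(seq)


if __name__ == "__main__":
    for reg in (0, 8):
        print(reg, thunk_len(reg, sls=False), thunk_len(reg, sls=True))
```

In this sketch the `int3` adds exactly one byte in every case, so whatever
space the call site reserves has to grow by one byte when both options are
enabled.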


Third (and possibly only for future optimisations), Clang has been observed to
emit conditional tail calls as `Jcc __x86_indirect_thunk_*`.  This is a 6-byte
source size, but needs up to 9 bytes of space for inlining including an `int3`
for straight line speculation reasons (see
https://lore.kernel.org/lkml/20211026120310.359986...@infradead.org/ for full
details).  It might be enough to simply prohibit an optimisation like this when
trying to pad retpolines for inlineability.
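
For reference, the 6-versus-9 byte mismatch can be reproduced with a short
Python sketch.  The inlined shape used here (an inverted short `Jcc` that skips
over `lfence; jmp *%reg; int3`) is an assumption chosen to match the byte
counts quoted above, and the helper names are hypothetical.

```python
import struct

LFENCE = bytes([0x0F, 0xAE, 0xE8])  # lfence
INT3 = bytes([0xCC])                # int3


def jmp_reg(reg):
    """Encode `jmp *%reg` for register number 0..15 (REX.B for 8..15)."""
    rex = bytes([0x41]) if reg >= 8 else b""
    return rex + bytes([0xFF, 0xE0 | (reg & 7)])


def jcc_rel32(cc, disp=0):
    """Encode near `Jcc rel32` (0F 80+cc, 4-byte displacement)."""
    return bytes([0x0F, 0x80 | cc]) + struct.pack("<i", disp)


def inline_cond_tail_call(cc, reg):
    """Assumed inline form: `Jncc 1f; lfence; jmp *%reg; int3; 1:`.

    The condition is inverted by flipping the low bit of the condition code,
    and the short jump's rel8 skips exactly the bytes that follow it.
    """
    body = LFENCE + jmp_reg(reg) + INT3
    return bytes([0x70 | (cc ^ 1), len(body)]) + body


if __name__ == "__main__":
    je = 0x4  # condition code for JE/JZ
    print(len(jcc_rel32(je)), len(inline_cond_tail_call(je, 0)),
          len(inline_cond_tail_call(je, 8)))
```

Under this assumed shape the replacement is 8 bytes for the low registers and
9 bytes for %r{8..15}, against a 6-byte original.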
