On Thu, Aug 21, 2025 at 11:29:35AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 21, 2025 at 12:26:37AM -0700, Kees Cook wrote:
> > Implement x86_64-specific KCFI backend:
> > 
> > - Function preamble generation with type IDs positioned at -(4+prefix_nops)
> >   offset from function entry point.
> > 
> > - 16-byte alignment of KCFI preambles using calculated prefix NOPs:
> >   aligned(prefix_nops + 5, 16) to maintain cache lines.
> > 
> > - Type-id hash avoids generating ENDBR instruction in type IDs
> >   (0xfa1e0ff3/0xfb1e0ff3 are incremented by 1 to prevent execution).
> > 
> > - On-demand scratch register allocation strategy (r11 as needed).
> >   The clobbers are available both early and late.
> > 
> > - Atomic bundled KCFI check + call/branch sequences using UNSPECV_KCFI
> >   to prevent optimizer separation and maintain security properties.
> > 
> > - Uses the .kcfi_traps section for debugger/runtime metadata.
> > 
> > Assembly Code Pattern layout required by Linux kernel:
> >   movl $inverse_type_id, %r10d  ; Load expected type (0 - hash)
> >   addl offset(%target), %r10d   ; Add stored type ID from preamble
> >   je .Lpass                     ; Branch if types match (sum == 0)
> >   .Ltrap: ud2                   ; Undefined instruction trap on mismatch
> >   .Lpass: call/jmp *%target     ; Execute validated indirect transfer
> > 
> > The initialization of the kcfi callbacks in ix86_option_override()
> > seems like a hack. I couldn't find a better place to do this.
> > 
> > Build and run tested on x86_64 Linux kernel with various CPU errata
> > handling alternatives and FineIBT.
> 
> I'm a little confused, does this force r11 to be the indirect call
> register like clang does? The code seems to suggest it is possible it
> uses another register.
> 
> The current kernel FineIBT code hard assumes r11 for now.

Oh, it looked like it wasn't always r11. Does clang force the call
register to be r11? I only do that here if the call expression isn't a
register (similar to -mindirect-branch-register). Looking at the retpoline
implementation, I see __x86_indirect_thunk_* being generated for all the
general registers. Hm, but looking now, I see all the hard-coded r11 use
in the FineIBT alternatives. I wonder if my boot testing is somehow not
triggering the FineIBT alternatives patching? I will investigate more...
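
For reference, here is a minimal C model of the check sequence quoted
above. It is only a sketch, not the GCC backend code. It assumes the
preamble is a 5-byte "movl $id, %eax" (opcode 0xb8, which I believe is
what clang emits) and picks a hypothetical prefix_nops of 11 so that
prefix_nops + 5 == 16. All function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value: 11 prefix NOPs + 5-byte "movl $id, %eax" = 16. */
enum { PREFIX_NOPS = 11, PREAMBLE_LEN = 5 + PREFIX_NOPS };

/* Bump a type ID whose immediate would encode ENDBR64/ENDBR32. */
static uint32_t kcfi_avoid_endbr(uint32_t type_id)
{
    if (type_id == 0xfa1e0ff3u || type_id == 0xfb1e0ff3u)
        return type_id + 1u;
    return type_id;
}

/* Lay out the assumed preamble followed by a one-byte function body. */
static void emit_function(uint8_t *buf, uint32_t type_id)
{
    buf[0] = 0xb8;                              /* assumed movl opcode */
    memcpy(buf + 1, &type_id, sizeof type_id);  /* 32-bit type ID */
    memset(buf + 5, 0x90, PREFIX_NOPS);         /* prefix NOPs */
    buf[PREAMBLE_LEN] = 0xc3;                   /* entry point: ret */
}

/* Model of the call-site check; returns nonzero where je would be taken. */
static int kcfi_check_passes(const uint8_t *entry, uint32_t expected_type_id)
{
    uint32_t stored, r10d;

    /* The type ID sits at -(4 + prefix_nops) from the entry point. */
    memcpy(&stored, entry - (4 + PREFIX_NOPS), sizeof stored);
    r10d = 0u - expected_type_id;   /* movl $inverse_type_id, %r10d */
    r10d += stored;                 /* addl offset(%target), %r10d */
    return r10d == 0;               /* je .Lpass, otherwise ud2 */
}

/* Small self-check tying the pieces together. */
static void kcfi_self_check(void)
{
    uint8_t buf[PREAMBLE_LEN + 1];
    uint32_t id = kcfi_avoid_endbr(0xfa1e0ff3u);

    emit_function(buf, id);
    assert(id == 0xfa1e0ff4u);
    assert(kcfi_check_passes(buf + PREAMBLE_LEN, id));
    assert(!kcfi_check_passes(buf + PREAMBLE_LEN, id + 1u));
}
```

Calling kcfi_self_check() from a main() exercises the model. Note that
it says nothing about which register holds the target, which is the
open r11 question here.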

-- 
Kees Cook
