[Bug c/107048] New: GCC lacks -fsanitize=kcfi

2022-09-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048

Bug ID: 107048
   Summary: GCC lacks -fsanitize=kcfi
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peterz at infradead dot org
  Target Milestone: ---

Please implement -fsanitize=kcfi to match llvm/clang:

 
https://github.com/samitolvanen/llvm-project/commit/f7bf6a87c4fd945800115a17b8b61390541fabd0

The Linux kernel patches are queued and slated for the next merge window.
Things like FineIBT rely on having this feature enabled.

[Bug target/104816] -fcf-protection=branch should generate endbr instead of notrack jumps

2022-05-24 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816

--- Comment #12 from peterz at infradead dot org ---
On Tue, May 24, 2022 at 04:06:08PM +, cvs-commit at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816
> 
> --- Comment #11 from CVS Commits  ---
> The master branch has been updated by H.J. Lu :
> 
> https://gcc.gnu.org/g:2f4f7de787e5844515d27b2269fc472f95a9916a
> 
> commit r13-744-g2f4f7de787e5844515d27b2269fc472f95a9916a
> Author: H.J. Lu 
> Date:   Fri Mar 11 12:51:34 2022 -0800
> 
> x86: Document -mcet-switch
> 
> When -fcf-protection=branch is used, the compiler will generate jump
> tables for switch statements where the indirect jump is prefixed with
> the NOTRACK prefix, so it can jump to non-ENDBR targets.  Since the
> indirect jump targets are generated by the compiler and stored in
> read-only memory, this does not result in a direct loss of hardening.
> But if the jump table index is attacker-controlled, the indirect jump
> may not be constrained by CET.

Notrack indirect jumps are fully susceptible to speculation attacks.

[Bug ipa/105160] [12 regression] ipa modref marks functions with asm volatile as const or pure

2022-04-06 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160

--- Comment #7 from peterz at infradead dot org ---
(In reply to peterz from comment #6)
> Happy accident; I've been wanting to allow doing something like:
> 
> static __always_inline __pure bool arch_static_branch(struct static_key *
> const key)
> {
> asm_volatile_goto("1:"
>   ".byte " __stringify(BYTES_NOP5) "\n\t"
>   ".pushsection __jump_table, \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   ".long 1b - ., %l[l_yes] - . \n\t"
>   _ASM_PTR "%c0 - . \n\t"
>   ".popsecion \n\t"
>   : : "i" (key) : : l_yes);
> return false;
> l_yes:
> return true;
> }
> 
> 
> Now, since asm-goto is implicitly volatile, and asm volatile (per this
> thread) destroys __pure due to side-effects, this doesn't actually work. But
> I'd like to still have the explicit pure attribute override this. If the
> programmer gets it wrong like this, he/she gets to keep the pieces.
> 
> Specifically in this case, he result of the function really only depends on
> the @key argument. Any call to this will have the exact same outcome and
> merging multiple instances is *good*.
> 
> Also see this thread on linux-toolchains:
> 
>  
> https://lore.kernel.org/all/YG80wg%2F2iZjXfCDJ@hirez.programming.kicks-ass.
> net/

Possible test case here:

  https://lore.kernel.org/all/yib0vyqmes9hr...@hirez.programming.kicks-ass.net/

[Bug ipa/105160] [12 regression] ipa modref marks functions with asm volatile as const or pure

2022-04-06 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160

peterz at infradead dot org changed:

   What|Removed |Added

 CC||peterz at infradead dot org

--- Comment #6 from peterz at infradead dot org ---
Happy accident; I've been wanting to allow doing something like:

static __always_inline __pure bool arch_static_branch(struct static_key * const
key)
{
asm_volatile_goto("1:"
  ".byte " __stringify(BYTES_NOP5) "\n\t"
  ".pushsection __jump_table, \"aw\" \n\t"
  _ASM_ALIGN "\n\t"
  ".long 1b - ., %l[l_yes] - . \n\t"
  _ASM_PTR "%c0 - . \n\t"
  ".popsecion \n\t"
  : : "i" (key) : : l_yes);
return false;
l_yes:
return true;
}


Now, since asm-goto is implicitly volatile, and asm volatile (per this thread)
destroys __pure due to side-effects, this doesn't actually work. But I'd like
to still have the explicit pure attribute override this. If the programmer gets
it wrong like this, he/she gets to keep the pieces.

Specifically in this case, he result of the function really only depends on the
@key argument. Any call to this will have the exact same outcome and merging
multiple instances is *good*.

Also see this thread on linux-toolchains:

 
https://lore.kernel.org/all/yg80wg%2f2izjxf...@hirez.programming.kicks-ass.net/

[Bug target/104816] -fcf-protection=branch should generate endbr instead of notrack jumps

2022-03-07 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816

--- Comment #6 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #5)
> (In reply to Andrew Cooper from comment #4)
> > I've worked around this in Xen with:
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;
> > h=9d4a44380d273de22d5753883cbf5581795ff24d and 
> > https://lore.kernel.org/lkml/YiXpv0q88paPHPqF@hirez.programming.kicks-ass.
> > net/ is pending for Linux.
> > 
> > IMO, it's an error that -fcf-protection=branch is not obeyed for jump
> > tables, and we don't want to end up in a situation where jump tables are
> > unusable with CET.
> 
> Are you suggesting to add an option to generate jump table with ENDBR?

I would suggest having -fcf-protection=branch generate ENDBR for jump-tables
and never generate NOTRACK prefix. Then add a mode that allows NOTRACK
prefixes, perhaps -fcf-protection=branch,notrack.

IBT without NOTRACK is the strongest form; it would be daft to require
additional parameters for that.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-11-16 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #25 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #24)
> Should I submit the current patches?

Yes, I'd say so. Once merged I'll send a kernel patch to use
-mindirect-branch-cs-prefix for all RETPOLINE builds. I posted an SLS patch but
haven't really had much feedback on that yet, we'll see.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-28 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #21 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #19)
> Created attachment 51685 [details]
> The v4 patch to add -mharden-sls=

I seem to have found one 'funny':

kernel/bpf/core.o: warning: objtool: ___bpf_prog_run()+0x23: unreachable
instruction

  27e6:   48 8b 14 d5 00 00 00mov0x0(,%rdx,8),%rdx
  27ed:   00
   27ea: R_X86_64_32S  .rodata..c_jump_table
  27ee:   e9 00 00 00 00  jmp27f3 <___bpf_prog_run+0x23>
   27ef: R_X86_64_PLT32__x86_indirect_thunk_rdx-0x4
  27f3:   cc  int3
  27f4:   cc  int3
  27f5:   8b 53 04mov0x4(%rbx),%edx


Other than that, it seems I need to go clean up our asm ...

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-28 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #20 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #19)
> Created attachment 51685 [details]
> The v4 patch to add -mharden-sls=

That looks to do the right thing! Let me go write more validation stuff to
double check. Thanks!

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #13 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #12)
> (In reply to peterz from comment #9)
> > Created attachment 51683 [details]
> > kernel patch to test -mharden-sls=all
> > 
> > $ make O=defconfig CC=gcc-12.0.0 arch/x86/entry/common.o
> > ...
> > arch/x86/entry/common.o: warning: objtool: do_SYSENTER_32()+0x1b:
> > unreachable instruction
> 
> Please try the v2 patch.

Per comment #6 this should be v3, no? Anyway, the good news is that I now seem
to have a kernel image with lots of extra int3 instructions, but all in the
right place.

*However*, I seem to be missing a few:

  36f4:   41 5f   pop%r15
  36f6:   e9 00 00 00 00  jmp36fb <__do_set_cpus_allowed+0x5b>
36f7: R_X86_64_PLT32__x86_indirect_thunk_rax-0x4
  36fb:   48 8b 87 90 02 00 00mov0x290(%rdi),%rax

There should be one after the jmp __x86_indirect_thunk_* thingy. I'll do an
objtool patch to search for missing int3, but that'll have to wait until
tomorrow, it's past midnight.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #10 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #4)
> Created attachment 51679 [details]
> A patch to add -mindirect-branch-cs-prefix
> 
> It adds CS prefix to call and jmp to thunk when converting indirect call
> and jump.

This seems to work as expected, I get CS prefix on all high reg thunk calls and
no CS prefix on the low reg calls. Example:

29241:   e8 00 00 00 00  call   29246 
 29242: R_X86_64_PLT32   __x86_indirect_thunk_rax-0x4
292c8:   2e e8 00 00 00 00   cs call 292ce
 292ca: R_X86_64_PLT32  
__x86_indirect_thunk_r10-0x4

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #9 from peterz at infradead dot org ---
Created attachment 51683
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51683=edit
kernel patch to test -mharden-sls=all

$ make O=defconfig CC=gcc-12.0.0 arch/x86/entry/common.o
...
arch/x86/entry/common.o: warning: objtool: do_SYSENTER_32()+0x1b: unreachable
instruction

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

peterz at infradead dot org changed:

   What|Removed |Added

 CC||peterz at infradead dot org

--- Comment #7 from peterz at infradead dot org ---
(In reply to H.J. Lu from comment #3)
> Created attachment 51678 [details]
> A patch to add -mharden-sls=
> 
> x86: Add -mharden-sls=[none|all|return|indirect-branch]
> 
> Generate code to mitigate against straight line speculation.

I'm getting (a lot) spurious int3 instructions with this, for example:

0280 :
 280:   48 81 8f 90 00 00 00 00 02 00 00orq$0x200,0x90(%rdi)
 28b:   48 8b 47 20 mov0x20(%rdi),%rax
 28f:   48 89 87 98 00 00 00mov%rax,0x98(%rdi)
 296:   e9 75 ff ff ff  jmp210 
 29b:   cc  int3

That's not an *indirect* jump there.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-27 Thread peterz at infradead dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

peterz at infradead dot org changed:

   What|Removed |Added

 CC||peterz at infradead dot org

--- Comment #5 from peterz at infradead dot org ---

(In reply to H.J. Lu from comment #1)
> (In reply to Andrew Cooper from comment #0)
> > 
> > Finally, one minor code generation improvement.  When GCC emits a direct
> > call/jmp to an ENDBR'd symbol, it can actually use sym+4 as an optimisation
> > to skip the ENDBR instruction (not needed for direct call/jmp's) and save on
> > decode bandwidth.
> 
> It is only safe for calling a static function whose address has been taken.

Could be done wider using LTO, or pushed into the linker if you're willing to
change the ELF format and augment it with STT_FUNC_BTI and R_*_BTI32, there the
new relocation would mean +4 (or rather, skip landing pad) when aimed at
STT_FUNC_BTI and be identical to _PC32 otherwise.

The ELF thing doesn't help with reducing ENDBR emissions for global symbols
since we can't ever tell who will take the address, but it will help with
directly calling avoiding the landing pad.

It also allows for a 'pseudo' LTO thing to 'seal' an executable, where it will
strip the ENDBRs for all symbols that do not have a _PC32 rela. We can use
EXPORT_SYMBOL*() to generate one such on purpose to avoid exported functions
from being sealed.

Anyway, I think there's much value in reducing the number of ENDBR instructions
as much as possible and we should investigate how the toolchains can best help
there.