Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On 02/28/2018 02:39 AM, Steve Beattie wrote: > Hi Jeff, > > On Thu, Feb 22, 2018 at 10:10:13AM -0700, Jeff Law wrote: >> A few notes. >> >> 1. It's not even clear at this time that retpolining user space binaries >> makes any sense at all. SO before doing anything to make this easier >> I'd like to see a justification for why it's really needed. > > Do you have a reference that gives evidence that retpolining user > space is not needed or not preferred for x86? Everything that I've > seen has suggested user space to user space attacks are possible, > if difficult. And it does not seem likely that microcode updates will > occur for all processor generations out there. No reference, but that's general conclusion we've come to internally with input from Intel. Essentially you use the newly exposed capabilities from the microcode updates to flush the state of the indirect predictor at context switch time. Combine that with the microcode change to stop sharing indirect branch prediction state between SMT threads. I'm being a bit imprecise, but that's the gist. > >> 2. On the other hand, the existing thunk options do make it easier to >> test independent of hte kernel. ie, I can turn on inline thunks by >> default and test things in user space (ie, do thunks generally work >> properly). > > If thunk-extern is to be the only maintained option, and its deemed > sensible for user space in at least some situations, is there a > preferred location for the thunks to end up? Ideally you'd have one set of thunks per DSO. Otherwise you end up doing an indirect branch for the cross-DSO call to get to the thunk which largely defeats the purpose of the thunks to begin with. > > (I ask these questions because you can already find individual users > recompiling apps important to them with retpoline options, and there > is pressure (with associated deadlines) in some quarters to rebuild > vast tracts of user space with retpolines for x86.) I know. I've given up trying to educate all of them. I expected all along that well intentioned, but ultimately clueless, folks would start doing this kind of thing. jeff signature.asc Description: OpenPGP digital signature
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
Hi Jeff, On Thu, Feb 22, 2018 at 10:10:13AM -0700, Jeff Law wrote: > A few notes. > > 1. It's not even clear at this time that retpolining user space binaries > makes any sense at all. SO before doing anything to make this easier > I'd like to see a justification for why it's really needed. Do you have a reference that gives evidence that retpolining user space is not needed or not preferred for x86? Everything that I've seen has suggested user space to user space attacks are possible, if difficult. And it does not seem likely that microcode updates will occur for all processor generations out there. > 2. On the other hand, the existing thunk options do make it easier to > test independent of hte kernel. ie, I can turn on inline thunks by > default and test things in user space (ie, do thunks generally work > properly). If thunk-extern is to be the only maintained option, and its deemed sensible for user space in at least some situations, is there a preferred location for the thunks to end up? (I ask these questions because you can already find individual users recompiling apps important to them with retpoline options, and there is pressure (with associated deadlines) in some quarters to rebuild vast tracts of user space with retpolines for x86.) Thanks. -- Steve Beattiehttp://NxNW.org/~steve/ signature.asc Description: PGP signature
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
> On Mon, Feb 26, 2018 at 8:54 AM, Jan Hubickawrote: > >> On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubicka wrote: > >> > Hi, > >> > my main concern about the patch is that we now have > >> > -mindirect-branch=thunk-extern > >> > which is intended to work well and is used by kernel, but we also have > >> > other modes > >> > that are documented and as such they should work but they may lead to > >> > invalid > >> > unwind info (or did I miss anything imporant here?). > >> > Why we can't fix the others as well? > >> > > >> > >> I took a closer look at my commit message. It does leave an impression > >> that > >> only -mindirect-branch=thunk-extern is fixed. But in fact, all > >> -mindirect-branch= > >> choices are fixed. > > > > I see, sorry for confussion! > >> > >> 1. -mindirect-branch=thunk generates: > >> > >>.cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> movq16(%rax), %rax > >> call__x86_indirect_thunk_rax > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> jmp __x86_indirect_thunk_rax > >> .cfi_endproc > >> > >> 2. -mindirect-branch=thunk-inline generates: > >> > >>.cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> movq16(%rax), %rax > >> jmp .LIND1 > >> .LIND0: > >> call.LIND3 > >> .LIND2: > >> pause > >> lfence > >> jmp .LIND2 > >> .LIND3: > >> mov %rax, (%rsp) > >> ret > >> .LIND1: > >> call.LIND0 > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> call.LIND5 > >> .LIND4: > >> pause > >> lfence > >> jmp .LIND4 > >> .LIND5: > >> mov %rax, (%rsp) > >> ret > >> .cfi_endproc > >> > >> I updated the commit message with > >> > >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect > >> branch via register whenever -mindirect-branch= is used. > >> > >> OK for trunk? > >> > >> Thanks. > >> > >> -- > >> H.J. > > > >> From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001 > >> From: "H.J. Lu" > >> Date: Fri, 26 Jan 2018 15:54:25 -0800 > >> Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER > >> > >> For > >> > >> --- > >> struct C { > >> virtual ~C(); > >> virtual void f(); > >> }; > >> > >> void > >> f (C *p) > >> { > >> p->f(); > >> p->f(); > >> } > >> --- > >> > >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > >> > >> _Z1fP1C: > >> .LFB0: > >> .cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> jmp .LIND1 > >> .LIND0: > >> pushq 16(%rax) > >> jmp __x86_indirect_thunk > >> .LIND1: > >> call.LIND0 > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> jmp __x86_indirect_thunk_rax > >> .cfi_endproc > >> > >> x86-64 is supposed to have asynchronous unwind tables by default, but > >> there is nothing that reflects the change in the (relative) frame > >> address after .LIND0. That region really has to be moved outside of > >> the .cfi_startproc/.cfi_endproc bracket. > >> > >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect > >> branch via register whenever -mindirect-branch= is used. Now, > >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > >> > >> _Z1fP1C: > >> .LFB0: > >> .cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> movq16(%rax), %rax > >> call__x86_indirect_thunk_rax > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> jmp __x86_indirect_thunk_rax > >> .cfi_endproc > >> > >> so that "-mindirect-branch=thunk-extern" is equivalent to > >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is > >> used by Linux kernel. > >> > >> gcc/ > >> > >> PR target/84039 > >> * config/i386/constraints.md (Bs): Replace > >> ix86_indirect_branch_register with > >> TARGET_INDIRECT_BRANCH_REGISTER. > >> (Bw): Likewise. > >> *
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Mon, Feb 26, 2018 at 8:54 AM, Jan Hubickawrote: >> On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubicka wrote: >> > Hi, >> > my main concern about the patch is that we now have >> > -mindirect-branch=thunk-extern >> > which is intended to work well and is used by kernel, but we also have >> > other modes >> > that are documented and as such they should work but they may lead to >> > invalid >> > unwind info (or did I miss anything imporant here?). >> > Why we can't fix the others as well? >> > >> >> I took a closer look at my commit message. It does leave an impression that >> only -mindirect-branch=thunk-extern is fixed. But in fact, all >> -mindirect-branch= >> choices are fixed. > > I see, sorry for confussion! >> >> 1. -mindirect-branch=thunk generates: >> >>.cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> movq16(%rax), %rax >> call__x86_indirect_thunk_rax >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> jmp __x86_indirect_thunk_rax >> .cfi_endproc >> >> 2. -mindirect-branch=thunk-inline generates: >> >>.cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> movq16(%rax), %rax >> jmp .LIND1 >> .LIND0: >> call.LIND3 >> .LIND2: >> pause >> lfence >> jmp .LIND2 >> .LIND3: >> mov %rax, (%rsp) >> ret >> .LIND1: >> call.LIND0 >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> call.LIND5 >> .LIND4: >> pause >> lfence >> jmp .LIND4 >> .LIND5: >> mov %rax, (%rsp) >> ret >> .cfi_endproc >> >> I updated the commit message with >> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect >> branch via register whenever -mindirect-branch= is used. >> >> OK for trunk? >> >> Thanks. >> >> -- >> H.J. > >> From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001 >> From: "H.J. Lu" >> Date: Fri, 26 Jan 2018 15:54:25 -0800 >> Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER >> >> For >> >> --- >> struct C { >> virtual ~C(); >> virtual void f(); >> }; >> >> void >> f (C *p) >> { >> p->f(); >> p->f(); >> } >> --- >> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> _Z1fP1C: >> .LFB0: >> .cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> jmp .LIND1 >> .LIND0: >> pushq 16(%rax) >> jmp __x86_indirect_thunk >> .LIND1: >> call.LIND0 >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> jmp __x86_indirect_thunk_rax >> .cfi_endproc >> >> x86-64 is supposed to have asynchronous unwind tables by default, but >> there is nothing that reflects the change in the (relative) frame >> address after .LIND0. That region really has to be moved outside of >> the .cfi_startproc/.cfi_endproc bracket. >> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect >> branch via register whenever -mindirect-branch= is used. Now, >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> _Z1fP1C: >> .LFB0: >> .cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> movq16(%rax), %rax >> call__x86_indirect_thunk_rax >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> jmp __x86_indirect_thunk_rax >> .cfi_endproc >> >> so that "-mindirect-branch=thunk-extern" is equivalent to >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is >> used by Linux kernel. >> >> gcc/ >> >> PR target/84039 >> * config/i386/constraints.md (Bs): Replace >> ix86_indirect_branch_register with >> TARGET_INDIRECT_BRANCH_REGISTER. >> (Bw): Likewise. >> * config/i386/i386.md (indirect_jump): Likewise. >> (tablejump): Likewise. >> (*sibcall_memory): Likewise. >> (*sibcall_value_memory): Likewise. >> Peepholes of indirect call and jump via memory: Likewise. >> (*sibcall_GOT_32): Disallowed for TARGET_INDIRECT_BRANCH_REGISTER. >> (*sibcall_value_GOT_32): Likewise.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
> On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubickawrote: > > Hi, > > my main concern about the patch is that we now have > > -mindirect-branch=thunk-extern > > which is intended to work well and is used by kernel, but we also have > > other modes > > that are documented and as such they should work but they may lead to > > invalid > > unwind info (or did I miss anything imporant here?). > > Why we can't fix the others as well? > > > > I took a closer look at my commit message. It does leave an impression that > only -mindirect-branch=thunk-extern is fixed. But in fact, all > -mindirect-branch= > choices are fixed. I see, sorry for confussion! > > 1. -mindirect-branch=thunk generates: > >.cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq(%rdi), %rax > movq%rdi, %rbx > movq16(%rax), %rax > call__x86_indirect_thunk_rax > movq(%rbx), %rax > movq%rbx, %rdi > popq%rbx > .cfi_def_cfa_offset 8 > movq16(%rax), %rax > jmp __x86_indirect_thunk_rax > .cfi_endproc > > 2. -mindirect-branch=thunk-inline generates: > >.cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq(%rdi), %rax > movq%rdi, %rbx > movq16(%rax), %rax > jmp .LIND1 > .LIND0: > call.LIND3 > .LIND2: > pause > lfence > jmp .LIND2 > .LIND3: > mov %rax, (%rsp) > ret > .LIND1: > call.LIND0 > movq(%rbx), %rax > movq%rbx, %rdi > popq%rbx > .cfi_def_cfa_offset 8 > movq16(%rax), %rax > call.LIND5 > .LIND4: > pause > lfence > jmp .LIND4 > .LIND5: > mov %rax, (%rsp) > ret > .cfi_endproc > > I updated the commit message with > > This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect > branch via register whenever -mindirect-branch= is used. > > OK for trunk? > > Thanks. > > -- > H.J. > From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" > Date: Fri, 26 Jan 2018 15:54:25 -0800 > Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER > > For > > --- > struct C { > virtual ~C(); > virtual void f(); > }; > > void > f (C *p) > { > p->f(); > p->f(); > } > --- > > -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > > _Z1fP1C: > .LFB0: > .cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq(%rdi), %rax > movq%rdi, %rbx > jmp .LIND1 > .LIND0: > pushq 16(%rax) > jmp __x86_indirect_thunk > .LIND1: > call.LIND0 > movq(%rbx), %rax > movq%rbx, %rdi > popq%rbx > .cfi_def_cfa_offset 8 > movq16(%rax), %rax > jmp __x86_indirect_thunk_rax > .cfi_endproc > > x86-64 is supposed to have asynchronous unwind tables by default, but > there is nothing that reflects the change in the (relative) frame > address after .LIND0. That region really has to be moved outside of > the .cfi_startproc/.cfi_endproc bracket. > > This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect > branch via register whenever -mindirect-branch= is used. Now, > -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > > _Z1fP1C: > .LFB0: > .cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq(%rdi), %rax > movq%rdi, %rbx > movq16(%rax), %rax > call__x86_indirect_thunk_rax > movq(%rbx), %rax > movq%rbx, %rdi > popq%rbx > .cfi_def_cfa_offset 8 > movq16(%rax), %rax > jmp __x86_indirect_thunk_rax > .cfi_endproc > > so that "-mindirect-branch=thunk-extern" is equivalent to > "-mindirect-branch=thunk-extern -mindirect-branch-register", which is > used by Linux kernel. > > gcc/ > > PR target/84039 > * config/i386/constraints.md (Bs): Replace > ix86_indirect_branch_register with > TARGET_INDIRECT_BRANCH_REGISTER. > (Bw): Likewise. > * config/i386/i386.md (indirect_jump): Likewise. > (tablejump): Likewise. > (*sibcall_memory): Likewise. > (*sibcall_value_memory): Likewise. > Peepholes of indirect call and jump via memory: Likewise. > (*sibcall_GOT_32): Disallowed for TARGET_INDIRECT_BRANCH_REGISTER. > (*sibcall_value_GOT_32): Likewise. > * config/i386/i386.opt: Likewise. > * config/i386/predicates.md (indirect_branch_operand): Likewise. > (GOT_memory_operand): Likewise. > (call_insn_operand): Likewise. >
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubickawrote: > Hi, > my main concern about the patch is that we now have > -mindirect-branch=thunk-extern > which is intended to work well and is used by kernel, but we also have other > modes > that are documented and as such they should work but they may lead to invalid > unwind info (or did I miss anything imporant here?). > Why we can't fix the others as well? > I took a closer look at my commit message. It does leave an impression that only -mindirect-branch=thunk-extern is fixed. But in fact, all -mindirect-branch= choices are fixed. 1. -mindirect-branch=thunk generates: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq(%rdi), %rax movq%rdi, %rbx movq16(%rax), %rax call__x86_indirect_thunk_rax movq(%rbx), %rax movq%rbx, %rdi popq%rbx .cfi_def_cfa_offset 8 movq16(%rax), %rax jmp __x86_indirect_thunk_rax .cfi_endproc 2. -mindirect-branch=thunk-inline generates: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq(%rdi), %rax movq%rdi, %rbx movq16(%rax), %rax jmp .LIND1 .LIND0: call.LIND3 .LIND2: pause lfence jmp .LIND2 .LIND3: mov %rax, (%rsp) ret .LIND1: call.LIND0 movq(%rbx), %rax movq%rbx, %rdi popq%rbx .cfi_def_cfa_offset 8 movq16(%rax), %rax call.LIND5 .LIND4: pause lfence jmp .LIND4 .LIND5: mov %rax, (%rsp) ret .cfi_endproc I updated the commit message with This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect branch via register whenever -mindirect-branch= is used. OK for trunk? Thanks. -- H.J. From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Fri, 26 Jan 2018 15:54:25 -0800 Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER For --- struct C { virtual ~C(); virtual void f(); }; void f (C *p) { p->f(); p->f(); } --- -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: _Z1fP1C: .LFB0: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq(%rdi), %rax movq%rdi, %rbx jmp .LIND1 .LIND0: pushq 16(%rax) jmp __x86_indirect_thunk .LIND1: call.LIND0 movq(%rbx), %rax movq%rbx, %rdi popq%rbx .cfi_def_cfa_offset 8 movq16(%rax), %rax jmp __x86_indirect_thunk_rax .cfi_endproc x86-64 is supposed to have asynchronous unwind tables by default, but there is nothing that reflects the change in the (relative) frame address after .LIND0. That region really has to be moved outside of the .cfi_startproc/.cfi_endproc bracket. This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect branch via register whenever -mindirect-branch= is used. Now, -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: _Z1fP1C: .LFB0: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq (%rdi), %rax movq %rdi, %rbx movq 16(%rax), %rax call __x86_indirect_thunk_rax movq (%rbx), %rax movq %rbx, %rdi popq %rbx .cfi_def_cfa_offset 8 movq 16(%rax), %rax jmp __x86_indirect_thunk_rax .cfi_endproc so that "-mindirect-branch=thunk-extern" is equivalent to "-mindirect-branch=thunk-extern -mindirect-branch-register", which is used by Linux kernel. gcc/ PR target/84039 * config/i386/constraints.md (Bs): Replace ix86_indirect_branch_register with TARGET_INDIRECT_BRANCH_REGISTER. (Bw): Likewise. * config/i386/i386.md (indirect_jump): Likewise. (tablejump): Likewise. (*sibcall_memory): Likewise. (*sibcall_value_memory): Likewise. Peepholes of indirect call and jump via memory: Likewise. (*sibcall_GOT_32): Disallowed for TARGET_INDIRECT_BRANCH_REGISTER. (*sibcall_value_GOT_32): Likewise. * config/i386/i386.opt: Likewise. * config/i386/predicates.md (indirect_branch_operand): Likewise. (GOT_memory_operand): Likewise. (call_insn_operand): Likewise. (sibcall_insn_operand): Likewise. (GOT32_symbol_operand): Likewise. * config/i386/i386.h (TARGET_INDIRECT_BRANCH_REGISTER): New. gcc/testsuite/ PR target/84039 * gcc.target/i386/indirect-thunk-1.c: Updated. * gcc.target/i386/indirect-thunk-2.c: Likewise. * gcc.target/i386/indirect-thunk-3.c: Likewise. * gcc.target/i386/indirect-thunk-4.c: Likewise. * gcc.target/i386/indirect-thunk-5.c: Likewise. * gcc.target/i386/indirect-thunk-6.c: Likewise. * gcc.target/i386/indirect-thunk-7.c: Likewise. * gcc.target/i386/indirect-thunk-attr-1.c: Likewise. *
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
Hi, my main concern about the patch is that we now have -mindirect-branch=thunk-extern which is intended to work well and is used by kernel, but we also have other modes that are documented and as such they should work but they may lead to invalid unwind info (or did I miss anything imporant here?). Why we can't fix the others as well? Honza
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Thu, Feb 22, 2018 at 9:32 AM, H.J. Luwrote: > On Thu, Feb 22, 2018 at 9:10 AM, Jeff Law wrote: >> On 02/22/2018 07:38 AM, Jan Hubicka wrote: >> >> Hi Jan, >> >> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html >> >> Is OK for trunk? > > I see that using register makes the problem go away and pushing address > to stack > seemed bit odd anyway. However how does this work on other types of thunk? Kernel only uses -mindirect-branch=thunk-extern. I am working on a proposal to use -mindirect-branch=thunk-extern in user space to support CET in a single binary. So at the end of the day, only -mindirect-branch=thunk-extern will be used. >>> >>> OK, so it is about the fact that we do not really want to support all >>> -mindirect-branch options in the future? If we don't want to support the >>> correctly, >>> I wonder why we are including them at all. Shall we at least output >>> warning/sorry >>> when user tries other thunk type with stack unwinding enabled? >>> (does Kernel use it?) >> A few notes. >> >> 1. It's not even clear at this time that retpolining user space binaries >> makes any sense at all. SO before doing anything to make this easier >> I'd like to see a justification for why it's really needed. > > Hi Jeff, > > Which part were commenting? My patch to add TARGET_INDIRECT_BRANCH_REGISTER > or removing -mindirect-branch choices? Is my patch OK for trunk? >> 2. On the other hand, the existing thunk options do make it easier to >> test independent of hte kernel. ie, I can turn on inline thunks by >> default and test things in user space (ie, do thunks generally work >> properly). > > It sounds reasonable. > Thanks. -- H.J.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Thu, Feb 22, 2018 at 9:10 AM, Jeff Lawwrote: > On 02/22/2018 07:38 AM, Jan Hubicka wrote: > > Hi Jan, > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html > > Is OK for trunk? I see that using register makes the problem go away and pushing address to stack seemed bit odd anyway. However how does this work on other types of thunk? >>> >>> Kernel only uses -mindirect-branch=thunk-extern. I am working on a >>> proposal >>> to use -mindirect-branch=thunk-extern in user space to support CET in a >>> single >>> binary. So at the end of the day, only >>> -mindirect-branch=thunk-extern will be used. >> >> OK, so it is about the fact that we do not really want to support all >> -mindirect-branch options in the future? If we don't want to support the >> correctly, >> I wonder why we are including them at all. Shall we at least output >> warning/sorry >> when user tries other thunk type with stack unwinding enabled? >> (does Kernel use it?) > A few notes. > > 1. It's not even clear at this time that retpolining user space binaries > makes any sense at all. SO before doing anything to make this easier > I'd like to see a justification for why it's really needed. Hi Jeff, Which part were commenting? My patch to add TARGET_INDIRECT_BRANCH_REGISTER or removing -mindirect-branch choices? > 2. On the other hand, the existing thunk options do make it easier to > test independent of hte kernel. ie, I can turn on inline thunks by > default and test things in user space (ie, do thunks generally work > properly). It sounds reasonable. -- H.J.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On 02/22/2018 07:38 AM, Jan Hubicka wrote: Hi Jan, https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html Is OK for trunk? >>> >>> I see that using register makes the problem go away and pushing address to >>> stack >>> seemed bit odd anyway. However how does this work on other types of thunk? >> >> Kernel only uses -mindirect-branch=thunk-extern. I am working on a proposal >> to use -mindirect-branch=thunk-extern in user space to support CET in a >> single >> binary. So at the end of the day, only >> -mindirect-branch=thunk-extern will be used. > > OK, so it is about the fact that we do not really want to support all > -mindirect-branch options in the future? If we don't want to support the > correctly, > I wonder why we are including them at all. Shall we at least output > warning/sorry > when user tries other thunk type with stack unwinding enabled? > (does Kernel use it?) A few notes. 1. It's not even clear at this time that retpolining user space binaries makes any sense at all. SO before doing anything to make this easier I'd like to see a justification for why it's really needed. 2. On the other hand, the existing thunk options do make it easier to test independent of hte kernel. ie, I can turn on inline thunks by default and test things in user space (ie, do thunks generally work properly). Jeff
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Thu, Feb 22, 2018 at 6:38 AM, Jan Hubickawrote: >> >> >> >> Hi Jan, >> >> >> >> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html >> >> >> >> Is OK for trunk? >> > >> > I see that using register makes the problem go away and pushing address to >> > stack >> > seemed bit odd anyway. However how does this work on other types of thunk? >> >> Kernel only uses -mindirect-branch=thunk-extern. I am working on a proposal >> to use -mindirect-branch=thunk-extern in user space to support CET in a >> single >> binary. So at the end of the day, only >> -mindirect-branch=thunk-extern will be used. > > OK, so it is about the fact that we do not really want to support all > -mindirect-branch options in the future? If we don't want to support the > correctly, > I wonder why we are including them at all. Shall we at least output > warning/sorry -mindirect-branch= implementation changed over time while experimenting different approaches. I didn't remove the old ones when I added new ones. We should remove others and only keep -mindirect-branch=thunk-extern with my proposal to support CET with -mindirect-branch=thunk-extern. > when user tries other thunk type with stack unwinding enabled? This is on my TODO list. > (does Kernel use it?) No, kernel doesn't use it. -- H.J.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
> >> > >> Hi Jan, > >> > >> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html > >> > >> Is OK for trunk? > > > > I see that using register makes the problem go away and pushing address to > > stack > > seemed bit odd anyway. However how does this work on other types of thunk? > > Kernel only uses -mindirect-branch=thunk-extern. I am working on a proposal > to use -mindirect-branch=thunk-extern in user space to support CET in a single > binary. So at the end of the day, only > -mindirect-branch=thunk-extern will be used. OK, so it is about the fact that we do not really want to support all -mindirect-branch options in the future? If we don't want to support the correctly, I wonder why we are including them at all. Shall we at least output warning/sorry when user tries other thunk type with stack unwinding enabled? (does Kernel use it?) Honza > > > -- > H.J.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Thu, Feb 22, 2018 at 6:29 AM, Jan Hubickawrote: >> On Sun, Jan 28, 2018 at 11:56 AM, H.J. Lu wrote: >> > On Sat, Jan 27, 2018 at 2:12 PM, H.J. Lu wrote: >> >> For >> >> >> >> --- >> >> struct C { >> >> virtual ~C(); >> >> virtual void f(); >> >> }; >> >> >> >> void >> >> f (C *p) >> >> { >> >> p->f(); >> >> p->f(); >> >> } >> >> --- >> >> >> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> >> >> _Z1fP1C: >> >> .LFB0: >> >> .cfi_startproc >> >> pushq %rbx >> >> .cfi_def_cfa_offset 16 >> >> .cfi_offset 3, -16 >> >> movq(%rdi), %rax >> >> movq%rdi, %rbx >> >> jmp .LIND1 >> >> .LIND0: >> >> pushq 16(%rax) >> >> jmp __x86_indirect_thunk >> >> .LIND1: >> >> call.LIND0 >> >> movq(%rbx), %rax >> >> movq%rbx, %rdi >> >> popq%rbx >> >> .cfi_def_cfa_offset 8 >> >> movq16(%rax), %rax >> >> jmp __x86_indirect_thunk_rax >> >> .cfi_endproc >> >> >> >> x86-64 is supposed to have asynchronous unwind tables by default, but >> >> there is nothing that reflects the change in the (relative) frame >> >> address after .LIND0. That region really has to be moved outside of >> >> the .cfi_startproc/.cfi_endproc bracket. >> >> >> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect >> >> branch via register when -mindirect-branch=thunk-extern is used. Now, >> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> >> >> _Z1fP1C: >> >> .LFB0: >> >> .cfi_startproc >> >> pushq %rbx >> >> .cfi_def_cfa_offset 16 >> >> .cfi_offset 3, -16 >> >> movq(%rdi), %rax >> >> movq%rdi, %rbx >> >> movq16(%rax), %rax >> >> call__x86_indirect_thunk_rax >> >> movq(%rbx), %rax >> >> movq%rbx, %rdi >> >> popq%rbx >> >> .cfi_def_cfa_offset 8 >> >> movq16(%rax), %rax >> >> jmp __x86_indirect_thunk_rax >> >> .cfi_endproc >> >> >> >> Now "-mindirect-branch=thunk-extern" is equivalent to >> >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is >> >> used by Linux kernel. >> >> >> >> Tested on i686 and x86-64. OK for trunk? >> >> >> >> Thanks. >> >> >> >> H.J. >> >> >> >> gcc/ >> >> >> >> PR target/84039 >> >> * config/i386/constraints.md (Bs): Replace >> >> ix86_indirect_branch_register with >> >> TARGET_INDIRECT_BRANCH_REGISTER. >> >> (Bw): Likewise. >> >> * config/i386/i386.md (indirect_jump): Likewise. >> >> (tablejump): Likewise. >> >> (*sibcall_memory): Likewise. >> >> (*sibcall_value_memory): Likewise. >> >> Peepholes of indirect call and jump via memory: Likewise. >> >> * config/i386/i386.opt: Likewise. >> >> * config/i386/predicates.md (indirect_branch_operand): Likewise. >> >> (GOT_memory_operand): Likewise. >> >> (call_insn_operand): Likewise. >> >> (sibcall_insn_operand): Likewise. >> >> (GOT32_symbol_operand): Likewise. >> >> * config/i386/i386.h (TARGET_INDIRECT_BRANCH_REGISTER): New. >> >> >> > >> > Here is the updated patch to disallow *sibcall_GOT_32 and >> > *sibcall_value_GOT_32 >> > for TARGET_INDIRECT_BRANCH_REGISTER. >> > >> > Tested on i686 and x86-64. OK for trunk? >> > >> >> Hi Jan, >> >> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html >> >> Is OK for trunk? > > I see that using register makes the problem go away and pushing address to > stack > seemed bit odd anyway. However how does this work on other types of thunk? Kernel only uses -mindirect-branch=thunk-extern. I am working on a proposal to use -mindirect-branch=thunk-extern in user space to support CET in a single binary. So at the end of the day, only -mindirect-branch=thunk-extern will be used. -- H.J.
Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
> On Sun, Jan 28, 2018 at 11:56 AM, H.J. Luwrote: > > On Sat, Jan 27, 2018 at 2:12 PM, H.J. Lu wrote: > >> For > >> > >> --- > >> struct C { > >> virtual ~C(); > >> virtual void f(); > >> }; > >> > >> void > >> f (C *p) > >> { > >> p->f(); > >> p->f(); > >> } > >> --- > >> > >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > >> > >> _Z1fP1C: > >> .LFB0: > >> .cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> jmp .LIND1 > >> .LIND0: > >> pushq 16(%rax) > >> jmp __x86_indirect_thunk > >> .LIND1: > >> call.LIND0 > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> jmp __x86_indirect_thunk_rax > >> .cfi_endproc > >> > >> x86-64 is supposed to have asynchronous unwind tables by default, but > >> there is nothing that reflects the change in the (relative) frame > >> address after .LIND0. That region really has to be moved outside of > >> the .cfi_startproc/.cfi_endproc bracket. > >> > >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect > >> branch via register when -mindirect-branch=thunk-extern is used. Now, > >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: > >> > >> _Z1fP1C: > >> .LFB0: > >> .cfi_startproc > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> .cfi_offset 3, -16 > >> movq(%rdi), %rax > >> movq%rdi, %rbx > >> movq16(%rax), %rax > >> call__x86_indirect_thunk_rax > >> movq(%rbx), %rax > >> movq%rbx, %rdi > >> popq%rbx > >> .cfi_def_cfa_offset 8 > >> movq16(%rax), %rax > >> jmp __x86_indirect_thunk_rax > >> .cfi_endproc > >> > >> Now "-mindirect-branch=thunk-extern" is equivalent to > >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is > >> used by Linux kernel. > >> > >> Tested on i686 and x86-64. OK for trunk? > >> > >> Thanks. > >> > >> H.J. > >> > >> gcc/ > >> > >> PR target/84039 > >> * config/i386/constraints.md (Bs): Replace > >> ix86_indirect_branch_register with > >> TARGET_INDIRECT_BRANCH_REGISTER. > >> (Bw): Likewise. > >> * config/i386/i386.md (indirect_jump): Likewise. > >> (tablejump): Likewise. > >> (*sibcall_memory): Likewise. > >> (*sibcall_value_memory): Likewise. > >> Peepholes of indirect call and jump via memory: Likewise. > >> * config/i386/i386.opt: Likewise. > >> * config/i386/predicates.md (indirect_branch_operand): Likewise. > >> (GOT_memory_operand): Likewise. > >> (call_insn_operand): Likewise. > >> (sibcall_insn_operand): Likewise. > >> (GOT32_symbol_operand): Likewise. > >> * config/i386/i386.h (TARGET_INDIRECT_BRANCH_REGISTER): New. > >> > > > > Here is the updated patch to disallow *sibcall_GOT_32 and > > *sibcall_value_GOT_32 > > for TARGET_INDIRECT_BRANCH_REGISTER. > > > > Tested on i686 and x86-64. OK for trunk? > > > > Hi Jan, > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html > > Is OK for trunk? I see that using register makes the problem go away and pushing address to stack seemed bit odd anyway. However how does this work on other types of thunk? Honza
PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
On Sun, Jan 28, 2018 at 11:56 AM, H.J. Luwrote: > On Sat, Jan 27, 2018 at 2:12 PM, H.J. Lu wrote: >> For >> >> --- >> struct C { >> virtual ~C(); >> virtual void f(); >> }; >> >> void >> f (C *p) >> { >> p->f(); >> p->f(); >> } >> --- >> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> _Z1fP1C: >> .LFB0: >> .cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> jmp .LIND1 >> .LIND0: >> pushq 16(%rax) >> jmp __x86_indirect_thunk >> .LIND1: >> call.LIND0 >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> jmp __x86_indirect_thunk_rax >> .cfi_endproc >> >> x86-64 is supposed to have asynchronous unwind tables by default, but >> there is nothing that reflects the change in the (relative) frame >> address after .LIND0. That region really has to be moved outside of >> the .cfi_startproc/.cfi_endproc bracket. >> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect >> branch via register when -mindirect-branch=thunk-extern is used. Now, >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates: >> >> _Z1fP1C: >> .LFB0: >> .cfi_startproc >> pushq %rbx >> .cfi_def_cfa_offset 16 >> .cfi_offset 3, -16 >> movq(%rdi), %rax >> movq%rdi, %rbx >> movq16(%rax), %rax >> call__x86_indirect_thunk_rax >> movq(%rbx), %rax >> movq%rbx, %rdi >> popq%rbx >> .cfi_def_cfa_offset 8 >> movq16(%rax), %rax >> jmp __x86_indirect_thunk_rax >> .cfi_endproc >> >> Now "-mindirect-branch=thunk-extern" is equivalent to >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is >> used by Linux kernel. >> >> Tested on i686 and x86-64. OK for trunk? >> >> Thanks. >> >> H.J. >> >> gcc/ >> >> PR target/84039 >> * config/i386/constraints.md (Bs): Replace >> ix86_indirect_branch_register with >> TARGET_INDIRECT_BRANCH_REGISTER. >> (Bw): Likewise. >> * config/i386/i386.md (indirect_jump): Likewise. >> (tablejump): Likewise. >> (*sibcall_memory): Likewise. >> (*sibcall_value_memory): Likewise. >> Peepholes of indirect call and jump via memory: Likewise. >> * config/i386/i386.opt: Likewise. >> * config/i386/predicates.md (indirect_branch_operand): Likewise. >> (GOT_memory_operand): Likewise. >> (call_insn_operand): Likewise. >> (sibcall_insn_operand): Likewise. >> (GOT32_symbol_operand): Likewise. >> * config/i386/i386.h (TARGET_INDIRECT_BRANCH_REGISTER): New. >> > > Here is the updated patch to disallow *sibcall_GOT_32 and > *sibcall_value_GOT_32 > for TARGET_INDIRECT_BRANCH_REGISTER. > > Tested on i686 and x86-64. OK for trunk? > Hi Jan, https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html Is OK for trunk? Thanks. -- H.J.