On Thu, Jan 11, 2018 at 2:46 PM, Jeff Law <l...@redhat.com> wrote: > On 01/07/2018 03:59 PM, H.J. Lu wrote: >> Add -mindirect-branch= option to convert indirect call and jump to call >> and return thunks. The default is 'keep', which keeps indirect call and >> jump unmodified. 'thunk' converts indirect call and jump to call and >> return thunk. 'thunk-inline' converts indirect call and jump to inlined >> call and return thunk. 'thunk-extern' converts indirect call and jump to >> external call and return thunk provided in a separate object file. You >> can control this behavior for a specific function by using the function >> attribute indirect_branch. >> >> 2 kinds of thunks are geneated. Memory thunk where the function address >> is at the top of the stack: >> >> __x86_indirect_thunk: >> call L2 >> L1: >> lfence >> jmp L1 >> L2: >> lea 8(%rsp), %rsp|lea 4(%esp), %esp >> ret >> >> Indirect jmp via memory, "jmp mem", is converted to >> >> push memory >> jmp __x86_indirect_thunk >> >> Indirect call via memory, "call mem", is converted to >> >> jmp L2 >> L1: >> push [mem] >> jmp __x86_indirect_thunk >> L2: >> call L1 >> >> Register thunk where the function address is in a register, reg: >> >> __x86_indirect_thunk_reg: >> call L2 >> L1: >> lfence >> jmp L1 >> L2: >> movq %reg, (%rsp)|movl %reg, (%esp) >> ret >> >> where reg is one of (r|e)ax, (r|e)dx, (r|e)cx, (r|e)bx, (r|e)si, (r|e)di, >> (r|e)bp, r8, r9, r10, r11, r12, r13, r14 and r15. >> >> Indirect jmp via register, "jmp reg", is converted to >> >> jmp __x86_indirect_thunk_reg >> >> Indirect call via register, "call reg", is converted to >> >> call __x86_indirect_thunk_reg >> >> gcc/ >> >> * config/i386/i386-opts.h (indirect_branch): New. >> * config/i386/i386-protos.h (ix86_output_indirect_jmp): Likewise. >> * config/i386/i386.c (ix86_using_red_zone): Disallow red-zone >> with local indirect jump when converting indirect call and jump. >> (ix86_set_indirect_branch_type): New. >> (ix86_set_current_function): Call ix86_set_indirect_branch_type. >> (indirectlabelno): New. >> (indirect_thunk_needed): Likewise. >> (indirect_thunk_bnd_needed): Likewise. >> (indirect_thunks_used): Likewise. >> (indirect_thunks_bnd_used): Likewise. >> (INDIRECT_LABEL): Likewise. >> (indirect_thunk_name): Likewise. >> (output_indirect_thunk): Likewise. >> (output_indirect_thunk_function): Likewise. >> (ix86_output_indirect_branch): Likewise. >> (ix86_output_indirect_jmp): Likewise. >> (ix86_code_end): Call output_indirect_thunk_function if needed. >> (ix86_output_call_insn): Call ix86_output_indirect_branch if >> needed. >> (ix86_handle_fndecl_attribute): Handle indirect_branch. >> (ix86_attribute_table): Add indirect_branch. >> * config/i386/i386.h (machine_function): Add indirect_branch_type >> and has_local_indirect_jump. >> * config/i386/i386.md (indirect_jump): Set has_local_indirect_jump >> to true. >> (tablejump): Likewise. >> (*indirect_jump): Use ix86_output_indirect_jmp. >> (*tablejump_1): Likewise. >> (simple_return_indirect_internal): Likewise. >> * config/i386/i386.opt (mindirect-branch=): New option. >> (indirect_branch): New. >> (keep): Likewise. >> (thunk): Likewise. >> (thunk-inline): Likewise. >> (thunk-extern): Likewise. >> * doc/extend.texi: Document indirect_branch function attribute. >> * doc/invoke.texi: Document -mindirect-branch= option. > Note I'm expecting Uros to chime in. So please do not consider this > ack'd until you hear from Uros. > > At a high level is there really that much value in having thunks in the > object file? Why not put the full set of thunks into libgcc and just > allow selection between inline sequences and external thunks > (thunk-inline and thunk-external)? It's not a huge simplification, but > if there isn't a compelling reason, let's drop the in-object-file thunks.
I prefer to leave it in the object file just in case that -mindirect-branch-loop= is needed in the future. >> + >> +/* Fills in the label name that should be used for the indirect thunk. */ >> + >> +static void >> +indirect_thunk_name (char name[32], int regno, bool need_bnd_p) > Please document each argument in the function's comment. Will do. > > >> + >> +static void >> +output_indirect_thunk (bool need_bnd_p, int regno) > Needs a function comment. Will do. > > >> + >> +static void >> +output_indirect_thunk_function (bool need_bnd_p, int regno) > Needs a function comment. > Will do. > >> @@ -28119,12 +28357,182 @@ ix86_nopic_noplt_attribute_p (rtx call_op) >> return false; >> } >> >> +static void >> +ix86_output_indirect_branch (rtx call_op, const char *xasm, >> + bool sibcall_p) > Needs a function comment. > Will do. > I'd probably break this into a few smaller functions. It's a lot of > inlined code. > That function has 142 lines. Unless there is a compelling need, I prefer to leave it ASIS. > > > > > >> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md >> index 3f587806407..a7573c468ae 100644 >> --- a/gcc/config/i386/i386.md >> +++ b/gcc/config/i386/i386.md >> @@ -12313,12 +12313,13 @@ >> { >> if (TARGET_X32) >> operands[0] = convert_memory_address (word_mode, operands[0]); >> + cfun->machine->has_local_indirect_jump = true; > Note this is not ideal in that it's set at expansion time and thus would > not be accurate if the RTL optimizers were able to simply things enough > such that the indirect jump has a known target. > > But I wouldn't expect that to happen much in the RTL optimizers much as > the gimple optimizers are likely much better at doing that kind of > thing. So I won't object to doing things this way as long as they > gracefully handle this case. > > >> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt >> index 09aaa97c2fc..22c806206e4 100644 >> --- a/gcc/config/i386/i386.opt >> +++ b/gcc/config/i386/i386.opt >> @@ -1021,3 +1021,23 @@ indirect jump. >> mforce-indirect-call >> Target Report Var(flag_force_indirect_call) Init(0) >> Make all function calls indirect. >> + >> +mindirect-branch= >> +Target Report RejectNegative Joined Enum(indirect_branch) >> Var(ix86_indirect_branch) Init(indirect_branch_keep) >> +Convert indirect call and jump. > Convert to what? I realize there's an enum of the choices below, but > this doesn't read well. I will update. > Do you want to mention that CET and retpolines are inherently I will document it. > incompatible? Should an attempt to use them together generate a > compile-time error? > Compile-time error sounds a good idea. Thanks. -- H.J.