On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu <[email protected]> wrote:
>
> On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote:
> > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <[email protected]> wrote:
> > >
> > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:
> > > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <[email protected]> wrote:
> > > > >
> > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <[email protected]>
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <[email protected]> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <[email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <[email protected]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Since nested function isn't only called directly, there is
> > > > > > > > > ENDBR32 at
> > > > > > > > > function entry and we need to skip it for direct jump in
> > > > > > > > > trampoline.
> > > > > > > >
> > > > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps
> > > > > > > > rephrase it?
> > > > > > > >
> > > > > > >
> > > > > > > ix86_trampoline_init has
> > > > > > >
> > > > > > > /* Compute offset from the end of the jmp to the target
> > > > > > > function.
> > > > > > > In the case in which the trampoline stores the static
> > > > > > > chain on
> > > > > > > the stack, we need to skip the first insn which pushes
> > > > > > > the
> > > > > > > (call-saved) register static chain; this push is 1 byte.
> > > > > > > */
> > > > > > > offset += 5;
> > > > > > > disp = expand_binop (SImode, sub_optab, fnaddr,
> > > > > > > plus_constant (Pmode, XEXP (m_tramp,
> > > > > > > 0),
> > > > > > > offset - (MEM_P (chain)
> > > > > > > ? 1 : 0)),
> > > > > > > NULL_RTX, 1, OPTAB_DIRECT);
> > > > > > > emit_move_insn (mem, disp);
> > > > > > >
> > > > > > > Without CET, we got
> > > > > > >
> > > > > > > 0000011 <bar.1878>:
> > > > > > > 11: 56 push %esi
> > > > > > > 12: 55 push %ebp <<<<<< trampoline jumps
> > > > > > > here.
> > > > > > > 13: 89 e5 mov %esp,%ebp
> > > > > > > 15: 83 ec 08 sub $0x8,%esp
> > > > > > >
> > > > > > > With CET, if bar isn't only called directly, we got
> > > > > > >
> > > > > > > 00000015 <bar.1878>:
> > > > > > > 15: f3 0f 1e fb endbr32
> > > > > > > 19: 56 push %esi
> > > > > > > 1a: 55 push %ebp <<<<<<<< trampoline
> > > > > > > jumps here.
> > > > > > > 1b: 89 e5 mov %esp,%ebp
> > > > > > > 1d: 83 ec 08 sub $0x8,%esp
> > > > > > >
> > > > > > > We need to add 4 bytes for trampoline to skip endbr32.
> > > > > > >
> > > > > > > Here is the updated patch to check if nested function isn't only
> > > > > > > called directly,
> > > > > >
> > > > > > Please figure out the final patch. I don't want to waste my time
> > > > > > reviewing different version every half hour. Ping me in a couple of
> > > > > > days.
> > > > >
> > > > > This is the final version:
> > > > >
> > > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html
> > > > >
> > > > > You can try the testcase in the patch on any machine with CET binutils
> > > > > since ENDBR32 is nop on none-CET machines. Without this patch,
> > > > > the test will fail.
> > > >
> > > > Please rephrase the comment. I don't understand what it tries to say.
> > > >
> > >
> > > Here is the patch with updated comments. OK for master and 8/9 branches?
> > >
> > > Thanks.
> > >
> > > H.J.
> > > ---
> > > Since pass_insert_endbranch inserts ENDBR32 at entry of the target
> > > function if it may be called indirectly, we also need to skip the
> > > 4-byte ENDBR32 for direct jump in trampoline if it is the case.
> >
> > "
> > Skip ENDBR32 at the entry if called indirectly.
> > "
> > pass_insert_endbranch inserts ENDBR at the entry of the target
> > function. We need to skip
> > >
> > > Tested on Linux/x86-64 CET machine with and without -m32.
> > >
> > > gcc/
> > >
> > > PR target/93656
> > > * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
> > > the target function entry.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/93656
> > > * gcc.target/i386/pr93656.c: New test.
> > > ---
> > > gcc/config/i386/i386.c | 9 ++++++++-
> > > gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
> > > 2 files changed, 12 insertions(+), 1 deletion(-)
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c
> > >
> > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > > index 44bc0e0176a..52640b74cc8 100644
> > > --- a/gcc/config/i386/i386.c
> > > +++ b/gcc/config/i386/i386.c
> > > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl,
> > > rtx chain_value)
> > > the stack, we need to skip the first insn which pushes the
> > > (call-saved) register static chain; this push is 1 byte. */
> > > offset += 5;
> > > + int skip = MEM_P (chain) ? 1 : 0;
> >
> > offset += 5 - MEM_P (chain) ? 1: 0;
>
> This is done on purpose sinc there is
>
> gcc_assert (offset <= TRAMPOLINE_SIZE);
>
> after it. We shouldn't update offset.
>
> >
> > > + /* Since pass_insert_endbranch inserts ENDBR32 at entry of the
> > > + target function if it may be called indirectly, we also need
> > > + to skip the 4-byte ENDBR32 if it is the case. */
> >
> > /* Skip ENDBR32 at the entry of the target function when called indirectly.
> > */
>
> Fixed.
>
> >
> > > + if (need_endbr
> > > + && !cgraph_node::get (fndecl)->only_called_directly_p ())
> > > + skip += 4;
> >
> > offset += 4;
> >
> > > disp = expand_binop (SImode, sub_optab, fnaddr,
> > > plus_constant (Pmode, XEXP (m_tramp, 0),
> > > - offset - (MEM_P (chain) ? 1 :
> > > 0)),
> > > + offset - skip),
> >
> > plus_constant (Pmode, XEXP (m_tramp, 0), offset),
> >
> > Uros.
> >
>
> Here is the updated patch. OK for master and 8/9 branches?
>
> Thanks.
>
>
> H.J.
> --
> Skip ENDBR32 at the entry of the target function when called indirectly.
Hm, this was sent too early, I was trying to say:
"Skip ENDBR32 at the entry of target function when initializing trampoline."
>
> Tested on Linux/x86-64 CET machine with and without -m32.
>
> gcc/
>
> PR target/93656
> * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
> the target function entry.
>
> gcc/testsuite/
>
> PR target/93656
> * gcc.target/i386/pr93656.c: New test.
LGTM for mainline, please wait a week before backporting.
Uros.
> ---
> gcc/config/i386/i386.c | 8 +++++++-
> gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
> 2 files changed, 11 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 44bc0e0176a..3f3dcd53eb2 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx
> chain_value)
> the stack, we need to skip the first insn which pushes the
> (call-saved) register static chain; this push is 1 byte. */
> offset += 5;
> + int skip = MEM_P (chain) ? 1 : 0;
> + /* Skip ENDBR32 at the entry of the target function when called
> + indirectly. */
Just say "Skip ENDBR32 at the entry of the target function."
The condition below is clear, the purpose is not.
> + if (need_endbr
> + && !cgraph_node::get (fndecl)->only_called_directly_p ())
> + skip += 4;
> disp = expand_binop (SImode, sub_optab, fnaddr,
> plus_constant (Pmode, XEXP (m_tramp, 0),
> - offset - (MEM_P (chain) ? 1 : 0)),
> + offset - skip),
> NULL_RTX, 1, OPTAB_DIRECT);
> emit_move_insn (mem, disp);
> }
> diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c
> b/gcc/testsuite/gcc.target/i386/pr93656.c
> new file mode 100644
> index 00000000000..f0ac8c8edaa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr93656.c
> @@ -0,0 +1,4 @@
> +/* { dg-do run { target { ia32 && cet } } } */
> +/* { dg-options "-O2 -fcf-protection" } */
> +
> +#include "pr67770.c"
> --
> 2.24.1
>