Re: [PATCH] inline: do not inline when no_profile_instrument_function is different

2021-06-25 Thread Nick Desaulniers via Gcc-patches
On Wed, Jun 23, 2021 at 6:15 AM Martin Liška  wrote:
>
> On 6/23/21 2:38 PM, Jan Hubicka wrote:
> > Is there reason to prevent the inlining once instrumentation is done?
>
> No ;)

Here's another case that coincidentally came up yesterday. How should
these attributes behave in the case of __attribute__((flatten)) on the
caller?

Another case which is curious is what happens when the callee has
__attribute__((always_inline)) and there's a conflict?

In LLVM, the current behavior is:
"__attribute__((no_stack_protector)) and
__attribute__((no_profile_instrument_function)) will prevent inline
substitution when the attributes do not match between caller and
callee, unless the callee was attributed with
__attribute__((always_inline)) or the caller was attributed with
__attribute__((flatten))."

We have some overzealous users of always_inline and flatten in
Android. They are currently playing whack-a-mole with
no_stack_protector.  I'm not sure yet how we can better help them self
diagnose, or whether we should consider a change in policy.

I'm also not sure whether GCC's einliner corresponds with
always_inline or not necessarily?
--
Thanks,
~Nick Desaulniers


Re: [PATCH] Implement no_stack_protect attribute.

2020-10-21 Thread Nick Desaulniers via Gcc-patches
+ correct kernel mailing list this time.

On Wed, Oct 21, 2020 at 2:33 PM Nick Desaulniers
 wrote:
>
> Thanks for the quick feedback!
>
> On Wed, Oct 21, 2020 at 2:13 PM Jakub Jelinek  wrote:
> >
> > On Wed, Oct 21, 2020 at 02:04:15PM -0700, Nick Desaulniers via Gcc-patches 
> > wrote:
> > > Tangentially related question:
> > > We're running into a bug related to LTO for the kernel when code
> > > compiled with -fno-stack-protector is called from and inlined into
> > > code that is compiled with -fstack-protector.  Specifically, stack
> > > canaries get checked before they're restored post suspend/resume which
> > > leads to spooky bugs.
> > >
> > > Once we have more fine grain function level attribute to explicitly
> > > disable stack protectors on a per function basis, I'm considering
> > > making this function attribute a barrier to inlining in LLVM so that
> > > callers with stack protectors don't inline callees that explicitly
> > > should not have a stack protector and vice versa (more concretely,
> > > when they don't match).  I think this would maximize which functions
> > > are still covered by stack protectors, and be the most straightforward
> > > to implement.
> >
> > That doesn't make sense to me.
> > Stack protector doesn't affect in any way inlined code, the stack protection
> > is always solely in the prologue and epilogue of out of line functions.
> > So, if the (non-inlined) caller is -fstack-protector and inlined callee
> > is -fno-stack-protector, there should be ssp store in the prologue of the
> > resulting function and test in the epilogue.
>
> That is the case today, and I'm arguing that leads to bugs in the
> Linux kernel when built with LTO.
>
> > The effect will be exactly
> > like that if the function wouldn't be inlined.
>
> I don't follow.  If the -fno-stack-protector callee was not inlined,
> the caller would have a stack protector, while the callee would not.
> I think today there's not a strong enough distinction between the
> level of stack protection being specified vs explicit
> annotations/flags that stack protectors MUST NOT be inserted.
>
> > Similarly, if the non-inlined caller is -fno-stack-protector and inlined
> > callee is -fstack-protector, there will be no stack protection.  This isn't
>
> And I'd argue that now we may have stripped off stack protection in
> the pursuit of inlining.  Consider for example the case where that
> stack protected callee contained a large alloca; post inlining into a
> -fno-stack-protected caller suddenly now it doesn't.  Oops?
>
> Wouldn't it be safer to just prevent inlining, then the callee retains
> the stack protector, regardless of caller stack protection?
>
> > exactly the effect one would get without the inlining (as in that case
> > the callee would check it), but matches the general behavior that we allow
> > inlining functions with -fstack-protector* at all (and only check it in the
> > prologue/epilogue, not somewhere in the middle).
> >
> > Jakub
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] Implement no_stack_protect attribute.

2020-10-21 Thread Nick Desaulniers via Gcc-patches
Thanks for the quick feedback!

On Wed, Oct 21, 2020 at 2:13 PM Jakub Jelinek  wrote:
>
> On Wed, Oct 21, 2020 at 02:04:15PM -0700, Nick Desaulniers via Gcc-patches 
> wrote:
> > Tangentially related question:
> > We're running into a bug related to LTO for the kernel when code
> > compiled with -fno-stack-protector is called from and inlined into
> > code that is compiled with -fstack-protector.  Specifically, stack
> > canaries get checked before they're restored post suspend/resume which
> > leads to spooky bugs.
> >
> > Once we have more fine grain function level attribute to explicitly
> > disable stack protectors on a per function basis, I'm considering
> > making this function attribute a barrier to inlining in LLVM so that
> > callers with stack protectors don't inline callees that explicitly
> > should not have a stack protector and vice versa (more concretely,
> > when they don't match).  I think this would maximize which functions
> > are still covered by stack protectors, and be the most straightforward
> > to implement.
>
> That doesn't make sense to me.
> Stack protector doesn't affect in any way inlined code, the stack protection
> is always solely in the prologue and epilogue of out of line functions.
> So, if the (non-inlined) caller is -fstack-protector and inlined callee
> is -fno-stack-protector, there should be ssp store in the prologue of the
> resulting function and test in the epilogue.

That is the case today, and I'm arguing that leads to bugs in the
Linux kernel when built with LTO.

> The effect will be exactly
> like that if the function wouldn't be inlined.

I don't follow.  If the -fno-stack-protector callee was not inlined,
the caller would have a stack protector, while the callee would not.
I think today there's not a strong enough distinction between the
level of stack protection being specified vs explicit
annotations/flags that stack protectors MUST NOT be inserted.

> Similarly, if the non-inlined caller is -fno-stack-protector and inlined
> callee is -fstack-protector, there will be no stack protection.  This isn't

And I'd argue that now we may have stripped off stack protection in
the pursuit of inlining.  Consider for example the case where that
stack protected callee contained a large alloca; post inlining into a
-fno-stack-protected caller suddenly now it doesn't.  Oops?

Wouldn't it be safer to just prevent inlining, then the callee retains
the stack protector, regardless of caller stack protection?

> exactly the effect one would get without the inlining (as in that case
> the callee would check it), but matches the general behavior that we allow
> inlining functions with -fstack-protector* at all (and only check it in the
> prologue/epilogue, not somewhere in the middle).
>
> Jakub
>


-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] Implement no_stack_protect attribute.

2020-10-21 Thread Nick Desaulniers via Gcc-patches
On Tue, Oct 20, 2020 at 5:19 AM Richard Biener
 wrote:
>
> On Tue, Oct 20, 2020 at 1:24 PM Martin Liška  wrote:
> >
> > PING^5
>
> So can we use the same identifier as clang here as Nick
> requests?  Thus, OK with re-naming everything alongside
> no_stack_protector.  It isn't really the opposite of the
> stack_protect attribute since that only protects when
> -fstack-protector-explicit is enabled.

I'll be happy to help test and review with an updated/rebased patch.

Tangentially related question:
We're running into a bug related to LTO for the kernel when code
compiled with -fno-stack-protector is called from and inlined into
code that is compiled with -fstack-protector.  Specifically, stack
canaries get checked before they're restored post suspend/resume which
leads to spooky bugs.

Once we have more fine grain function level attribute to explicitly
disable stack protectors on a per function basis, I'm considering
making this function attribute a barrier to inlining in LLVM so that
callers with stack protectors don't inline callees that explicitly
should not have a stack protector and vice versa (more concretely,
when they don't match).  I think this would maximize which functions
are still covered by stack protectors, and be the most straightforward
to implement.

The next question then is what happens when the callee is marked
__attribute__((always_inline))?  My answer for LLVM currently is
"still disallow inline substitution" which is more surprising than I'd
like, but we already have precedent for "always inline" not meaning
"always."  Warning from the frontend when mixing no_stack_protector
and always_inline is possible if the callers are visible and don't
match, but I don't think that works for cross translation unit calls.

I guess I was curious if others have ideas for solutions to this
particular problem?  Otherwise I plan to implement the above logic in
LLVM.  We'd eventually need matching logic in GCC to support LTO
kernels not having the same bug.

https://reviews.llvm.org/D87956
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] Implement no_stack_protect attribute.

2020-08-25 Thread Nick Desaulniers via Gcc-patches
This would solve a common pattern in the kernel where folks are using
`extern inline` with `gnu_inline` semantics or worse (empty `asm("");`
statements) in certain places where it would be much more preferable
to have this attribute.  Thank you very much Martin for writing it.

> is direct equivalent of Clang's no_stack_protector.
> Unlike Clang, I chose to name it no_stack_protect because we already
> have stack_protect attribute (used with -fstack-protector-explicit).

That's pretty easy for us to work around the differences in the
kernel, but one final plea for the users; it would simplify users'
codebases not to have to shim this for differences between compilers.
If I had a dollar for every time I had to implement something in LLVM
where a different identifier or flag would be more consistent with
another part of the codebase...I'm sure there's many examples of this
on LLVM's side too, but I would prefer to stop the proliferation of
subtle differences like this that harm toolchain portability when
possible and when we can proactively address.
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

2019-09-06 Thread Nick Desaulniers via gcc-patches
On Fri, Sep 6, 2019 at 5:14 PM Segher Boessenkool
 wrote:
>
> On Fri, Sep 06, 2019 at 04:42:58PM -0700, Nick Desaulniers via gcc-patches 
> wrote:
> > Just to prove my point about version checks being brittle, it looks
> > like Rasmus' version check isn't even right.  GCC supported `asm
> > inline` back in the 8.3 release, not 9.1 as in this patch:
>
> Yes, I backported it so that it is available in 7.5, 8.3, and 9.1, so
> that more users will have this available sooner.  (7.5 has not been
> released yet, but asm inline has been supported in GCC 7 since Jan 2
> this year).

Ah, ok that makes sense.

How would you even write a version check for that?

Which looks better?

#if __GNU_MAJOR__ > 9 || __GNU_MAJOR__ == 8 && __GNU_MINOR__ >= 3 ||
__GNU_MAJOR__ == 7 && __GNU_MINOR__ >= 5 || __CLANG_MAJOR__ == 42
// make use of `asm inline`
#endif

or

#ifdef __CC_HAS_ASM_INLINE__
// make use of `asm inline`
#endif

>
> > Or was it "broken" until 9.1?  Lord knows, as `asm inline` wasn't in
> > any release notes or bug reports I can find:
>
> https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01143.html
>
> It never was accepted, and I dropped the ball.

Ah, ok, that's fine, so documentation was at least written.  Tracking
when and where patches land (or don't) is difficult when patch files
are emailed around.  I try to keep track of when and where our kernel
patches land, but I frequently drop the ball there.

> > Segher, next time there's discussion about new C extensions for the
> > kernel, can you please include me in the discussions?
>
> You can lurk on gcc-patches@ and/or gcc@?

Please "interrupt" me when you're aware of such discussions, rather
than me "polling" a mailing list.  (I will buy you a tasty beverage of
your preference).  I'm already subscribed to more mailing lists than I
have time to read.

> But I'll try to remember, sure.
> Not that I am involved in all such discussions myself, mind.

But you _did_ implement `asm inline`. ;)
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

2019-09-06 Thread Nick Desaulniers via gcc-patches
On Fri, Sep 6, 2019 at 3:03 PM Segher Boessenkool
 wrote:
>
> On Fri, Sep 06, 2019 at 11:14:08AM -0700, Nick Desaulniers wrote:
> > Here's the case that I think is perfect:
> > https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
> >
> > Specifically the feature test preprocessor define __GCC_ASM_FLAG_OUTPUTS__.
> >
> > See exactly how we handle it in the kernel:
> > - 
> > https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/asm.h#L112-L118
> > - 
> > https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/rmwcc.h#L14-L30
> >
> > Feature detection of the feature makes it trivial to detect when the
> > feature is supported, rather than brittle compiler version checks.
> > Had it been a GCC version check, it wouldn't work for clang out of the
> > box when clang added support for __GCC_ASM_FLAG_OUTPUTS__.  But since
> > we had the helpful __GCC_ASM_FLAG_OUTPUTS__, and wisely based our use
> > of the feature on that preprocessor define, the code ***just worked***
> > for compilers that didn't support the feature ***and*** compilers when
> > they did support the feature ***without changing any of the source
> > code*** being compiled.
>
> And if instead you tested whether the actual feature you need works as
> you need it to, it would even work fine if there was a bug we fixed that
> breaks things for the kernel.  Without needing a new compiler.

That assumes a feature is broken out of the gate and is putting the
cart before the horse.  If a feature is available, it should work.  If
you later find it to be unsatisfactory, sure go out of your way to add
ugly compiler-specific version checks or upgrade your minimally
supported toolchain; until then feature detection is much cleaner (see
again __GCC_ASM_FLAG_OUTPUTS__).

>
> Or as another example, if we added support for some other flags. (x86
> has only a few flags; many other archs have many more, and in some cases
> newer hardware actually has more flags than older).

I think compiler flags are orthogonal to GNU C extensions we're discussing here.

>
> With the "macro" scheme we would need to add new macros in all these
> cases.  And since those are target-specific macros, that quickly expands
> beyond reasonable bounds.

I don't think so.  Can you show me an example codebase that proves me wrong?

>
> If you want to know if you can do X in some environment, just try to do X.

That's a very autoconf centric viewpoint.  Why doesn't the kernel take
that approach for __GCC_ASM_FLAG_OUTPUTS__?  Why not repeatedly invoke
$CC to find if every compiler __attribute__ is supported?  Do you
think it's faster for the C preprocessor to check for a few #ifdefs,
or to repeatedly invoke $CC at build or compile time to detect new
features?
-- 
Thanks,
~Nick Desaulniers