Re: __predict_false for pledge

2015-10-29 Thread Michael McConville
Ted Unangst wrote:
> Michael McConville wrote:
> > Ted Unangst wrote:
> > > Michael McConville wrote:
> > > > We have a pretty strong guarantee that it can only happen once
> > > > per process...
> > > 
> > > I don't think this really matters. What does it do to the
> > > assmembly, and how does that make things faster?
> > 
> > It lets the compiler know that the body is very unlikely to run so
> > that it won't unroll loops, and will maybe bump the condition body
> > to the end of the procedure, etc. It can also be used to annotate
> > the branch with a hint instruction, but I don't know how many
> > architectures still use those.
> 
> I meant in this case specifically. What is the *demonstrated* benefit?
> 
> Generally, not many fans of the annotation for the sake of annontation
> in these parts. :)

So, I broke out objdump and looked at how these differ on amd64. No
profiling yet (I've never profiled syscalls before, advice welcome).

The condition evaluation is identical. The differences:

 * the annotated version jumps on error to the end of the syscall
   procedure, while the current version has the error body directly
   after the condition (in the middle of the procedure's hot path)

 * the annotated version is slightly smaller, assumedly because the
   compiler didn't bother doing expansion-based optimizations in the
   error body.

The current version prevents speculative execution from being useful and
puts unused fluff in the I-cache.

Not super important, but these annotations are easy and pretty
consistently beneficial.



Re: __predict_false for pledge

2015-10-26 Thread Philip Guenther
On Mon, Oct 26, 2015 at 8:46 AM, Michael McConville  wrote:
> We have a pretty strong guarantee that it can only happen once per
> process...
...
> --- sys/sys/syscall_mi.h9 Oct 2015 01:17:18 -   1.11
> +++ sys/sys/syscall_mi.h26 Oct 2015 15:13:44 -
> @@ -72,7 +72,8 @@ mi_syscall(struct proc *p, register_t co
> if (lock)
> KERNEL_LOCK();
> pledged = (p->p_p->ps_flags & PS_PLEDGE);
> -   if (pledged && !(tval = pledge_check(p, code))) {
> +   if (__predict_false(
> +   pledged && !(tval = pledge_check(p, code {

I disagree.  That's the code used on every syscall, not just once per
process and pledged is true for *most* of the processes on a -current
box.  No, that doesn't mean we should do __predict_true() there.

In general, __predict_{true,false} should be left in the tool box and
only pulled out after detailed dives into code paths involved.  For
all my banging on the project, I think I've used them in *two* places.


Philip Guenther



Re: __predict_false for pledge

2015-10-26 Thread Michael McConville
Ted Unangst wrote:
> Michael McConville wrote:
> > We have a pretty strong guarantee that it can only happen once per
> > process...
> 
> I don't think this really matters. What does it do to the assmembly,
> and how does that make things faster?

It lets the compiler know that the body is very unlikely to run so that
it won't unroll loops, and will maybe bump the condition body to the end
of the procedure, etc. It can also be used to annotate the branch with a
hint instruction, but I don't know how many architectures still use
those.

Not sure how people feel about these annotations. This is a pretty
classic use case, though.



Re: __predict_false for pledge

2015-10-26 Thread Theo de Raadt
> On Mon, Oct 26, 2015 at 8:46 AM, Michael McConville  wrote:
> > We have a pretty strong guarantee that it can only happen once per
> > process...
> ...
> > --- sys/sys/syscall_mi.h9 Oct 2015 01:17:18 -   1.11
> > +++ sys/sys/syscall_mi.h26 Oct 2015 15:13:44 -
> > @@ -72,7 +72,8 @@ mi_syscall(struct proc *p, register_t co
> > if (lock)
> > KERNEL_LOCK();
> > pledged = (p->p_p->ps_flags & PS_PLEDGE);
> > -   if (pledged && !(tval = pledge_check(p, code))) {
> > +   if (__predict_false(
> > +   pledged && !(tval = pledge_check(p, code {
> 
> I disagree.  That's the code used on every syscall, not just once per
> process and pledged is true for *most* of the processes on a -current
> box.  No, that doesn't mean we should do __predict_true() there.

I have no idea what it will do on all our platforms.

> In general, __predict_{true,false} should be left in the tool box and
> only pulled out after detailed dives into code paths involved.  For
> all my banging on the project, I think I've used them in *two* places.

I suspect there is more benefit to be gained through actual tested
refactoring (with assumption: systrace is almost never used).



Re: __predict_false for pledge

2015-10-26 Thread Michael McConville
Philip Guenther wrote:
> On Mon, Oct 26, 2015 at 8:46 AM, Michael McConville  wrote:
> > We have a pretty strong guarantee that it can only happen once per
> > process...
> ...
> > --- sys/sys/syscall_mi.h9 Oct 2015 01:17:18 -   1.11
> > +++ sys/sys/syscall_mi.h26 Oct 2015 15:13:44 -
> > @@ -72,7 +72,8 @@ mi_syscall(struct proc *p, register_t co
> > if (lock)
> > KERNEL_LOCK();
> > pledged = (p->p_p->ps_flags & PS_PLEDGE);
> > -   if (pledged && !(tval = pledge_check(p, code))) {
> > +   if (__predict_false(
> > +   pledged && !(tval = pledge_check(p, code {
> 
> I disagree.  That's the code used on every syscall, not just once per
> process and pledged is true for *most* of the processes on a -current
> box.  No, that doesn't mean we should do __predict_true() there.

That's what I mean - it's used on every syscall and it can never be true
more than once per process. (That's a pledge failure, and the program
will be terminated.)

> In general, __predict_{true,false} should be left in the tool box and
> only pulled out after detailed dives into code paths involved.  For
> all my banging on the project, I think I've used them in *two* places.

Passing thought: it'd be nice if the names were shorter. I feel like
much of the reason they're seen as cumbersome is that they're so
verbose. Linux uses likely() and unlikely(). __rare() could also work.

FWIW, below is what /usr/include/sys/cdefs.h says about them. Maybe
outdated.


/*
 * GNU C version 2.96 adds explicit branch prediction so that
 * the CPU back-end can hint the processor and also so that
 * code blocks can be reordered such that the predicted path
 * sees a more linear flow, thus improving cache behavior, etc.
 *
 * The following two macros provide us with a way to utilize this
 * compiler feature.  Use __predict_true() if you expect the expression
 * to evaluate to true, and __predict_false() if you expect the
 * expression to evaluate to false.
 *
 * A few notes about usage:
 *
 *  * Generally, __predict_false() error condition checks (unless
 *you have some _strong_ reason to do otherwise, in which case
 *document it), and/or __predict_true() `no-error' condition
 *checks, assuming you want to optimize for the no-error case.
 *
 *  * Other than that, if you don't know the likelihood of a test
 *succeeding from empirical or other `hard' evidence, don't
 *make predictions.
 *
 *  * These are meant to be used in places that are run `a lot'.
 *It is wasteful to make predictions in code that is run
 *seldomly (e.g. at subsystem initialization time) as the
 *basic block reordering that this affects can often generate
 *larger code.
 */



Re: __predict_false for pledge

2015-10-26 Thread Theo de Raadt
> Not sure how people feel about these annotations. This is a pretty
> classic use case, though.

No, the classic case is when the condition is a single variable, rather
than a condition "always true && rarely true".



Re: __predict_false for pledge

2015-10-26 Thread Ted Unangst
Michael McConville wrote:
> Ted Unangst wrote:
> > Michael McConville wrote:
> > > We have a pretty strong guarantee that it can only happen once per
> > > process...
> > 
> > I don't think this really matters. What does it do to the assmembly,
> > and how does that make things faster?
> 
> It lets the compiler know that the body is very unlikely to run so that
> it won't unroll loops, and will maybe bump the condition body to the end
> of the procedure, etc. It can also be used to annotate the branch with a
> hint instruction, but I don't know how many architectures still use
> those.

I meant in this case specifically. What is the *demonstrated* benefit?

Generally, not many fans of the annotation for the sake of annontation in
these parts. :)