With sane audit rules, audit logging is triggered only infrequently.
With this in mind, annotate the audit_in_mask() check as unlikely(),
allowing the compiler to pessimize the (rarely taken) call to
audit_filter_rules().

This allows GCC to invert the branch direction for the audit_filter_rules()
basic block in this loop:

        list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
                if (audit_in_mask(&e->rule, major) &&
                    audit_filter_rules(tsk, &e->rule, ctx, NULL,
                                       &state, false)) {
                        ...

such that the common case executes in straight-line fashion.

On a Skylakex system, the change in getpid() latency (all results
aggregated across 12 boot cycles):

          Min     Mean   Median     Max    pstdev
         (ns)     (ns)     (ns)    (ns)

 -     196.63   207.86   206.60  230.98    (+- 3.92%)
 +     173.11   182.51   179.65  202.09    (+- 4.34%)

Performance counter stats for 'bin/getpid' (3 runs) go from:
    cycles               805.58  (  +-  4.11% )
    instructions        1654.11  (  +-   .05% )
    IPC                    2.06  (  +-  3.39% )
    branches             430.02  (  +-   .05% )
    branch-misses          1.55  (  +-  7.09% )
    L1-dcache-loads      440.01  (  +-   .09% )
    L1-dcache-load-misses  9.05  (  +- 74.03% )

 to:
    cycles               706.13  (  +-  4.13% )
    instructions        1654.70  (  +-   .06% )
    IPC                    2.35  (  +-  4.25% )
    branches             430.99  (  +-   .06% )
    branch-misses          0.50  (  +-  2.00% )
    L1-dcache-loads      440.02  (  +-   .07% )
    L1-dcache-load-misses  5.22  (  +- 82.75% )

(Both aggregated over 12 boot cycles.)

cycles: performance improves by ~100 cycles/call on average, and IPC
improves commensurately. Two reasons for this improvement:

  * one fewer branch misprediction: there is no obvious reason for
    this branch-miss reduction, as there is no significant change in
    basic-block structure (apart from the branch inversion.)

  * the direction of the branch for the call is now inverted, so the
    common case takes the not-taken direction. The issue latency for
    not-taken branches is typically lower.

Signed-off-by: Ankur Arora <ankur.a.ar...@oracle.com>
---
 kernel/auditsc.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 533b087c3c02..bf26f47b5226 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -789,7 +789,7 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
        return AUDIT_STATE_BUILD;
 }
 
-static int audit_in_mask(const struct audit_krule *rule, unsigned long val)
+static bool audit_in_mask(const struct audit_krule *rule, unsigned long val)
 {
        int word, bit;
 
@@ -850,12 +850,13 @@ static void audit_filter_syscall(struct task_struct *tsk,
 
        rcu_read_lock();
        list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
-               if (audit_in_mask(&e->rule, major) &&
-                   audit_filter_rules(tsk, &e->rule, ctx, NULL,
-                                      &state, false)) {
-                       rcu_read_unlock();
-                       ctx->current_state = state;
-                       return;
+               if (unlikely(audit_in_mask(&e->rule, major))) {
+                       if (audit_filter_rules(tsk, &e->rule, ctx, NULL,
+                                              &state, false)) {
+                               rcu_read_unlock();
+                               ctx->current_state = state;
+                               return;
+                       }
                }
        }
        rcu_read_unlock();
-- 
2.31.1

--
Linux-audit mailing list
Linux-audit@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-audit
