> On 14 Dec 2023, at 00:45, Paul Moore <[email protected]> wrote:
>
> On Tue, Dec 12, 2023 at 5:29 AM Håkon Bugge <[email protected]> wrote:
>>
>> For the most time-consuming function, when running a syscall benchmark
>> with STIG compliant audit rules:
>>
>> Overhead Command Shared Object Symbol
>> ......... ............ ................. ........................
>>
>> 27.62% syscall_lat [kernel.kallsyms] [k] __audit_filter_op
>>
>> we apply codegen optimizations, which speed up syscall
>> performance by around 17% on an Intel Cascade Lake system.
>>
>> We run "perf stat -d -r 5 ./syscall_lat", where syscall_lat is a C
>> application that measures average syscall latency from getpid()
>> running 100 million rounds.
>>
>> Between each perf run, we reboot the system and wait until the
>> one-minute load average is below 1.0.
>>
>> We boot the kernel, v6.6-rc4, with "mitigations=off", in order to
>> amplify the changes in the audit system.
>>
>> Let the base kernel be v6.6-rc4, booted with "audit=1" and
>> "mitigations=off" and with the commit "audit: Vary struct audit_entry
>> alignment" applied, on an Intel Cascade Lake system. The following
>> three metrics are reported: nanoseconds per syscall, L1D misses per
>> syscall, and Instructions Per Cycle (ipc).
>>
>> Base vs. base + this commit gives:
>>
>> ns per call:
>> min avg max pstdev
>> - 203 203 209 0.954149
>> + 173 173 178 0.884534
>>
>> L1d misses per syscall:
>> min avg max pstdev
>> - 0.012 0.103 0.817 0.238352
>> + 0.010 0.209 1.235 0.399416
>>
>> ipc:
>> min avg max pstdev
>> - 2.320 2.329 2.330 0.003000
>> + 2.430 2.436 2.440 0.004899
>>
>> Signed-off-by: Håkon Bugge <[email protected]>
>> ---
>> kernel/auditsc.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
>> index 6f0d6fb6523fa..84d0dfe75a4ac 100644
>> --- a/kernel/auditsc.c
>> +++ b/kernel/auditsc.c
>> @@ -822,6 +822,7 @@ static int audit_in_mask(const struct audit_krule *rule,
>> unsigned long val)
>> * parameter can be NULL, but all others must be specified.
>> * Returns 1/true if the filter finds a match, 0/false if none are found.
>> */
>> +#pragma GCC optimize("unswitch-loops", "align-loops=16", "align-jumps=16")
>
> The kernel doesn't really make use of #pragma optimization statements
> like this, at least not in any of the core areas, and I'm not
> interested in being the first to do so. I appreciate the time and
> effort that you have spent profiling the audit subsystem, but this
> isn't a patch I can accept at this point in time, I'm sorry.

Fair enough. Would a function attribute, e.g.:

	__attribute__((optimize("foo=bar")))

be acceptable to you?
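
To make the question concrete, here is a minimal sketch of the attribute form, assuming the same three options as in the #pragma above. The macro name audit_hot_loop and the function count_matches() are made up for illustration and are not part of the patch. The optimize attribute is GCC-specific, so the sketch compiles it out for other compilers:

```c
#include <stddef.h>

/*
 * Hypothetical helper macro (not an existing kernel macro). The option
 * strings mirror the ones used in the #pragma in the patch.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define audit_hot_loop \
	__attribute__((optimize("unswitch-loops", "align-loops=16", "align-jumps=16")))
#else
#define audit_hot_loop	/* the optimize attribute is GCC-specific */
#endif

/* Stand-in for a rule-list walk; this is NOT the real __audit_filter_op(). */
audit_hot_loop
int count_matches(const unsigned long *masks, size_t n,
		  unsigned long val, int check_mask)
{
	int hits = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* check_mask is loop-invariant, so it is a candidate for unswitching */
		if (check_mask && !(masks[i] & val))
			continue;
		hits++;
	}
	return hits;
}
```

Unlike the #pragma pair, the attribute applies to the one function only, so there is nothing to reset afterwards.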

Thanks, Håkon

>
>> static int __audit_filter_op(struct task_struct *tsk,
>> struct audit_context *ctx,
>> struct list_head *list,
>> @@ -841,6 +842,7 @@ static int __audit_filter_op(struct task_struct *tsk,
>> }
>> return 0;
>> }
>> +#pragma GCC reset_options
>>
>> /**
>> * audit_filter_uring - apply filters to an io_uring operation
>> --
>> 2.39.3
>
> --
> paul-moore.com
>