On 04/12/2025 09:21, Jinjie Ruan wrote:
> After switch arm64 to Generic Entry, the compiler no longer inlines

Did it inline it before this series?

> el0_svc_common() into do_el0_svc(). So inline el0_svc_common() and it
> has 1% performance uplift on perf bench basic syscall on kunpeng920
> as below.
>
> | Metric     | W/O this patch | With this patch | Change    |
> | ---------- | -------------- | --------------- | --------- |
> | Total time | 2.195 [sec]    | 2.171 [sec]     | ↓1.1%     |
> | usecs/op   | 0.219575       | 0.217192        | ↓1.1%     |
> | ops/sec    | 4,554,260      | 4,604,225       | ↑1.1%     |
>
> Signed-off-by: Jinjie Ruan <[email protected]>

I think this is sensible - do_el0_svc() is clearly hot and the small
increase in code size is completely justified. It also removes a
performance regression when enabling CONFIG_COMPAT (without it
el0_svc_common() has only one caller so it should be inlined regardless).

Reviewed-by: Kevin Brodsky <[email protected]>

> ---
>  arch/arm64/kernel/syscall.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 47e193a1cfff..5aa51da9ec25 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -66,8 +66,8 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>  	choose_random_kstack_offset(get_random_u16());
>  }
>
> -static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> -			   const syscall_fn_t syscall_table[])
> +static __always_inline void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> +			   const syscall_fn_t syscall_table[])
>  {
>  	unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
>  	unsigned long flags = read_thread_flags();

