arm64 gained ftrace direct calls in commit 2aa6ac03516d ("arm64:
ftrace: Add direct call support") on top of
DYNAMIC_FTRACE_WITH_CALL_OPS, using the per-callsite ops pointer as a
fast path to reach the direct trampoline. Since commit baaf553d3bc3
("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS"), CALL_OPS is
mutually exclusive with CFI: the pre-function NOPs would change the
offset of the pre-function kCFI type hash, and the compiler support
needed to keep that offset consistent does not exist yet.

The result is that a CONFIG_CFI=y kernel loses CALL_OPS, and with it
DIRECT_CALLS, and with it every BPF trampoline attachment to kernel
functions: register_fentry() returns -ENOTSUPP, so fentry/fexit,
fmod_ret and BPF LSM programs cannot attach at all. This is a real
problem for hardened arm64 deployments that rely on BPF LSM for
security monitoring while keeping kCFI enabled.

CALL_OPS is an optimization for direct calls, not a dependency. When
the direct trampoline is within BL range, the callsite branches
straight to it and ftrace_caller is not involved. When it is out of
range, ftrace_find_callable_addr() already falls back to
ftrace_caller, and the DIRECT_CALLS machinery there
(FREGS_DIRECT_TRAMP, ftrace_caller_direct_late) is gated on
DIRECT_CALLS alone, not CALL_OPS: the ops dispatch invokes
call_direct_funcs(), which stores the trampoline address in
ftrace_regs, and ftrace_caller tail-calls it. s390 and loongarch use
this same mechanism for HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS without
having CALL_OPS at all, and DYNAMIC_FTRACE_WITH_ARGS without CALL_OPS
is already a supported arm64 configuration (GCC builds with
CC_OPTIMIZE_FOR_SIZE do not satisfy the CALL_OPS select condition).

Drop the CALL_OPS requirement from the
HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS select. Configurations that
keep CALL_OPS (!CFI clang builds, and GCC builds without
CC_OPTIMIZE_FOR_SIZE) are unchanged. CALL_OPS-less configurations
take the ftrace_caller ops-dispatch path for out-of-range direct
calls, trading the per-callsite fast path for working BPF
trampolines; in-range attachments still branch directly with no
overhead. GCC -Os builds also gain DIRECT_CALLS as a side effect.
That is intended: s390 and loongarch already ship DIRECT_CALLS
without any per-callsite fast path.

Assisted-by: Claude:unspecified
Signed-off-by: Jose Fernandez (Anthropic) <[email protected]>
---
 arch/arm64/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943b..2cd7d536671c9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -188,7 +188,7 @@ config ARM64
                if (GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS || \
                    CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS)
        select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
-               if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
+               if DYNAMIC_FTRACE_WITH_ARGS
        select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
                if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
                    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))

-- 
2.52.0


Reply via email to