On arm64, HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is currently selected
only when DYNAMIC_FTRACE_WITH_CALL_OPS is available. CALL_OPS, in
turn, is mutually exclusive with kCFI: the pre-function NOPs it needs
would change the offset of the pre-function type hash (see
baaf553d3bc3 ("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")),
and the compiler support needed to reconcile the two does not exist
yet.

As a result, an arm64 kernel built with CONFIG_CFI=y has no ftrace
direct calls at all, so register_fentry() fails with -ENOTSUPP and no
BPF trampoline can attach: fentry/fexit, fmod_ret and BPF LSM
programs are all unavailable. Deployments that want both kCFI
hardening and BPF-based security monitoring currently have to give
one of them up. systemd's bpf-restrict-fs feature hits this today:
https://lore.kernel.org/all/20250610232418.GA3544567@ax162/

CALL_OPS is an optimization for direct calls, not a dependency.
In-BL-range trampolines are reached by a direct branch without
consulting the ops pointer, and out-of-range trampolines already
fall back to ftrace_caller, where the DIRECT_CALLS machinery
(call_direct_funcs() storing the trampoline in ftrace_regs, the
ftrace_caller tail-call) is gated on DIRECT_CALLS alone. s390 and
loongarch ship HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS this way,
without having CALL_OPS at all.
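
To make the range split above concrete, here is a minimal userspace
sketch of the decision, not the kernel's code. It assumes only the
architectural fact that an arm64 BL encodes a signed 26-bit word
offset (+/-128 MiB); the names BL_RANGE, bl_in_range() and
pick_dispatch() are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* arm64 BL: signed 26-bit immediate, scaled by 4, so +/-128 MiB. */
#define BL_RANGE ((int64_t)128 * 1024 * 1024)

/* Hypothetical labels for the two paths described in the text. */
enum dispatch {
	DISPATCH_DIRECT_BL,      /* callsite branches straight to the trampoline */
	DISPATCH_FTRACE_CALLER,  /* callsite goes through ftrace_caller */
};

/* True if a BL placed at pc can reach target. */
static bool bl_in_range(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	/* A64 instructions are 4-byte aligned. */
	assert((pc & 3) == 0 && (target & 3) == 0);

	return offset >= -BL_RANGE && offset < BL_RANGE;
}

/* Illustrative only: choose the path by reachability alone. */
static enum dispatch pick_dispatch(uint64_t callsite, uint64_t trampoline)
{
	return bl_in_range(callsite, trampoline) ? DISPATCH_DIRECT_BL
						 : DISPATCH_FTRACE_CALLER;
}
```

Neither branch of this sketch consults a per-callsite ops pointer,
which is the sense in which CALL_OPS is an optimization rather than a
prerequisite.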

Patch 1 prepares ftrace_modify_call() to build without CALL_OPS by
widening its #ifdef and using the existing ftrace_rec_update_ops()
wrapper (no functional change for current configurations). Patch 2
drops the CALL_OPS requirement from the DIRECT_CALLS select.

Configurations that keep CALL_OPS (clang !CFI, and GCC without
CC_OPTIMIZE_FOR_SIZE) are unchanged. We verified this: in an arm64
clang build, every object file is byte-identical before and after
the series except ftrace.o itself, and its disassembly is identical.
CFI builds (and GCC -Os builds) gain working direct calls, with
out-of-range attachments taking the ftrace_caller dispatch path
instead of the per-callsite fast path.

We tested on a 6.18.y-based kernel and on the base commit below, using
clang kCFI builds (CONFIG_CFI=y, enforcing) under qemu (TCG, and KVM on
an arm64 host) and on GB200-based arm64 hardware: fentry/fexit, fmod_ret
and BPF LSM programs load, attach and execute; the ftrace-direct
sample modules (including both modify samples, exercising
ftrace_modify_call()) run cleanly; no CFI violations observed. The
fentry_test, fexit_test, fentry_fexit, fexit_sleep, fexit_stress,
modify_return, tracing_struct, lsm and trampoline_count selftests and
the ftrace direct-call selftests (test.d/direct) pass on the new
configuration with results identical to a CALL_OPS kernel built from
the same tree, and a broader test_progs sweep showed no differences
attributable to this series. Without the series, all of the above
fail at attach time with -ENOTSUPP.

riscv has the same gap (its DIRECT_CALLS select also requires
CALL_OPS, and its CALL_OPS is likewise !CFI); if this approach is
acceptable for arm64 we can follow up there.

---
Jose Fernandez (Anthropic) (2):
      arm64: ftrace: prepare ftrace_modify_call() for use without CALL_OPS
      arm64: ftrace: allow DIRECT_CALLS without CALL_OPS

 arch/arm64/Kconfig         | 2 +-
 arch/arm64/kernel/ftrace.c | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260607-arm64-ftrace-direct-calls-152230ef7077

Best regards,
--  
Jose Fernandez (Anthropic) <[email protected]>

