On arm64, HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is currently selected
only when DYNAMIC_FTRACE_WITH_CALL_OPS is available. CALL_OPS, in
turn, is mutually exclusive with kCFI: the pre-function NOPs it needs
would change the offset of the pre-function type hash (see
baaf553d3bc3 ("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")),
and the compiler support needed to reconcile the two does not exist
yet.

The result is that a CONFIG_CFI=y arm64 kernel has no ftrace direct
calls at all, so register_fentry() fails with -ENOTSUPP and no BPF
trampoline can attach: fentry/fexit, fmod_ret and BPF LSM programs are
all unavailable. Deployments that want both kCFI hardening and
BPF-based security monitoring currently have to give one of them up.
systemd's bpf-restrict-fs feature hits this today:

  https://lore.kernel.org/all/20250610232418.GA3544567@ax162/

CALL_OPS is an optimization for direct calls, not a dependency.
In-BL-range trampolines are reached by a direct branch without
consulting the ops pointer, and out-of-range trampolines already fall
back to ftrace_caller, where the DIRECT_CALLS machinery
(call_direct_funcs() storing the trampoline in ftrace_regs, the
ftrace_caller tail-call) is gated on DIRECT_CALLS alone. s390 and
loongarch ship HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS this way, without
having CALL_OPS at all.

Patch 1 prepares ftrace_modify_call() to build without CALL_OPS by
widening its #ifdef and using the existing ftrace_rec_update_ops()
wrapper (no functional change for current configurations).

Patch 2 drops the CALL_OPS requirement from the DIRECT_CALLS select.

Configurations that keep CALL_OPS (clang !CFI, and GCC without
CC_OPTIMIZE_FOR_SIZE) are unchanged. We verified this: in an arm64
clang build, every object file is byte-identical before and after the
series except ftrace.o itself, and its disassembly is identical.

CFI builds (and GCC -Os builds) gain working direct calls, with
out-of-range attachments taking the ftrace_caller dispatch path
instead of the per-callsite fast path.

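For readers less familiar with the range constraint, the in-range vs
out-of-range split described above can be sketched in user-space C.
This is an illustration only, not code from the series or from
arch/arm64: bl_in_range(), pick_dispatch(), enum dispatch and BL_RANGE
are hypothetical names. The only architectural fact it relies on is
that an AArch64 BL encodes a signed 26-bit word offset, i.e. a reach
of +/-128 MiB from the callsite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AArch64 BL: signed 26-bit immediate, scaled by 4 => +/-128 MiB. */
#define BL_RANGE INT64_C(0x08000000)

/* Hypothetical labels for the two paths named in the cover letter. */
enum dispatch {
	DISPATCH_DIRECT_BL,	/* callsite branches straight to the trampoline */
	DISPATCH_FTRACE_CALLER,	/* callsite goes through ftrace_caller first */
};

/* True if a BL located at 'pc' can encode a branch to 'target'. */
static bool bl_in_range(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	return offset >= -BL_RANGE && offset < BL_RANGE;
}

/* Model of the choice between the fast path and the fallback path. */
static enum dispatch pick_dispatch(uint64_t callsite, uint64_t tramp)
{
	/* A64 instructions are 4-byte aligned. */
	assert((callsite & 3) == 0 && (tramp & 3) == 0);

	return bl_in_range(callsite, tramp) ? DISPATCH_DIRECT_BL
					    : DISPATCH_FTRACE_CALLER;
}
```

In this model, a trampoline exactly BL_RANGE bytes above the callsite
is already out of range and takes the ftrace_caller path, while one
BL_RANGE bytes below it is still reachable by a direct branch.
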
We tested on a 6.18.y-based kernel and on this base with clang kCFI
builds (CONFIG_CFI=y, enforcing) under qemu (TCG, and KVM on an arm64
host) and on GB200-based arm64 hardware: fentry/fexit, fmod_ret and
BPF LSM programs load, attach and execute; the ftrace-direct sample
modules (including both modify samples, exercising
ftrace_modify_call()) run cleanly; no CFI violations observed.

The fentry_test, fexit_test, fentry_fexit, fexit_sleep, fexit_stress,
modify_return, tracing_struct, lsm and trampoline_count selftests and
the ftrace direct-call selftests (test.d/direct) pass on the new
configuration with results identical to a CALL_OPS kernel built from
the same tree, and a broader test_progs sweep showed no differences
attributable to this series. Without the series, all of the above fail
at attach time with -ENOTSUPP.

riscv has the same gap (its DIRECT_CALLS select also requires
CALL_OPS, and its CALL_OPS is likewise !CFI); if this approach is
acceptable for arm64 we can follow up there.

---
Jose Fernandez (Anthropic) (2):
      arm64: ftrace: prepare ftrace_modify_call() for use without CALL_OPS
      arm64: ftrace: allow DIRECT_CALLS without CALL_OPS

 arch/arm64/Kconfig         | 2 +-
 arch/arm64/kernel/ftrace.c | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260607-arm64-ftrace-direct-calls-152230ef7077

Best regards,
-- 
Jose Fernandez (Anthropic) <[email protected]>
