On 2026/6/2 10:15 AM, Chenguang Zhao wrote:
> The BPF verifier can lower a bpf_kptr_xchg() call into a single BPF_XCHG
> atomic instruction when the JIT advertises support through
> bpf_jit_supports_ptr_xchg(). This drops the helper call overhead from the
> kptr exchange fast path.
>
> Such inlining is only safe when the JITed atomic exchange provides the
> same full memory ordering as the bpf_kptr_xchg() helper. On LoongArch the
> plain amswap.d instruction carries no barrier semantics, so emit the
> ordered amswap_db.d variant for 64-bit BPF_XCHG instead. Add the
> amswapdbw/amswapdbd instruction emit helpers it relies on, and implement
> bpf_jit_supports_ptr_xchg() to turn the inlining on.
>
> Extend the kptr_xchg_inline selftest to cover LoongArch64, and add a
> kptr-xchg benchmark to compare the helper and inline paths.

If I understand correctly, this should be split into two parts, one for
the JIT and one for the selftest.
1. For the JIT part, there are two logical changes, so it is better to
split it into two patches:
(1) fix the current code.
(2) add new feature.
Regardless of whether bpf_jit_supports_ptr_xchg() returns true or false,
amswap_db.b/h/w/d must be used. The LKMM (Linux Kernel Memory Model)
requires that all value-returning atomic RMW operations be fully
ordered, and BPF_XCHG returns the old value. Consequently, the choice of
the _db variant must not depend on pointer-exchange support: emitting it
unconditionally keeps BPF_XCHG on ordinary scalars fully ordered across
CPUs as well, which is why this is a fix and not part of the new feature.
2. For the selftest part, there are also two logical changes, so it is
better to split it into two patches:
(1) enable the functional test "./test_progs -t kptr_xchg_inline".
(2) add benchmark for kptr-xchg performance.
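
To illustrate point 1, here is a minimal user-space sketch, not the
actual JIT code. pick_xchg_insn() and its parameters are hypothetical
names made up for this mail, and it only models the 32-bit and 64-bit
cases by returning the mnemonic as a string. The point is simply that
the capability flag plays no role in choosing the instruction:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: returns the mnemonic a
 * JIT would emit for BPF_XCHG of the given operand size in bytes.
 */
static const char *pick_xchg_insn(int size_bytes, bool ptr_xchg_supported)
{
	/*
	 * Deliberately ignored: the ordered variant is needed whether or
	 * not kptr xchg inlining is enabled.
	 */
	(void)ptr_xchg_supported;

	switch (size_bytes) {
	case 4:
		return "amswap_db.w";
	case 8:
		return "amswap_db.d";
	default:
		return NULL;	/* other widths are not modeled here */
	}
}
```

With this shape, the fix patch would switch the instruction selection
and the feature patch would only add bpf_jit_supports_ptr_xchg().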
Thanks,
Tiezhu