Switch the free-node pop paths to raw_spin_trylock*() so callers don't block
on contended LRU locks. This is a narrower change than Menglong's approach [1],
which aimed to eliminate the deadlock entirely.

The trylock-based approach avoids deadlocks in long-lived critical
sections, while still allowing locking in short-lived ones. Although it
does not completely eliminate the possibility of deadlock, it
significantly reduces the likelihood in practice.
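
As a rough illustration of the pattern (not the code in this series), here is a
minimal userspace sketch. It assumes that pthread_mutex_trylock() stands in for
raw_spin_trylock*(), that pushing a node stays a short, normally locked
section, and that a failed pop is reported to the caller as -ENOMEM. All struct
and function names are hypothetical.

```c
/* Userspace analogy only: names are hypothetical, not from bpf_lru_list.c. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct free_node {
	struct free_node *next;
};

struct free_list {
	pthread_mutex_t lock;	/* stands in for the kernel raw spinlock */
	struct free_node *head;
};

static void free_list_init(struct free_list *fl)
{
	pthread_mutex_init(&fl->lock, NULL);
	fl->head = NULL;
}

/* Assumed to be a short critical section, so it takes the lock normally. */
static void free_list_push(struct free_list *fl, struct free_node *node)
{
	assert(node != NULL);
	pthread_mutex_lock(&fl->lock);
	node->next = fl->head;
	fl->head = node;
	pthread_mutex_unlock(&fl->lock);
}

/*
 * Pop path: never waits for the lock. Returns NULL both when the lock is
 * contended and when the list is empty.
 */
static struct free_node *free_list_try_pop(struct free_list *fl)
{
	struct free_node *node;

	if (pthread_mutex_trylock(&fl->lock) != 0)
		return NULL;

	node = fl->head;
	if (node)
		fl->head = node->next;
	pthread_mutex_unlock(&fl->lock);
	return node;
}

/* Hypothetical caller: "no node available" surfaces as -ENOMEM. */
static int map_update(struct free_list *fl, struct free_node **out)
{
	struct free_node *node = free_list_try_pop(fl);

	if (!node)
		return -ENOMEM;
	*out = node;
	return 0;
}
```

The trade-off in this sketch is that a caller can see -ENOMEM under contention
even though free nodes exist, which is why the selftest change in this series
tolerates -ENOMEM on LRU map updates.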

LRU-related deadlocks have been observed multiple times, including:

 - [syzbot] [bpf?] possible deadlock in bpf_lru_push_free (2) [2]
 - Re: [PATCH bpf v3 0/4] bpf: Free special fields when update hash and local
   storage maps [3]
 - Raw log of CI failure [4]

BTW, this series also factors out the bpf_lru_node_set_hash() helper, along with
a comment describing the required ordering and locking constraints.

Links:
[1] https://lore.kernel.org/bpf/[email protected]/
[2] https://lore.kernel.org/bpf/[email protected]/
[3] https://lore.kernel.org/bpf/CAEf4BzbTJCUx0D=zjx6+5m5iighwlzap94hnw36zmdhaf4-...@mail.gmail.com/
[4] https://github.com/kernel-patches/bpf/actions/runs/20943173932/job/60181505085

Leon Hwang (3):
  bpf: Factor out bpf_lru_node_set_hash() helper
  bpf: Avoid deadlock using trylock when popping LRU free nodes
  selftests/bpf: Allow -ENOMEM on LRU map updates

 kernel/bpf/bpf_lru_list.c                     | 35 ++++++++++++++-----
 .../bpf/map_tests/map_percpu_stats.c          |  3 +-
 2 files changed, 28 insertions(+), 10 deletions(-)

--
2.52.0

