On 2025-12-18 19:25, Boqun Feng wrote:
[...]
And if we pre-populate the 8 slots for each cpu, and thus force
fallback to overflow list:
hazptr-smp-mb-8-fail 67.1 ns
Thank you! So involving locking seems to hurt performance more than
per-CPU/per-task operations do. This may suggest that enabling
PREEMPT_HAZPTR by default has acceptable performance.
Indeed, I can fold it into the hazptr patch and remove the config
option then.
So breaking up the iteration into pieces is not just to handle
busy-waiting, but also to make sure we don't increase the
system latency by holding a raw spinlock (taken with rq lock
held) for more than the little time needed to iterate to the next
node.
I agree that it helps reduce the latency, but I feel like with a scan
thread in the picture (where we don't need to busy-wait), the
updater-side scan should use an approach that guarantees forward
progress. That means we may need to explore other solutions to the
latency problem (e.g. a fine-grained-locking hashed list for the
overflow list) rather than the generation counter.
Guaranteeing forward progress of synchronize even with a steady stream
of unrelated hazard pointer additions/removals to/from the overflow
list is something we should aim for, with or without a scan thread.
As is, my current generation scheme does not guarantee this. But we can
take inspiration from the liburcu RCU grace period "parity" concept [1]
and introduce a two-list scheme, having hazptr_synchronize flip the
current "list_add" head while it iterates on the other list. There would
be one generation counter for each of the two lists.
This would be protected by holding a global mutex across
hazptr_synchronize. hazptr_synchronize would need to iterate
on the two lists one after the other, carefully flipping the
current "addition list" head between the two iterations.
So the worst case that can happen in terms of retries caused by
generation counter increments is if list entries are deleted while
the list is being traversed by hazptr_synchronize. Because there
are no possible concurrent additions to that list, the worst case
is that the list becomes empty, which bounds the number of retries
to the number of list elements.
Thoughts?
Thanks,
Mathieu
[1] https://github.com/urcu/userspace-rcu/blob/master/src/urcu.c#L409
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com