On 2025-12-13 17:31, Paul E. McKenney wrote:
Hello!
I didn't do a good job of answering the "what about large numbers of
hazard pointers?" question at Boqun's and my hazard-pointers talk at
Linux Plumbers Conference yesterday, so please allow me to at least
start on the path towards fixing that problem.
Also, there were a couple of people participating whose email addresses
I don't know, so please feel free to CC them.
The trick is that in real workloads to date, although there might be
millions of hazard pointers, there will typically only be a few active
per CPU at a given time. This of course suggests a per-CPU data structure
tracking the active ones. Allocating a hazard pointer grabs an unused one
from this array, or, if all entries are in use, takes memory provided by
the caller and links it into an overflow list. Either way, it returns a
pointer to the hazard pointer that is now visible to updaters. When done,
the caller calls a function that marks the array entry as unused or
removes the element from the list, as the case may be. Because hazard
pointers can migrate among CPUs, full synchronization is required when
operating on the array and the overflow list.
And either way, the caller is responsible for allocating and freeing the
backup hazard-pointer structure that will be used in case of overflow.
And also either way, the updater need only deal with hazard pointers
that are currently in use.
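In case it helps make the above concrete, here is a rough userspace C
sketch of that scheme. All of the names here are made up, the array size
is arbitrary, and a single mutex stands in for whatever synchronization
the real implementation would use:

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

#define HP_SLOTS_PER_CPU 8	/* assumed fast-path array size */

struct hp_slot {
	void *ptr;		/* protected pointer, NULL when unused */
	struct hp_slot *next;	/* links overflow slots only */
	int in_use;
};

struct hp_cpu {
	struct hp_slot slots[HP_SLOTS_PER_CPU];	/* fast path */
	struct hp_slot *overflow;		/* caller-provided backups */
	pthread_mutex_t lock;	/* HPs can migrate among CPUs, so full
				 * synchronization on array and list. */
};

/*
 * Acquire: grab an unused array slot, or, if all entries are in use,
 * link the caller-provided backup into the overflow list.  Either way,
 * return the slot that is now visible to updaters.
 */
static struct hp_slot *hp_acquire(struct hp_cpu *cpu, struct hp_slot *backup)
{
	pthread_mutex_lock(&cpu->lock);
	for (int i = 0; i < HP_SLOTS_PER_CPU; i++) {
		if (!cpu->slots[i].in_use) {
			cpu->slots[i].in_use = 1;
			pthread_mutex_unlock(&cpu->lock);
			return &cpu->slots[i];
		}
	}
	/* Array full: fall back to the caller's backup slot. */
	backup->in_use = 1;
	backup->next = cpu->overflow;
	cpu->overflow = backup;
	pthread_mutex_unlock(&cpu->lock);
	return backup;
}

/* Release: mark an array entry unused, or unlink an overflow slot. */
static void hp_release(struct hp_cpu *cpu, struct hp_slot *slot)
{
	pthread_mutex_lock(&cpu->lock);
	slot->ptr = NULL;
	slot->in_use = 0;
	if (slot >= cpu->slots && slot < cpu->slots + HP_SLOTS_PER_CPU) {
		pthread_mutex_unlock(&cpu->lock);
		return;
	}
	/* Overflow slot: remove it from the singly linked list. */
	for (struct hp_slot **p = &cpu->overflow; *p; p = &(*p)->next) {
		if (*p == slot) {
			*p = slot->next;
			break;
		}
	}
	pthread_mutex_unlock(&cpu->lock);
}
```

Note that with this shape, an updater scanning for hazard pointers walks
only the in_use array entries plus the overflow list, matching the "need
only deal with hazard pointers that are currently in use" property.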
OK, so let me state how I see the fundamental problem you are trying
to address, and detail a possible solution.
* Problem Statement
Assuming we go for an array of hazard pointer slots per CPU to cover
the fast path (common case), we still need to handle the overflow
scenario, where more hazard pointers are in use concurrently on a
given CPU than the array can hold, whether due to preemption, nested
interrupts, or simply nested calls.
* Possible Solution
Requiring the HP caller to allocate backup space is clearly something
that would cover all scenarios. My worry is that tracking this backup
space allocation may be cumbersome for the user, especially if this
requires heap allocation.
Where the backup space can be allocated will likely depend on how long
the HP will be accessed. My working hypothesis here (let me know if
I'm wrong) is that most of those HP users will complete their access
within the same stack frame where the HP was acquired. This is the
primary use-case I would like to make sure is convenient.
For that use-case the users can simply allocate room on their
stack frame for the backup HP slot. The requirement here is that they
clear the HP slot before the end of the current stack frame.
If there is room in the per-CPU array, they use that; otherwise they
link the backup slot from their stack into the overflow list. When
they are done, if they used a backup slot, they need to remove it
from the list.
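For what it's worth, here is a minimal single-threaded sketch of that
calling convention. The hp_get()/hp_put() names are hypothetical, the
two-entry array is deliberately tiny so that overflow is easy to hit,
and the synchronization that migration across CPUs would require is
elided:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hazard-pointer slot and a tiny two-entry "per-CPU"
 * array, just large enough to demonstrate overflow.  A real
 * implementation would need full synchronization here. */
struct hp_slot { void *ptr; struct hp_slot *next; int in_use; };

static struct hp_slot hp_array[2];
static struct hp_slot *hp_overflow;

/* hp_get(): prefer an array slot; on overflow, link in the
 * caller's stack-allocated backup slot. */
static struct hp_slot *hp_get(struct hp_slot *backup, void *obj)
{
	struct hp_slot *s = NULL;

	for (int i = 0; i < 2; i++) {
		if (!hp_array[i].in_use) {
			s = &hp_array[i];
			break;
		}
	}
	if (!s) {		/* array full: use the stack backup */
		s = backup;
		s->next = hp_overflow;
		hp_overflow = s;
	}
	s->in_use = 1;
	s->ptr = obj;		/* now visible to updaters */
	return s;
}

/* hp_put(): clear the slot; unlink it if it was a backup.  The caller
 * must do this before the stack frame owning the backup returns. */
static void hp_put(struct hp_slot *s)
{
	s->ptr = NULL;
	s->in_use = 0;
	if (s >= hp_array && s < hp_array + 2)
		return;
	for (struct hp_slot **p = &hp_overflow; *p; p = &(*p)->next) {
		if (*p == s) {
			*p = s->next;
			break;
		}
	}
}

/* A reader that completes its access within a single stack frame:
 * the backup slot simply lives on that frame's stack. */
static int read_object(int *shared)
{
	struct hp_slot backup;	/* stack-allocated backup slot */
	struct hp_slot *hp;
	int val;

	memset(&backup, 0, sizeof(backup));
	hp = hp_get(&backup, shared);
	val = *shared;		/* access protected by hp */
	hp_put(hp);		/* cleared before this frame exits */
	return val;
}
```

The key property is that the backup slot's lifetime is tied to the
caller's frame, so no heap allocation is needed on this path.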
There could still be room for more advanced use-cases where the
backup slots are allocated on the heap, but I suspect that it would
make the API trickier to use and should be reserved for use-cases
that really require it.
Thoughts?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com