On 12/19/2025 5:19 PM, Mathieu Desnoyers wrote:
> On 2025-12-19 10:42, Joel Fernandes wrote:
>>
>> IMHO the overflow case is "special" and should not happen often, otherwise
>> things are "bad" anyway. I am not sure if this kind of complexity will be
>> worth it unless we know HP forward-progress is a real problem. Also, since
>> HP acquire will be short lived, are we that likely to not get past a
>> temporary shortage of slots?
> 
> Given that we have context switch integration which moves the active
> per-cpu slots to the overflow list, I can see that even not-so-special
> users which get preempted or block regularly with active per-cpu slots
> could theoretically end up preventing HP scan progress.

Yeah, I see. That's the 'problem' with the preemption design of this patchset.
You always have to move the active slots in and out of the overflow list on
preemption, even when there is no overflow, AFAICS. My idea (which has its own
issues) does not require that on preemption.

> Providing HP scan progress guarantees is IMO an important aspect to
> consider if we want to ensure the implementation does not cause subtle
> HP scan stall when its amount of use will scale up.

Sure. But if you didn't have to deal with a list in the 'normal' case (slots
not oversubscribed), then it wouldn't be that big a deal.

>> Perhaps the forward-progress problem should be rephrased to the following?:
>> If a reader hit an overflow slot, it should probably be able to get a
>> non-overflow slot soon, even if hazard pointer slots are over-subscribed.
> 
> Those are two distinct forward progress guarantees, and I think both are
> important:
> 
> * Forward progress of HP readers,
> * Forward progress of HP scan.

Maybe I am missing something, but AFAICS, if the readers are only using slots
and not locking in normal operation, then the scan will also automatically make
forward progress. So forward progress of readers and forward progress of the
scan are related. It is your preemption design that requires the locking.
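
To illustrate the point about lock-free slots, here is a minimal userspace
sketch in C11 atomics. It is not the patchset's code: the names hp_slots,
hp_acquire(), hp_release(), hp_scan() and NR_SLOTS are hypothetical, the slot
array is global rather than per-cpu or per-task, and the usual re-validation of
the protected pointer after publishing it is omitted. It only shows that when
readers touch nothing but slots, a scan is a bounded walk that never waits on a
lock held by a preempted reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical slot count */

/* One global slot array for the sketch; NULL means the slot is free. */
static _Atomic(void *) hp_slots[NR_SLOTS];

/*
 * Reader side: publish ptr in the first free slot using only a cmpxchg.
 * Returns the slot index, or -1 when every slot is busy (the overflow
 * case, which this sketch does not handle).
 */
static int hp_acquire(void *ptr)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		void *expected = NULL;

		if (atomic_compare_exchange_strong(&hp_slots[i], &expected, ptr))
			return i;
	}
	return -1;
}

/* Reader side: clear the slot so the scanner no longer sees the pointer. */
static void hp_release(int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	atomic_store(&hp_slots[slot], NULL);
}

/*
 * Scanner side: a single bounded pass over the slots, no lock taken.
 * Returns true if some reader still protects ptr.
 */
static bool hp_scan(void *ptr)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		if (atomic_load(&hp_slots[i]) == ptr)
			return true;
	}
	return false;
}
```

A caller would pair hp_acquire() with hp_release() around the access, and a
reclaimer would retry hp_scan() until it returns false before freeing.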

Btw, I thought about the scanning issue with the task-slots idea, and really
synchronize() is supposed to be a slow path, so I am not fully sure it is that
much of an issue. It depends on the use case, but for per-cpu ref and
synchronize_rcu(), both are 'slow' anyway. And for the async case, it is almost
not an issue at all due to the batching/amortization of the task scan cost.
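
The amortization argument can be sketched as follows. This is a
single-threaded illustration under assumed names, not code from any patchset:
struct task_sketch, NR_TASKS, SLOTS_PER_TASK, scan_batch() and the task_walks
counter are all hypothetical, and a real implementation would need atomic slot
reads and would likely hash the retired pointers. The point is only that one
walk over all tasks serves a whole batch of retired pointers, so the walk is
paid once per batch rather than once per pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TASKS	4	/* hypothetical number of tasks */
#define SLOTS_PER_TASK	2	/* hypothetical slots carried by each task */

/* Stand-in for per-task hazard pointer slots; NULL means free. */
struct task_sketch {
	void *hp[SLOTS_PER_TASK];
};

static struct task_sketch tasks[NR_TASKS];
static size_t task_walks;	/* how many times the task list was walked */

/*
 * Check a batch of n retired pointers with a single walk over all tasks.
 * busy[i] is set when retired[i] is still held in some task's slot.
 * Returns the number of pointers that are safe to free.
 */
static size_t scan_batch(void **retired, bool *busy, size_t n)
{
	size_t nr_free = 0;

	assert(retired && busy);
	task_walks++;

	for (size_t i = 0; i < n; i++)
		busy[i] = false;

	for (size_t t = 0; t < NR_TASKS; t++) {
		for (size_t s = 0; s < SLOTS_PER_TASK; s++) {
			void *held = tasks[t].hp[s];

			if (!held)
				continue;
			for (size_t i = 0; i < n; i++) {
				if (retired[i] == held)
					busy[i] = true;
			}
		}
	}

	for (size_t i = 0; i < n; i++) {
		if (!busy[i])
			nr_free++;
	}
	return nr_free;
}
```

With a batch of n retired pointers, the task list is traversed once instead of
n times, which is the cost the async path spreads across the batch.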

thanks,

 - Joel

