> On Jan 1, 2026, at 10:41 PM, Paul E. McKenney <[email protected]> wrote:
> 
> On Thu, Jan 01, 2026 at 09:59:27PM -0500, Joel Fernandes wrote:
>> 
>> 
>>> On 1/1/2026 5:24 PM, Paul E. McKenney wrote:
>>> On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote:
>>>> On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote:
>>>>> On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote:
>>>>>> Hi Paul,
>>>>>> 
>>>>>> On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote:
>>>>>>> On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote:
>>>>>>>> While studying some synchronize_rcu() latencies, I found that the
>>>>>>>> jiffies_till_first_fqs value passed to the timer tick subsystem is
>>>>>>>> always off by one. This is natural due to calc_index() rounding up.
>>>>>>>> 
>>>>>>>> For example, jiffies_till_first_fqs=3 means the "Jiffies till first
>>>>>>>> FQS" delay is actually 4ms. The same holds for the next FQS. In fact,
>>>>>>>> testing shows it can never be 3ms for HZ=1000. And in rare cases, it
>>>>>>>> goes to 5ms, probably due to interrupts.
>>>>>>>> 
>>>>>>>> Considering this, I think it is better to reduce
>>>>>>>> jiffies_till_first_fqs by 1 before passing it to the wait APIs.
>>>>>>>> 
>>>>>>>> But before sending a patch, I wanted to get everyone's thoughts.
>>>>>>>> Consider this the RFC.
>>>>>>> 
>>>>>>> Inadvertent passing of the value zero?
>>>>>> 
>>>>>> This should not be an issue because at the moment, even a value of
>>>>>> jiffies_till_first_fqs == 0 waits for ~1 jiffy due to
>>>>>> schedule_timeout(0).
>>>>>> 
>>>>>> But you raise a good point: we should cap the minimum allowed jiffy
>>>>>> value for the fqs parameters to 1 so that we don't call
>>>>>> schedule_timeout() with negative values when/if we do the
>>>>>> reduce-by-one approach.
>>>>> 
>>>>> There is a potential use case for jiffies_till_first_fqs=0 and no wait,
>>>>> which would be systems that want to scan for idle CPUs immediately after
>>>>> the grace period has been initialized.  Note the word "potential".  ;-)
>>>> 
>>>> Sure, we could add support for that but that would be new behavior that is
>>>> not in the existing code.
>>>> 
>>>> So I think jiffies_till_first_fqs=0 is not 'working as intended' today,
>>>> because it always waits.
>>> 
>>> Agreed.
>>> 
>>>> So we should fix that too? Or maybe it can be a patch separate from this
>>>> (that I can work on). I think there is no harm in allowing that mode; at
>>>> least it will be more in line with the expected outcome.
>>> 
>>> Makes sense!  However, given that no one has complained, care is required.
>>> Someone might be relying on the old behavior.  (In which case an easy
>>> fix would be to make -1 be no waiting, though one might hope for a
>>> better fix.)
>> 
>> Some further investigation revealed that the "1 jiffy error" is actually the
>> worst case. In the best case, it could still be closer to a jiffy. It is just
>> the nature of the timer wheel: since it snaps to a numerical TICK_NS
>> boundary, a rounding error is intentionally added, and its size depends on
>> how far into the tick the timer for the wait was enqueued. Taken as a
>> probability distribution, we should land at a 1/2 jiffy error, though in
>> practice I've seen a 3/4 jiffy error on average.
>> 
>> Given this, it probably does not make sense for us to do the -1 to adjust
>> for the error (since we don't clearly have bounds on the minimum error). We
>> just have to accept that we lose 1-2 extra jiffies per FQS loop iteration
>> wait, which is amplified if a grace period is already in progress. I've seen
>> this add up to 4 jiffies to back-to-back synchronize_rcu() latency even when
>> there are no readers in progress.
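To make the rounding argument above concrete, here is a rough userspace model,
not kernel code. It assumes an idealized level-0 timer wheel in which a timer
armed for N jiffies lands one tick past its nominal expiry (my reading of the
round-up in calc_index()), and it ignores wakeup latency and interrupts. The
model_* names and the "phase" parameter are made up for illustration.

```c
/*
 * Illustrative model only, NOT kernel code. All model_* names are
 * hypothetical. Times are in jiffies.
 */

/*
 * Assumed level-0 behavior: a timer armed at jiffy 'now' for 'timeout'
 * jiffies is rounded up by one tick so that it can never fire early.
 */
static unsigned long model_bucket_expiry(unsigned long now,
					 unsigned long timeout)
{
	return now + timeout + 1;
}

/*
 * Delay seen by a caller that arms the timer 'phase' of the way into the
 * current jiffy (0.0 = right after a tick, values near 1.0 = just before
 * the next tick).
 */
static double model_actual_delay(double phase, unsigned long timeout)
{
	return (double)model_bucket_expiry(0, timeout) - phase;
}

/* Mean extra delay (actual - requested) over evenly spaced phases. */
static double model_mean_error(unsigned long timeout, unsigned int samples)
{
	double sum = 0.0;
	unsigned int i;

	if (samples == 0)
		return 0.0;

	for (i = 0; i < samples; i++) {
		double phase = (i + 0.5) / samples;

		sum += model_actual_delay(phase, timeout) - (double)timeout;
	}
	return sum / samples;
}
```

In this model, arming right after a tick with a timeout of 3 gives a delay of
4 jiffies, and averaging over all phases gives an error of about 1/2 jiffy.
The model says nothing about the 3/4 jiffy average seen in practice.
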
> .
>> But I had to go down the rabbit hole and check... ;-)
> 
> I was thinking in terms of special-casing -1 to skip the sleep, but I
> guess that there are as many ways to skin a rabbit as a cat.  ;-)

Sure, I am happy to do that. One of my fears, though, is that no one will know
to use it that way, making it not that useful.

Do let me know if anyone sets it to 0, though. Perhaps for testing, even, to
make the GP cycle shorter?
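
For the sake of discussion, a rough and untested sketch of the -1 special case
could look like the below. Everything in it (FQS_NO_WAIT, struct fqs_wait_plan,
fqs_plan_wait()) is a made-up name for illustration, not something in the tree.
It also assumes a signed parameter, which I believe is not how the real one is
declared, so an actual patch would need to sort that out.

```c
#include <stdbool.h>

/* Hypothetical sentinel: -1 means "do not sleep before the first FQS". */
#define FQS_NO_WAIT (-1L)

/* Hypothetical result type: whether to sleep, and for how many jiffies. */
struct fqs_wait_plan {
	bool sleep;
	unsigned long timeout;
};

/*
 * Decide how to wait before the first force-quiescent-state scan.
 * A value of 0 keeps today's behavior (still goes through the sleep
 * path), so only -1 opts in to skipping the wait entirely.
 */
static struct fqs_wait_plan fqs_plan_wait(long jiffies_till_first_fqs)
{
	struct fqs_wait_plan plan = { .sleep = true, .timeout = 0 };

	if (jiffies_till_first_fqs == FQS_NO_WAIT) {
		plan.sleep = false;
		return plan;
	}

	/* Treat any other negative value as 0 rather than passing it on. */
	if (jiffies_till_first_fqs < 0)
		jiffies_till_first_fqs = 0;

	plan.timeout = (unsigned long)jiffies_till_first_fqs;
	return plan;
}
```

The point of keeping 0 on the sleep path is that anyone relying on the old
behavior sees no change.
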

 - Joel


> 
>                            Thanx, Paul
> 
> 
