On Tue, May 10, 2022 at 09:16:21PM +1000, Nicholas Piggin wrote: > Excerpts from Ricardo Neri's message of May 6, 2022 10:00 am: > > The HPET hardlockup detector relies on tsc_khz to estimate the value of > > that the TSC will have when its HPET channel fires. A refined tsc_khz > > helps to estimate better the expected TSC value. > > > > Using the early value of tsc_khz may lead to a large error in the expected > > TSC value. Restarting the NMI watchdog detector has the effect of kicking > > its HPET channel and make use of the refined tsc_khz. > > > > When the HPET hardlockup is not in use, restarting the NMI watchdog is > > a noop. > > > > Cc: Andi Kleen <a...@linux.intel.com> > > Cc: Stephane Eranian <eran...@google.com> > > Cc: "Ravi V. Shankar" <ravi.v.shan...@intel.com> > > Cc: io...@lists.linux-foundation.org > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: x...@kernel.org > > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com> > > --- > > Changes since v5: > > * Introduced this patch > > > > Changes since v4 > > * N/A > > > > Changes since v3 > > * N/A > > > > Changes since v2: > > * N/A > > > > Changes since v1: > > * N/A > > --- > > arch/x86/kernel/tsc.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c > > index cafacb2e58cc..cc1843044d88 100644 > > --- a/arch/x86/kernel/tsc.c > > +++ b/arch/x86/kernel/tsc.c > > @@ -1386,6 +1386,12 @@ static void tsc_refine_calibration_work(struct > > work_struct *work) > > /* Inform the TSC deadline clockevent devices about the recalibration */ > > lapic_update_tsc_freq(); > > > > + /* > > + * If in use, the HPET hardlockup detector relies on tsc_khz. > > + * Reconfigure it to make use of the refined tsc_khz. > > + */ > > + lockup_detector_reconfigure(); > > I don't know if the API is conceptually good. > > You change something that the lockup detector is currently using, > *while* the detector is running asynchronously, and then reconfigure > it.
Yes, this is what happens. > What happens in the window? If this code is only used for small > adjustments maybe it does not really matter Please see my comment > but in principle it's a bad API to export. > > lockup_detector_reconfigure as an internal API is okay because it > reconfigures things while the watchdog is stopped I see. > [actually that looks untrue for soft dog which uses watchdog_thresh in > is_softlockup(), but that should be fixed]. Perhaps there should be a watchdog_thresh_user. When the user updates it, the detector is stopped, watchdog_thresh is updated, and then the detector is resumed. > > You're the arch so you're allowed to stop the watchdog and configure > it, e.g., hardlockup_detector_perf_stop() is called in arch/. I had it like this but it did not look right to me. You are right, however, I can stop/restart the watchdog when needed. Thanks and BR, Ricardo