On Tue, Aug 25, 2020 at 12:12:36PM -0700, Mike Larkin wrote: > On Mon, Aug 24, 2020 at 01:55:45AM +0200, Mark Kettenis wrote: > > > Date: Sun, 23 Aug 2020 18:11:12 -0500 > > > From: Scott Cheloha <[email protected]> > > > > > > Hi, > > > > > > Other BSDs use the TSC to implement delay(9) if the TSC is constant > > > and invariant. Here's a patch to add something similar to our kernel. > > > > If the TSC is fine as a timecounter it should be absolutely fine for > > use as delay(). And we could even use if the TSC isn't synchronized > > between CPUs. > > > > > > > > This patch (or something equivalent) is a prerequisite to running the > > > lapic timer in oneshot or TSC deadline mode. Using the lapic timer to > > > implement delay(9) when it isn't running in periodic mode is too > > > complicated. However, using the i8254 for delay(9) is too slow. We > > > need an alternative. > > > > Hmm, but what are we going to use on machines where the TSC isn't > > constant/invariant? > > > > In what respect is the i8254 too slow? Does it take more than a > > microsecond to read it? > > > > It's 3 outb/inb pairs to ensure you get the reading correct. So that could > be quite a long time (as cheloha@ points out). Also, that's 6 VM exits if > running virtually (I realize that's not the main use case here but just > saying...) > > IIRC the 3 in/out pairs are the latch command followed by reading the LSB/MSB > of the counter. It's not MMIO like the HPET or ACPI timer. > > And as cheloha@ also points out, it is highly likely that none of us have a > real i8254 anymore, much of this is probably implemented in some EC somewhere > and it's unlikely the developer of said EC put a lot of effort into optimizing > the implementation of a legacy device like this. > > On the topic of virtualization: > > while (rdtsc() - start < want) > rdtsc(); >
I just realized the original diff didn't do two rdtscs. It did a pause inside the loop. So the effect is not *as* bad as I described but it's still *somewhat* bad. PS - pause loop exiting can be enabled to improve performance in this situation. > ..produces two VM exits (generally, on most hypervisors) since the TSC is > usually time corrected. That's a lot of exits, and it gets worse on faster > machines. I don't have a better idea, however. There may be a PV clock option > that is more optimized in some scenarios. > > -ml > > > > We could use the HPET I suppose, whic may be a bit better. > > > > > As for the patch, it works for me here, though I'd appreciate a few > > > tests. I admit that comparing function pointers is ugly, but I think > > > this is as simple as it can be without implementing some sort of > > > framework for "registering" delay(9) implementations and comparing > > > them and selecting the "best" implementation. > > > > What about: > > > > if (delay_func == NULL) > > delay_func = lapic_delay; > > > > > I'm not sure I put the prototypes in the right headers. We don't have > > > a tsc.h but cpuvar.h looks sorta-correct for tsc_delay(). > > > > I think cpuvar.h is fine since it has other TSC-related stuff. > > However, with my suggestion above you can drop that. > > > > > FreeBSD's x86/delay.c may be of note: > > > > > > https://github.com/freebsd/freebsd/blob/ed96335a07b688c39e16db8856232e5840bc22ac/sys/x86/x86/delay.c > > > > > > Thoughts? > > > > > > Index: amd64/tsc.c > > > =================================================================== > > > RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v > > > retrieving revision 1.20 > > > diff -u -p -r1.20 tsc.c > > > --- amd64/tsc.c 23 Aug 2020 21:38:47 -0000 1.20 > > > +++ amd64/tsc.c 23 Aug 2020 22:59:25 -0000 > > > @@ -26,6 +26,7 @@ > > > > > > #include <machine/cpu.h> > > > #include <machine/cpufunc.h> > > > +#include <machine/cpuvar.h> > > > > > > #define RECALIBRATE_MAX_RETRIES 5 > > > #define RECALIBRATE_SMI_THRESHOLD 50000 > > > @@ -252,7 +253,8 @@ tsc_timecounter_init(struct cpu_info *ci > > > tsc_timecounter.tc_quality = -1000; > > > tsc_timecounter.tc_user = 0; > > > tsc_is_invariant = 0; > > > - } > > > + } else > > > + delay_func = tsc_delay; > > > > > > tc_init(&tsc_timecounter); > > > } > > > @@ -342,4 +344,15 @@ tsc_sync_ap(struct cpu_info *ci) > > > { > > > tsc_post_ap(ci); > > > tsc_post_ap(ci); > > > +} > > > + > > > +void > > > +tsc_delay(int usecs) > > > +{ > > > + uint64_t interval, start; > > > + > > > + interval = (uint64_t)usecs * tsc_frequency / 1000000; > > > + start = rdtsc_lfence(); > > > + while (rdtsc_lfence() - start < interval) > > > + CPU_BUSY_CYCLE(); > > > } > > > Index: amd64/lapic.c > > > =================================================================== > > > RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v > > > retrieving revision 1.55 > > > diff -u -p -r1.55 lapic.c > > > --- amd64/lapic.c 3 Aug 2019 14:57:51 -0000 1.55 > > > +++ amd64/lapic.c 23 Aug 2020 22:59:25 -0000 > > > @@ -41,6 +41,7 @@ > > > #include <machine/codepatch.h> > > > #include <machine/cpu.h> > > > #include <machine/cpufunc.h> > > > +#include <machine/cpuvar.h> > > > #include <machine/pmap.h> > > > #include <machine/mpbiosvar.h> > > > #include <machine/specialreg.h> > > > @@ -569,7 +570,8 @@ skip_calibration: > > > * Now that the timer's calibrated, use the apic timer routines > > > * for all our timing needs.. > > > */ > > > - delay_func = lapic_delay; > > > + if (delay_func != tsc_delay) > > > + delay_func = lapic_delay; > > > initclock_func = lapic_initclocks; > > > } > > > } > > > Index: include/cpuvar.h > > > =================================================================== > > > RCS file: /cvs/src/sys/arch/amd64/include/cpuvar.h,v > > > retrieving revision 1.10 > > > diff -u -p -r1.10 cpuvar.h > > > --- include/cpuvar.h 9 Aug 2019 15:20:05 -0000 1.10 > > > +++ include/cpuvar.h 23 Aug 2020 22:59:25 -0000 > > > @@ -102,4 +102,6 @@ void tsc_sync_drift(int64_t); > > > void tsc_sync_bp(struct cpu_info *); > > > void tsc_sync_ap(struct cpu_info *); > > > > > > +void tsc_delay(int); > > > + > > > #endif > > > Index: include/i82489var.h > > > =================================================================== > > > RCS file: /cvs/src/sys/arch/amd64/include/i82489var.h,v > > > retrieving revision 1.18 > > > diff -u -p -r1.18 i82489var.h > > > --- include/i82489var.h 4 Oct 2018 05:00:40 -0000 1.18 > > > +++ include/i82489var.h 23 Aug 2020 22:59:26 -0000 > > > @@ -128,4 +128,6 @@ extern void lapic_calibrate_timer(struct > > > extern void lapic_startclock(void); > > > extern void lapic_initclocks(void); > > > > > > +extern void lapic_delay(int); > > > + > > > #endif > > > > > > > > >
