On Tue, Aug 25, 2020 at 12:12:36PM -0700, Mike Larkin wrote:
> On Mon, Aug 24, 2020 at 01:55:45AM +0200, Mark Kettenis wrote:
> > > Date: Sun, 23 Aug 2020 18:11:12 -0500
> > > From: Scott Cheloha <[email protected]>
> > >
> > > Hi,
> > >
> > > Other BSDs use the TSC to implement delay(9) if the TSC is constant
> > > and invariant.  Here's a patch to add something similar to our kernel.
> >
> > If the TSC is fine as a timecounter it should be absolutely fine for
> > use as delay().  And we could even use if the TSC isn't synchronized
> > between CPUs.
> >
> > >
> > > This patch (or something equivalent) is a prerequisite to running the
> > > lapic timer in oneshot or TSC deadline mode.  Using the lapic timer to
> > > implement delay(9) when it isn't running in periodic mode is too
> > > complicated.  However, using the i8254 for delay(9) is too slow.  We
> > > need an alternative.
> >
> > Hmm, but what are we going to use on machines where the TSC isn't
> > constant/invariant?
> >
> > In what respect is the i8254 too slow?  Does it take more than a
> > microsecond to read it?
> >
>
> It's 3 outb/inb pairs to ensure you get the reading correct. So that could
> be quite a long time (as cheloha@ points out). Also, that's 6 VM exits if
> running virtually (I realize that's not the main use case here but just
> saying...)
>
> IIRC the 3 in/out pairs are the latch command followed by reading the LSB/MSB
> of the counter. It's not MMIO like the HPET or ACPI timer.
>
> And as cheloha@ also points out, it is highly likely that none of us have a
> real i8254 anymore, much of this is probably implemented in some EC somewhere
> and it's unlikely the developer of said EC put a lot of effort into optimizing
> the implementation of a legacy device like this.
>
> On the topic of virtualization:
>
> while (rdtsc() - start < want)
>  rdtsc();
>

I just realized the original diff didn't do two rdtscs. It did a pause inside 
the
loop. So the effect is not *as* bad as I described but it's still *somewhat* 
bad.

PS - pause loop exiting can be enabled to improve performance in this situation.

> ..produces two VM exits (generally, on most hypervisors) since the TSC is
> usually time corrected. That's a lot of exits, and it gets worse on faster
> machines. I don't have a better idea, however. There may be a PV clock option
> that is more optimized in some scenarios.
>
> -ml
>
>
> > We could use the HPET I suppose, whic may be a bit better.
> >
> > > As for the patch, it works for me here, though I'd appreciate a few
> > > tests.  I admit that comparing function pointers is ugly, but I think
> > > this is as simple as it can be without implementing some sort of
> > > framework for "registering" delay(9) implementations and comparing
> > > them and selecting the "best" implementation.
> >
> > What about:
> >
> >             if (delay_func == NULL)
> >                     delay_func = lapic_delay;
> >
> > > I'm not sure I put the prototypes in the right headers.  We don't have
> > > a tsc.h but cpuvar.h looks sorta-correct for tsc_delay().
> >
> > I think cpuvar.h is fine since it has other TSC-related stuff.
> > However, with my suggestion above you can drop that.
> >
> > > FreeBSD's x86/delay.c may be of note:
> > >
> > > https://github.com/freebsd/freebsd/blob/ed96335a07b688c39e16db8856232e5840bc22ac/sys/x86/x86/delay.c
> > >
> > > Thoughts?
> > >
> > > Index: amd64/tsc.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> > > retrieving revision 1.20
> > > diff -u -p -r1.20 tsc.c
> > > --- amd64/tsc.c   23 Aug 2020 21:38:47 -0000      1.20
> > > +++ amd64/tsc.c   23 Aug 2020 22:59:25 -0000
> > > @@ -26,6 +26,7 @@
> > >
> > >  #include <machine/cpu.h>
> > >  #include <machine/cpufunc.h>
> > > +#include <machine/cpuvar.h>
> > >
> > >  #define RECALIBRATE_MAX_RETRIES          5
> > >  #define RECALIBRATE_SMI_THRESHOLD        50000
> > > @@ -252,7 +253,8 @@ tsc_timecounter_init(struct cpu_info *ci
> > >           tsc_timecounter.tc_quality = -1000;
> > >           tsc_timecounter.tc_user = 0;
> > >           tsc_is_invariant = 0;
> > > - }
> > > + } else
> > > +         delay_func = tsc_delay;
> > >
> > >   tc_init(&tsc_timecounter);
> > >  }
> > > @@ -342,4 +344,15 @@ tsc_sync_ap(struct cpu_info *ci)
> > >  {
> > >   tsc_post_ap(ci);
> > >   tsc_post_ap(ci);
> > > +}
> > > +
> > > +void
> > > +tsc_delay(int usecs)
> > > +{
> > > + uint64_t interval, start;
> > > +
> > > + interval = (uint64_t)usecs * tsc_frequency / 1000000;
> > > + start = rdtsc_lfence();
> > > + while (rdtsc_lfence() - start < interval)
> > > +         CPU_BUSY_CYCLE();
> > >  }
> > > Index: amd64/lapic.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
> > > retrieving revision 1.55
> > > diff -u -p -r1.55 lapic.c
> > > --- amd64/lapic.c 3 Aug 2019 14:57:51 -0000       1.55
> > > +++ amd64/lapic.c 23 Aug 2020 22:59:25 -0000
> > > @@ -41,6 +41,7 @@
> > >  #include <machine/codepatch.h>
> > >  #include <machine/cpu.h>
> > >  #include <machine/cpufunc.h>
> > > +#include <machine/cpuvar.h>
> > >  #include <machine/pmap.h>
> > >  #include <machine/mpbiosvar.h>
> > >  #include <machine/specialreg.h>
> > > @@ -569,7 +570,8 @@ skip_calibration:
> > >            * Now that the timer's calibrated, use the apic timer routines
> > >            * for all our timing needs..
> > >            */
> > > -         delay_func = lapic_delay;
> > > +         if (delay_func != tsc_delay)
> > > +                 delay_func = lapic_delay;
> > >           initclock_func = lapic_initclocks;
> > >   }
> > >  }
> > > Index: include/cpuvar.h
> > > ===================================================================
> > > RCS file: /cvs/src/sys/arch/amd64/include/cpuvar.h,v
> > > retrieving revision 1.10
> > > diff -u -p -r1.10 cpuvar.h
> > > --- include/cpuvar.h      9 Aug 2019 15:20:05 -0000       1.10
> > > +++ include/cpuvar.h      23 Aug 2020 22:59:25 -0000
> > > @@ -102,4 +102,6 @@ void tsc_sync_drift(int64_t);
> > >  void tsc_sync_bp(struct cpu_info *);
> > >  void tsc_sync_ap(struct cpu_info *);
> > >
> > > +void tsc_delay(int);
> > > +
> > >  #endif
> > > Index: include/i82489var.h
> > > ===================================================================
> > > RCS file: /cvs/src/sys/arch/amd64/include/i82489var.h,v
> > > retrieving revision 1.18
> > > diff -u -p -r1.18 i82489var.h
> > > --- include/i82489var.h   4 Oct 2018 05:00:40 -0000       1.18
> > > +++ include/i82489var.h   23 Aug 2020 22:59:26 -0000
> > > @@ -128,4 +128,6 @@ extern void lapic_calibrate_timer(struct
> > >  extern void lapic_startclock(void);
> > >  extern void lapic_initclocks(void);
> > >
> > > +extern void lapic_delay(int);
> > > +
> > >  #endif
> > >
> > >
> >
>

Reply via email to