On Fri, Aug 25, 2017 at 00:40 -0700, Mike Larkin wrote: > On Thu, Aug 24, 2017 at 12:39:33PM +0800, Adam Steen wrote: > > On Thu, Aug 24, 2017 at 2:35 AM, Mike Larkin <mlar...@azathoth.net> wrote: > > > On Wed, Aug 23, 2017 at 09:29:15PM +0800, Adam Steen wrote: > > >> > > >> Thank you Mike on the feedback on the last patch, please see the diff > > >> below, update with your input and style(9) > > >> > > >> I have continued to use tsc as my timecounter and /var/db/ntpd.driff > > >> has stayed under 10. > > >> > > >> cat /var/db/ntpd.drift > > >> 6.665 > > >> > > >> ntpctl -s all > > >> 4/4 peers valid, constraint offset -1s, clock synced, stratum 3 > > >> > > >> peer > > >> wt tl st next poll offset delay jitter > > >> 144.48.166.166 from pool pool.ntp.org > > >> 1 10 2 4s 32s -3.159ms 87.723ms 10.389ms > > >> 13.55.50.68 from pool pool.ntp.org > > >> 1 10 3 11s 32s -3.433ms 86.053ms 18.095ms > > >> 14.202.204.182 from pool pool.ntp.org > > >> 1 10 1 14s 32s 1.486ms 86.545ms 16.483ms > > >> 27.124.125.250 from pool pool.ntp.org > > >> * 1 10 2 12s 30s -10.275ms 54.156ms 70.389ms > > >> > > >> Cheers > > >> Adam > > > > > > IIRC you have an x220, right? > > > > > > If so, could you try letting the clock run for a bit (while using tsc > > > timecounter selection) after apm -L to drop the speed? (make sure > > > apm shows that it dropped). > > > > > > Even though my x230 supposedly has a constant/invar TSC (according to > > > cpuid), the TSC drops from 2.5GHz to 1.2GHz when apm -L runs, which > > > causes time to run too slowly when tsc is selected there. > > > > > > -ml > > > > > > > Yes, x220 > > (bios: LENOVO version "8DET69WW (1.39 )" date 07/18/2013) > > (cpu: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz, 2491.91 MHz) > > > > I took some measurements to before starting the test. > > > > note: the laptop has been up for a few days with apm -A set via > > rc.conf.local > > and sysctl kern.timecounter.hardware as tsc via sysctl.conf and mostly > > idle. > > > > cat /var/db/ntpd.drift > > 6.459 > > > > apm -v > > Battery state: high, 100% remaining, unknown life estimate > > A/C adapter state: connected > > Performance adjustment mode: auto (800 MHz) > > > > 6 hours ago i ran apm -L, verified it was running slowly (800 MHz), > > and got the following results > > > > The clock appears correct (comparing to other computers) > > > > apm -v > > Battery state: high, 100% remaining, unknown life estimate > > A/C adapter state: connected > > Performance adjustment mode: manual (800 MHz) > > > > cat /var/db/ntpd.drift > > 6.385 > > > > ntpctl -s all > > 4/4 peers valid, constraint offset 0s, clock synced, stratum 4 > > > > peer > > wt tl st next poll offset delay jitter > > 203.23.237.200 from pool pool.ntp.org > > 1 10 2 153s 1505s -25.546ms 73.450ms 2.644ms > > 203.114.73.24 from pool pool.ntp.org > > 1 10 2 253s 1560s -1.042ms 75.133ms 0.752ms > > 192.189.54.33 from pool pool.ntp.org > > * 1 10 2 204s 1558s 31.644ms 70.910ms 3.388ms > > 54.252.165.245 from pool pool.ntp.org > > 1 10 2 238s 1518s 0.146ms 73.005ms 2.025ms > > > > I will leave the laptop in lower power mode over the weekend and see > > what happens. > > > > No need, I think you've convinced me that it works :)
But does it actually work on x230 as well? I'm surprised to learn that you've observed TSC frequency change on Ivy Bridge. I was under impression that everything since at least Sandy Bridge (x220) has constant and invariant TSC as advertised. Adam, I've readjusted and simplified your diff a bit. The biggest change is that we can select the reference tc based on it's quality so there's no need to have acpitimer and acpihpet specific functions and variables. There's one big thing missing here: increasing the timecounter quality so that OS can pick it. Something like this: https://github.com/mbelop/src/commit/99d6ef3ae95bbd8ea93c27e0425eb65e5a3359a1 I'd say we should try getting this in after 6.3 unlock unless there are objections. Further cleanup and testing is welcome of course. diff --git sys/arch/amd64/amd64/acpi_machdep.c sys/arch/amd64/amd64/acpi_machdep.c index 17d8fb205ef..f632b838ec2 100644 --- sys/arch/amd64/amd64/acpi_machdep.c +++ sys/arch/amd64/amd64/acpi_machdep.c @@ -67,10 +67,19 @@ extern int acpi_savecpu(void) __returns_twice; #define ACPI_BIOS_RSDP_WINDOW_BASE 0xe0000 #define ACPI_BIOS_RSDP_WINDOW_SIZE 0x20000 u_int8_t *acpi_scan(struct acpi_mem_map *, paddr_t, size_t); +#define RECALIBRATE_MAX_RETRIES 5 +#define RECALIBRATE_SMI_THRESHOLD 50000 +#define RECALIBRATE_DELAY_THRESHOLD 20 + +struct timecounter *recalibrate_tc; + +uint64_t acpi_calibrate_tsc_freq(void); +int acpi_get_tsc_and_timecount(struct timecounter *, uint64_t *, uint64_t *); + int acpi_map(paddr_t pa, size_t len, struct acpi_mem_map *handle) { paddr_t pgpa = trunc_page(pa); paddr_t endpa = round_page(pa + len); @@ -438,10 +447,112 @@ acpi_resume_cpu(struct acpi_softc *sc) /* Re-initialise memory range handling on BSP */ if (mem_range_softc.mr_op != NULL) mem_range_softc.mr_op->initAP(&mem_range_softc); } +int +acpi_get_tsc_and_timecount(struct timecounter *tc, uint64_t *tsc, + uint64_t *count) +{ + uint64_t n, tsc1, tsc2; + int i; + + for (i = 0; i < RECALIBRATE_MAX_RETRIES; i++) { + tsc1 = rdtsc(); + n = (tc->tc_get_timecount(tc) & tc->tc_counter_mask); + tsc2 = rdtsc(); + + if ((tsc2 - tsc1) < RECALIBRATE_SMI_THRESHOLD) { + *count = n; + *tsc = tsc2; + return (0); + } + } + return 1; +} + +void +acpi_recalibrate_tsc(struct timecounter *tc) +{ + if (!recalibrate_tc || + recalibrate_tc->tc_quality < tc->tc_quality) { + recalibrate_tc = tc; + cpu_adjust_tsc_freq(&acpi_calibrate_tsc_freq); + } +} + +static inline uint64_t +acpi_calculate_tsc_freq(uint64_t tsc1, uint64_t tsc2, int usec) +{ + uint64_t delta; + + delta = (tsc2 - tsc1); + return delta * 1000000 / usec; +} + +static inline uint64_t +acpi_calculate_tc_delay(struct timecounter *tc, uint64_t count1, + uint64_t count2) +{ + uint64_t delta; + + if (count2 < count1) + count2 += tc->tc_counter_mask; + + delta = (count2 - count1); + return delta * 1000000 / tc->tc_frequency; +} + +uint64_t +acpi_calibrate_tsc_freq(void) +{ + struct timecounter *tc; + uint64_t count1, count2, frequency, min_freq, tsc1, tsc2; + u_long ef; + int delay_usec, i, err1, err2, usec; + + if (recalibrate_tc == NULL) + return 0; + + tc = recalibrate_tc; + + /* warmup the timers */ + for (i =0; i < 3; i++) { + (void)tc->tc_get_timecount(tc); + (void)rdtsc(); + } + + min_freq = ULLONG_MAX; + + delay_usec = 100000; + for (i = 0; i < 3; i++) { + ef = read_rflags(); + disable_intr(); + + err1 = acpi_get_tsc_and_timecount(tc, &tsc1, &count1); + delay(delay_usec); + err2 = acpi_get_tsc_and_timecount(tc, &tsc2, &count2); + + write_rflags(ef); + + if (err1 || err2) + continue; + + usec = acpi_calculate_tc_delay(tc, count1, count2); + + if ((usec < (delay_usec - RECALIBRATE_DELAY_THRESHOLD)) || + (usec > (delay_usec + RECALIBRATE_DELAY_THRESHOLD))) + continue; + + frequency = acpi_calculate_tsc_freq(tsc1, tsc2, usec); + + min_freq = ulmin(min_freq, frequency); + } + + return min_freq; +} + #ifdef MULTIPROCESSOR void acpi_sleep_mp(void) { int i; diff --git sys/arch/amd64/amd64/identcpu.c sys/arch/amd64/amd64/identcpu.c index a448b885ba7..ddd8557ad3c 100644 --- sys/arch/amd64/amd64/identcpu.c +++ sys/arch/amd64/amd64/identcpu.c @@ -61,10 +61,15 @@ u_int tsc_get_timecount(struct timecounter *tc); struct timecounter tsc_timecounter = { tsc_get_timecount, NULL, ~0u, 0, "tsc", -1000, NULL }; +u_int64_t amd64_tsc_freq; +int amd64_tsc_recalibrate; +u_int64_t (*amd64_tsc_get_freq)(); + +int amd64_has_invariant_tsc; int amd64_has_xcrypt; #ifdef CRYPTO int amd64_has_pclmul; int amd64_has_aesni; #endif @@ -426,12 +431,13 @@ cpu_tsc_freq_ctr(struct cpu_info *ci) } u_int64_t cpu_tsc_freq(struct cpu_info *ci) { - u_int64_t last_count, count; + u_int64_t last_count, count, sum; uint32_t eax, ebx, khz, dummy; + int i; if (!strcmp(cpu_vendor, "GenuineIntel") && cpuid_level >= 0x15) { eax = ebx = khz = dummy = 0; CPUID(0x15, eax, ebx, khz, dummy); @@ -466,23 +472,47 @@ cpu_tsc_freq(struct cpu_info *ci) count = cpu_tsc_freq_ctr(ci); if (count != 0) return (count); - last_count = rdtsc(); - delay(100000); - count = rdtsc(); + if (amd64_tsc_get_freq != NULL) + return (*amd64_tsc_get_freq)(); - return ((count - last_count) * 10); + if (ci->ci_flags & CPUF_PRIMARY) + amd64_tsc_recalibrate = 1; + + sum = 0; + for (i=0; i < 8; i++) { + last_count = rdtsc(); + delay(100000); + count = rdtsc(); + sum += (count - last_count); + } + + return ((sum >> 3) * 10); } u_int tsc_get_timecount(struct timecounter *tc) { return rdtsc(); } +void +cpu_adjust_tsc_freq(u_int64_t (*get_frequency)()) +{ + if (amd64_tsc_recalibrate) { + amd64_tsc_freq = (*get_frequency)(); + tsc_timecounter.tc_frequency = amd64_tsc_freq; + + printf("%s: updated TSC frequency %lld\n", + curcpu()->ci_dev->dv_xname, + tsc_timecounter.tc_frequency); + } else + amd64_tsc_get_freq = get_frequency; +} + void identifycpu(struct cpu_info *ci) { u_int32_t dummy, val; char mycpu_model[48]; @@ -564,13 +594,17 @@ identifycpu(struct cpu_info *ci) } /* Check if it's an invariant TSC */ if (cpu_apmi_edx & CPUIDEDX_ITSC) ci->ci_flags |= CPUF_INVAR_TSC; + + amd64_has_invariant_tsc = (ci->ci_flags & CPUF_INVAR_TSC) != 0; } ci->ci_tsc_freq = cpu_tsc_freq(ci); + if (ci->ci_flags & CPUF_PRIMARY) + amd64_tsc_freq = ci->ci_tsc_freq; amd_cpu_cacheinfo(ci); printf("%s: %s", ci->ci_dev->dv_xname, mycpu_model); diff --git sys/arch/amd64/amd64/machdep.c sys/arch/amd64/amd64/machdep.c index 937094504d1..479dd2f083c 100644 --- sys/arch/amd64/amd64/machdep.c +++ sys/arch/amd64/amd64/machdep.c @@ -423,10 +423,12 @@ bios_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, */ int cpu_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, size_t newlen, struct proc *p) { + extern u_int64_t amd64_tsc_freq; + extern int amd64_has_invariant_tsc; extern int amd64_has_xcrypt; dev_t consdev; dev_t dev; int val, error; @@ -494,10 +496,15 @@ cpu_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, error = sysctl_int(oldp, oldlenp, newp, newlen, &forceukbd); if (forceukbd) pckbc_release_console(); return (error); #endif + case CPU_TSCFREQ: + return (sysctl_rdquad(oldp, oldlenp, newp, amd64_tsc_freq)); + case CPU_INVARIANTTSC: + return (sysctl_rdint(oldp, oldlenp, newp, + amd64_has_invariant_tsc)); default: return (EOPNOTSUPP); } /* NOTREACHED */ } diff --git sys/arch/amd64/include/cpu.h sys/arch/amd64/include/cpu.h index 37e2b490eec..69f0d445772 100644 --- sys/arch/amd64/include/cpu.h +++ sys/arch/amd64/include/cpu.h @@ -425,11 +425,13 @@ void mp_setperf_init(void); #define CPU_CPUFEATURE 8 /* cpuid features */ #define CPU_KBDRESET 10 /* keyboard reset under pcvt */ #define CPU_XCRYPT 12 /* supports VIA xcrypt in userland */ #define CPU_LIDACTION 14 /* action caused by lid close */ #define CPU_FORCEUKBD 15 /* Force ukbd(4) as console keyboard */ -#define CPU_MAXID 16 /* number of valid machdep ids */ +#define CPU_TSCFREQ 16 /* tsc frequency */ +#define CPU_INVARIANTTSC 17 /* has invariant tsc */ +#define CPU_MAXID 18 /* number of valid machdep ids */ #define CTL_MACHDEP_NAMES { \ { 0, 0 }, \ { "console_device", CTLTYPE_STRUCT }, \ { "bios", CTLTYPE_INT }, \ @@ -444,10 +446,12 @@ void mp_setperf_init(void); { 0, 0 }, \ { "xcrypt", CTLTYPE_INT }, \ { 0, 0 }, \ { "lidaction", CTLTYPE_INT }, \ { "forceukbd", CTLTYPE_INT }, \ + { "tscfreq", CTLTYPE_QUAD }, \ + { "invarianttsc", CTLTYPE_INT }, \ } /* * Default cr4 flags. * Doesn't really belong here, but doesn't really belong anywhere else diff --git sys/arch/amd64/include/cpuvar.h sys/arch/amd64/include/cpuvar.h index 24fc8fe880d..edcf1223b82 100644 --- sys/arch/amd64/include/cpuvar.h +++ sys/arch/amd64/include/cpuvar.h @@ -94,7 +94,8 @@ void x86_ipi_init(int); #endif void identifycpu(struct cpu_info *); void cpu_init(struct cpu_info *); void cpu_init_first(void); +void cpu_adjust_tsc_freq(uint64_t (*)()); #endif diff --git sys/dev/acpi/acpihpet.c sys/dev/acpi/acpihpet.c index 17f5e59facf..d4935bd0b9d 100644 --- sys/dev/acpi/acpihpet.c +++ sys/dev/acpi/acpihpet.c @@ -262,10 +262,11 @@ acpihpet_attach(struct device *parent, struct device *self, void *aux) hpet_timecounter.tc_frequency = (u_int32_t)freq; hpet_timecounter.tc_priv = sc; hpet_timecounter.tc_name = sc->sc_dev.dv_xname; tc_init(&hpet_timecounter); + acpi_recalibrate_tsc(&hpet_timecounter); acpihpet_attached++; } u_int acpihpet_gettime(struct timecounter *tc) diff --git sys/dev/acpi/acpitimer.c sys/dev/acpi/acpitimer.c index fb78acf2564..f2a8810f1f3 100644 --- sys/dev/acpi/acpitimer.c +++ sys/dev/acpi/acpitimer.c @@ -93,10 +93,11 @@ acpitimerattach(struct device *parent, struct device *self, void *aux) if (psc->sc_fadt->flags & FADT_TMR_VAL_EXT) acpi_timecounter.tc_counter_mask = 0xffffffffU; acpi_timecounter.tc_priv = sc; acpi_timecounter.tc_name = sc->sc_dev.dv_xname; + acpi_recalibrate_tsc(&acpi_timecounter); tc_init(&acpi_timecounter); } u_int diff --git sys/dev/acpi/acpivar.h sys/dev/acpi/acpivar.h index 558ec12cd40..13d0db1abbf 100644 --- sys/dev/acpi/acpivar.h +++ sys/dev/acpi/acpivar.h @@ -23,10 +23,11 @@ #ifndef _ACPI_WAKECODE #include <sys/timeout.h> #include <sys/rwlock.h> +#include <sys/timetc.h> #include <machine/biosvar.h> #include "acpipwrres.h" /* #define ACPI_DEBUG */ @@ -322,10 +323,12 @@ void acpi_resume_pm(struct acpi_softc *, int); void acpi_resume_clocks(struct acpi_softc *); void acpi_resume_cpu(struct acpi_softc *); void acpi_resume_mp(void); void acpi_sleep_walk(struct acpi_softc *, int); +void acpi_recalibrate_tsc(struct timecounter *); +uint64_t acpi_calibrate_tsc_freq(); #define ACPI_IOREAD 0 #define ACPI_IOWRITE 1 void acpi_wakeup(void *);