On Tue, Jul 31, 2018 at 2:22 PM, Thomas Gleixner <[email protected]> wrote: > On Mon, 23 Jul 2018, Andy Lutomirski wrote: >> On 07/23/2018 05:55 AM, Fenghua Yu wrote: >> > static void __init init_vdso_funcs_data(void) >> > { >> > + struct system_counterval_t sys_counterval; >> > + >> > if (static_cpu_has(X86_FEATURE_MOVDIRI)) >> > vdso_funcs_data.movdiri_supported = true; >> > if (static_cpu_has(X86_FEATURE_MOVDIR64B)) >> > vdso_funcs_data.movdir64b_supported = true; >> > + if (static_cpu_has(X86_FEATURE_WAITPKG)) >> > + vdso_funcs_data.waitpkg_supported = true; >> > + if (static_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) { >> > + vdso_funcs_data.tsc_known_freq = true; >> > + sys_counterval = convert_art_ns_to_tsc(1); >> > + vdso_funcs_data.tsc_per_nsec = sys_counterval.cycles; >> > + } >> >> You're losing a ton of precision here. You might even be losing *all* of the >> precision and malfunctioning rather badly. > > Indeed. > >> The correct way to do this is: >> >> tsc_counts = ns * mul >> shift; > >> and the vclock code illustrates it. > > Albeit you cannot use the TSC mult/shift pair as that is for the TSC to > nsec conversion. > > To get the proper mult/shift pair use clocks_calc_mult_shift(). Note that > the scaled math has an upper limit when using 64 bit variables. You might > need 128bit scaled math to make it work correctly. > >> convert_art_ns_to_tsc() is a bad example because it uses an expensive >> division operation for no good reason except that no one bothered to >> optimize it. > > Right. It's not a hot path function and it does the job and we would need > 128bit scaled math to avoid mult overflows. > > Aside of that I have no idea why anyone would use convert_art_ns_to_tsc() > for anything else than converting art to nsecs. > >> > +notrace int __vdso_umwait(int state, unsigned long nsec) >> >> __vdso_umwait_relative(), please. Because some day (possibly soon) someone >> will want __vdso_umwait_absolute() and its friend __vdso_read_art_ns() so >> they >> can do: >> >> u64 start = __vdso_read_art_ns(); > > Errm. No. You can't read ART. ART is only used by decives to which it is > distributed. You can only read TSC here and convert that to nsecs.
Bah. But my point remains -- I think that the user (non-vDSO) code should think in nanoseconds, not TSC ticks. That we have have a much better chance of getting migration right. > >> __vdso_umonitor(...); >> ... do something potentially slow or that might fault ... >> __vdso_umwait_absolute(start + timeout); > > That definitely requires 128bit scaled math to work correctly, unless you > make the timeout relative before conversion. > > But I really think we should avoid creating yet another interface to > retrieve TSC time in nsecs. We have enough of these things already. > > Ideally we'd use CLOCK_MONOTONIC here, but that needs more thought as: > > 1) TSC might be disabled as the timekeeping clocksource > > 2) The mult/shift pair for converting to nanoseconds is affected by > NTP/PTP so it can be different from the initial mult/shift pair for > converting nanoseconds to TSC. > > A possible solution would be to use CLOCK_MOTONIC_RAW which is not affected > by NTP/PTP adjustments. But that still has the issue of TSC not being the > timekeeping clocksource. Bah, the whole TSC deadline mode sucks. I have no > idea what's wrong with simple down counters. They Just Work. I think it's not totally crazy to declare UMWAIT on a system with a non-TSC clocksource to be unsupported.

