On Wed, May 13, 2020 at 06:09:52PM +0200, Mark Kettenis wrote:
> > Date: Wed, 13 May 2020 16:55:24 +0100
> > From: Stuart Henderson <s...@spacehopper.org>
> > 
> > Thanks for looking at this Paul!
> > 
> > On 2020/05/13 17:15, Robert Nagy wrote:
> > > On 13/05/20 17:05 +0200, Mark Kettenis wrote:
> > > > > The update currently does the work of clock_gettime(), but it can
> > > > > probably be changed to only update the timehands and move the logic
> > > > > elsewhere. Note that if we expose only the timehands to userland, most
> > > > > of the bintime functionality has to also be made available there. Or 
> > > > > so
> > > > > I think.
> > > > 
> > > > Unfortunately what you're doing here isn't good enough.  You're only
> > > > exporting low-resolution versions of the clocks.  The equivalent of
> > > > what Linux class CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE.
> > > > And I'm fairly certain that isn't what the applications want.  Why
> > > > else would they be calling clock_gettime() a gazillion times per
> > > > second...
> > > 
> > > 
> > > Most of the big programs use CLOCK_MONOTONIC.
> > > 
> > 
> > Agreed.
> > 
> > Quick counts from a dumb search with codesearch.debian:
> > 
> > CLOCK_REALTIME_COARSE       376
> > CLOCK_MONOTONIC_COARSE      639
> > CLOCK_REALTIME              8756
> > CLOCK_MONOTONIC             10776
> > 
> > I have looked over ports source and almost everything I see prefers
> > CLOCK_MONOTONIC if available then falls back to CLOCK_REALTIME.
> > Occasionally you have things using only CLOCK_REALTIME but not many.
> > So I think it's fair to say most of the latter two are overlapping
> > cases.
> > 
> > In linux the vdso handles CLOCK_{REALTIME,MONOTONIC}{,_COARSE}.
> > Depending on the clock source it may still use syscalls though, people
> > got bitten by this on ec2 where some machine types default to a source
> > that still needed syscalls.
> 
> Given the shitshow that the TSC is on amd64, it is unavoidable that we
> end up in that same situation.  In theory we could support alternative
> hardware clocks as well by memory mapping the HPET read-only into
> userland, but that may have unintended side-effects.

The overhead of actually reading the HPET is larger than the overhead
of the syscall itself.  Ditto that for the ACPI timer.  At least on
amd64, this userland approach is only a net savings with the TSC and
all the complexity it entails to keep it monotonic.

Reading the TSC is only slightly more expensive than reading the low-res
timestamps we provide via e.g. getnanouptime(9).  Switch to the HPET and
you've lost your upside.

Attached is a patch to add CLOCK_MONOTONIC_COARSE/CLOCK_REALTIME_COARSE
to the system.

Below is a test program I use to fuss around with comparing timecounters.
I've added a -c flag to switch to the COARSE clocks.

Via SYS_clock_gettime() I'm seeing output like this:

$ doas sysctl kern.timecounter.hardware=tsc
kern.timecounter.hardware: tsc -> tsc
$ command time doas nice -n -20 ./gettime mono
        1.23 real         0.78 user         0.45 sys
$ command time doas nice -n -20 ./gettime -c mono
        1.18 real         0.74 user         0.45 sys
$ doas sysctl kern.timecounter.hardware=acpihpet0 
kern.timecounter.hardware: tsc -> acpihpet0
$ command time doas nice -n -20 ./gettime mono    
        4.73 real         0.82 user         3.91 sys
$ command time doas nice -n -20 ./gettime -c mono 
        1.19 real         0.73 user         0.46 sys

... so the clock_gettime(2) overhead as a syscall on my machine is
substantially smaller than the HPET overhead, unless you have a way to
make reading the HPET faster than it is right now.

--

#include <err.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

void usage(void);

int
main(int argc, char *argv[])
{
        struct timespec now;
        const char *clockname, *errstr;
        long long i, num;
        clockid_t clock;
        int ch, coarse;

        num = 1000000;
        coarse = 0;

        while ((ch = getopt(argc, argv, "c")) != -1) {
                switch (ch) {
                case 'c':
                        coarse = 1;
                        break;
                case 'n':
                        num = strtonum(optarg, 1, LLONG_MAX, &errstr);
                        if (errstr != NULL)
                                errx(1, "num is %s: %s", errstr, optarg);
                        break;
                default:
                        usage();
                }
        }
        argc -= optind;
        argv += optind;

        if (argc != 1)
                usage();

        clockname = argv[0];

        if (strcmp(clockname, "mono") == 0)
                clock = (coarse) ? CLOCK_MONOTONIC_COARSE : CLOCK_MONOTONIC;
        else if (strcmp(clockname, "real") == 0)
                clock = (coarse) ? CLOCK_REALTIME_COARSE : CLOCK_REALTIME;
        else
                errx(1, "invalid clock: %s", clockname);

        for (i = 0; i < num; i++)
                clock_gettime(clock, &now);

        return 0;
}

void
usage(void)
{
        fprintf(stderr, "usage: %s [-c] [-n num] mono | real\n", getprogname());
        exit(1);
}

--

Index: kern/kern_time.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_time.c,v
retrieving revision 1.127
diff -u -p -r1.127 kern_time.c
--- kern/kern_time.c    20 Mar 2020 04:08:25 -0000      1.127
+++ kern/kern_time.c    13 May 2020 19:17:34 -0000
@@ -114,6 +114,9 @@ clock_gettime(struct proc *p, clockid_t 
        case CLOCK_REALTIME:
                nanotime(tp);
                break;
+       case CLOCK_REALTIME_COARSE:
+               getnanotime(tp);
+               break;
        case CLOCK_UPTIME:
                binuptime(&bt);
                bintimesub(&bt, &naptime, &bt);
@@ -122,6 +125,9 @@ clock_gettime(struct proc *p, clockid_t 
        case CLOCK_MONOTONIC:
        case CLOCK_BOOTTIME:
                nanouptime(tp);
+               break;
+       case CLOCK_MONOTONIC_COARSE:
+               getnanouptime(tp);
                break;
        case CLOCK_PROCESS_CPUTIME_ID:
                nanouptime(tp);
Index: sys/_time.h
===================================================================
RCS file: /cvs/src/sys/sys/_time.h,v
retrieving revision 1.9
diff -u -p -r1.9 _time.h
--- sys/_time.h 18 Dec 2017 05:51:53 -0000      1.9
+++ sys/_time.h 13 May 2020 19:17:34 -0000
@@ -38,6 +38,8 @@
 #define CLOCK_THREAD_CPUTIME_ID                4
 #define CLOCK_UPTIME                   5
 #define CLOCK_BOOTTIME                 6
+#define CLOCK_REALTIME_COARSE          7
+#define CLOCK_MONOTONIC_COARSE         8
 
 #if __BSD_VISIBLE
 /*

Reply via email to