Re: userland clock_gettime proof of concept
În 11 iulie 2020 02:27:50 EEST, Mark Kettenis a scris: >> From p...@irofti.net Sat Jul 11 01:23:20 2020 >> Date: Sat, 11 Jul 2020 02:22:33 +0300 >> >> În 11 iulie 2020 02:15:27 EEST, Mark Kettenis > a scris: >> >> Date: Fri, 10 Jul 2020 19:03:58 -0400 >> >> From: George Koehler >> >> >> >> On Wed, 8 Jul 2020 14:26:02 +0200 (CEST) >> >> Mark Kettenis wrote: >> >> >> >> > > From: Paul Irofti >> >> > > Reads OK to me. Please make the adjustments to static >functions >> >that >> >> > > kettenis@ mentioned in the alpha thread. >> >> > >> >> > To add to that: >> >> > >> >> > * TC_LAST isn't needed, so kill that >> >> > * tc_get_timecount >> >> > >> >> > Also in the sparc64 I did an exact copy of the kernel >> >implementation >> >> > of the functions to read the counter. I only made them static >> >inline. >> >> > That makes it easier to verify that they are indeed identical. >> >> >> >> Here is the diff for macppc after I drop TC_LAST, recopy usertc.c >> >from >> >> amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from >> >> /sys/arch/powerpc/include/cpu.h >> >> >> >> OK to commit? >> >> >> >> Index: lib/libc/arch/powerpc/gen/usertc.c >> >> >=== >> >> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v >> >> retrieving revision 1.1 >> >> diff -u -p -r1.1 usertc.c >> >> --- lib/libc/arch/powerpc/gen/usertc.c6 Jul 2020 13:33:05 >- 1.1 >> >> +++ lib/libc/arch/powerpc/gen/usertc.c9 Jul 2020 21:41:47 - >> >> @@ -1,4 +1,4 @@ >> >> -/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ >> >> */ >> >> +/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */ >> >> /* >> >> * Copyright (c) 2020 Paul Irofti >> >> * >> >> @@ -18,4 +18,24 @@ >> >> #include >> >> #include >> >> >> >> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = >NULL; >> >> +static __inline u_int32_t >> >> +ppc_mftbl (void) >> >> +{ >> >> + int ret; >> >> + __asm volatile ("mftb %0" : "=r" (ret)); >> >> + return ret; >> >> +} >> >> + >> >> +static int >> > >> >That should be u_int. I now see that this is broken in the amd64 >> >version as well. >> >> I don't think this should be u_int. Can you explain why? It is the >> function error status and can return a negative value. It is not the >> tc. > >Ugh, you're right. Brainfart. Time to get some zzz. > >Diff is ok kettenis@ as-is. Heh, no worries, happens to the best of us. OK pirofti@ Zzz time for me as well. > >> >> +tc_get_timecount(struct timekeep *tk, u_int *tc) >> >> +{ >> >> + switch (tk->tk_user) { >> >> + case TC_TB: >> >> + *tc = ppc_mftbl(); >> >> + return 0; >> >> + } >> >> + >> >> + return -1; >> >> +} >> >> + >> >> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = >> >tc_get_timecount; >> >> Index: sys/arch/macppc/include/timetc.h >> >> >=== >> >> RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v >> >> retrieving revision 1.1 >> >> diff -u -p -r1.1 timetc.h >> >> --- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 - >> >> 1.1 >> >> +++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 - >> >> @@ -18,6 +18,6 @@ >> >> #ifndef _MACHINE_TIMETC_H_ >> >> #define _MACHINE_TIMETC_H_ >> >> >> >> -#define TC_LAST 0 >> >> +#define TC_TB 1 >> >> >> >> #endif /* _MACHINE_TIMETC_H_ */ >> >> Index: sys/arch/macppc/macppc/clock.c >> >> >=== >> >> RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v >> >> retrieving revision 1.44 >> >> diff -u -p -r1.44 clock.c >> >> --- sys/arch/macppc/macppc/clock.c6 Jul 2020 13:33:08 - >> >> 1.44 >> >> +++ sys/arch/macppc/macppc/clock.c9 Jul 2020 21:41:48 - >> >> @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320; >> >> static int32_t ticks_per_intr; >> >> >> >> static struct timecounter tb_timecounter = { >> >> - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 >> >> + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB >> >> }; >> >> >> >> /* calibrate the timecounter frequency for the listed models */ >> >> >>
Re: userland clock_gettime proof of concept
> From p...@irofti.net Sat Jul 11 01:23:20 2020 > Date: Sat, 11 Jul 2020 02:22:33 +0300 > > În 11 iulie 2020 02:15:27 EEST, Mark Kettenis a > scris: > >> Date: Fri, 10 Jul 2020 19:03:58 -0400 > >> From: George Koehler > >> > >> On Wed, 8 Jul 2020 14:26:02 +0200 (CEST) > >> Mark Kettenis wrote: > >> > >> > > From: Paul Irofti > >> > > Reads OK to me. Please make the adjustments to static functions > >that > >> > > kettenis@ mentioned in the alpha thread. > >> > > >> > To add to that: > >> > > >> > * TC_LAST isn't needed, so kill that > >> > * tc_get_timecount > >> > > >> > Also in the sparc64 I did an exact copy of the kernel > >implementation > >> > of the functions to read the counter. I only made them static > >inline. > >> > That makes it easier to verify that they are indeed identical. > >> > >> Here is the diff for macppc after I drop TC_LAST, recopy usertc.c > >from > >> amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from > >> /sys/arch/powerpc/include/cpu.h > >> > >> OK to commit? > >> > >> Index: lib/libc/arch/powerpc/gen/usertc.c > >> === > >> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v > >> retrieving revision 1.1 > >> diff -u -p -r1.1 usertc.c > >> --- lib/libc/arch/powerpc/gen/usertc.c 6 Jul 2020 13:33:05 - > >> 1.1 > >> +++ lib/libc/arch/powerpc/gen/usertc.c 9 Jul 2020 21:41:47 - > >> @@ -1,4 +1,4 @@ > >> -/*$OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ > >> */ > >> +/*$OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */ > >> /* > >> * Copyright (c) 2020 Paul Irofti > >> * > >> @@ -18,4 +18,24 @@ > >> #include > >> #include > >> > >> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; > >> +static __inline u_int32_t > >> +ppc_mftbl (void) > >> +{ > >> + int ret; > >> + __asm volatile ("mftb %0" : "=r" (ret)); > >> + return ret; > >> +} > >> + > >> +static int > > > >That should be u_int. I now see that this is broken in the amd64 > >version as well. > > I don't think this should be u_int. Can you explain why? It is the > function error status and can return a negative value. It is not the > tc. Ugh, you're right. Brainfart. Time to get some zzz. Diff is ok kettenis@ as-is. > >> +tc_get_timecount(struct timekeep *tk, u_int *tc) > >> +{ > >> + switch (tk->tk_user) { > >> + case TC_TB: > >> + *tc = ppc_mftbl(); > >> + return 0; > >> + } > >> + > >> + return -1; > >> +} > >> + > >> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = > >tc_get_timecount; > >> Index: sys/arch/macppc/include/timetc.h > >> === > >> RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v > >> retrieving revision 1.1 > >> diff -u -p -r1.1 timetc.h > >> --- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 - > >> 1.1 > >> +++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 - > >> @@ -18,6 +18,6 @@ > >> #ifndef _MACHINE_TIMETC_H_ > >> #define _MACHINE_TIMETC_H_ > >> > >> -#define TC_LAST 0 > >> +#define TC_TB 1 > >> > >> #endif/* _MACHINE_TIMETC_H_ */ > >> Index: sys/arch/macppc/macppc/clock.c > >> === > >> RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v > >> retrieving revision 1.44 > >> diff -u -p -r1.44 clock.c > >> --- sys/arch/macppc/macppc/clock.c 6 Jul 2020 13:33:08 - 1.44 > >> +++ sys/arch/macppc/macppc/clock.c 9 Jul 2020 21:41:48 - > >> @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320; > >> static int32_t ticks_per_intr; > >> > >> static struct timecounter tb_timecounter = { > >> - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 > >> + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB > >> }; > >> > >> /* calibrate the timecounter frequency for the listed models */ > >> >
Re: userland clock_gettime proof of concept
În 11 iulie 2020 02:15:27 EEST, Mark Kettenis a scris: >> Date: Fri, 10 Jul 2020 19:03:58 -0400 >> From: George Koehler >> >> On Wed, 8 Jul 2020 14:26:02 +0200 (CEST) >> Mark Kettenis wrote: >> >> > > From: Paul Irofti >> > > Reads OK to me. Please make the adjustments to static functions >that >> > > kettenis@ mentioned in the alpha thread. >> > >> > To add to that: >> > >> > * TC_LAST isn't needed, so kill that >> > * tc_get_timecount >> > >> > Also in the sparc64 I did an exact copy of the kernel >implementation >> > of the functions to read the counter. I only made them static >inline. >> > That makes it easier to verify that they are indeed identical. >> >> Here is the diff for macppc after I drop TC_LAST, recopy usertc.c >from >> amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from >> /sys/arch/powerpc/include/cpu.h >> >> OK to commit? >> >> Index: lib/libc/arch/powerpc/gen/usertc.c >> === >> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v >> retrieving revision 1.1 >> diff -u -p -r1.1 usertc.c >> --- lib/libc/arch/powerpc/gen/usertc.c 6 Jul 2020 13:33:05 - >> 1.1 >> +++ lib/libc/arch/powerpc/gen/usertc.c 9 Jul 2020 21:41:47 - >> @@ -1,4 +1,4 @@ >> -/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ */ >> +/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */ >> /* >> * Copyright (c) 2020 Paul Irofti >> * >> @@ -18,4 +18,24 @@ >> #include >> #include >> >> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; >> +static __inline u_int32_t >> +ppc_mftbl (void) >> +{ >> +int ret; >> +__asm volatile ("mftb %0" : "=r" (ret)); >> +return ret; >> +} >> + >> +static int > >That should be u_int. I now see that this is broken in the amd64 >version as well. I don't think this should be u_int. Can you explain why? It is the function error status and can return a negative value. It is not the tc. > >Otherwise this is ok kettenis@ > >> +tc_get_timecount(struct timekeep *tk, u_int *tc) >> +{ >> +switch (tk->tk_user) { >> +case TC_TB: >> +*tc = ppc_mftbl(); >> +return 0; >> +} >> + >> +return -1; >> +} >> + >> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = >tc_get_timecount; >> Index: sys/arch/macppc/include/timetc.h >> === >> RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v >> retrieving revision 1.1 >> diff -u -p -r1.1 timetc.h >> --- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 - 1.1 >> +++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 - >> @@ -18,6 +18,6 @@ >> #ifndef _MACHINE_TIMETC_H_ >> #define _MACHINE_TIMETC_H_ >> >> -#define TC_LAST 0 >> +#define TC_TB 1 >> >> #endif /* _MACHINE_TIMETC_H_ */ >> Index: sys/arch/macppc/macppc/clock.c >> === >> RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v >> retrieving revision 1.44 >> diff -u -p -r1.44 clock.c >> --- sys/arch/macppc/macppc/clock.c 6 Jul 2020 13:33:08 - 1.44 >> +++ sys/arch/macppc/macppc/clock.c 9 Jul 2020 21:41:48 - >> @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320; >> static int32_t ticks_per_intr; >> >> static struct timecounter tb_timecounter = { >> -tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 >> +tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB >> }; >> >> /* calibrate the timecounter frequency for the listed models */ >>
Re: userland clock_gettime proof of concept
> Date: Fri, 10 Jul 2020 19:03:58 -0400 > From: George Koehler > > On Wed, 8 Jul 2020 14:26:02 +0200 (CEST) > Mark Kettenis wrote: > > > > From: Paul Irofti > > > Reads OK to me. Please make the adjustments to static functions that > > > kettenis@ mentioned in the alpha thread. > > > > To add to that: > > > > * TC_LAST isn't needed, so kill that > > * tc_get_timecount > > > > Also in the sparc64 I did an exact copy of the kernel implementation > > of the functions to read the counter. I only made them static inline. > > That makes it easier to verify that they are indeed identical. > > Here is the diff for macppc after I drop TC_LAST, recopy usertc.c from > amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from > /sys/arch/powerpc/include/cpu.h > > OK to commit? > > Index: lib/libc/arch/powerpc/gen/usertc.c > === > RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v > retrieving revision 1.1 > diff -u -p -r1.1 usertc.c > --- lib/libc/arch/powerpc/gen/usertc.c6 Jul 2020 13:33:05 - > 1.1 > +++ lib/libc/arch/powerpc/gen/usertc.c9 Jul 2020 21:41:47 - > @@ -1,4 +1,4 @@ > -/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ */ > +/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */ > /* > * Copyright (c) 2020 Paul Irofti > * > @@ -18,4 +18,24 @@ > #include > #include > > -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; > +static __inline u_int32_t > +ppc_mftbl (void) > +{ > + int ret; > + __asm volatile ("mftb %0" : "=r" (ret)); > + return ret; > +} > + > +static int That should be u_int. I now see that this is broken in the amd64 version as well. Otherwise this is ok kettenis@ > +tc_get_timecount(struct timekeep *tk, u_int *tc) > +{ > + switch (tk->tk_user) { > + case TC_TB: > + *tc = ppc_mftbl(); > + return 0; > + } > + > + return -1; > +} > + > +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = > tc_get_timecount; > Index: sys/arch/macppc/include/timetc.h > === > RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v > retrieving revision 1.1 > diff -u -p -r1.1 timetc.h > --- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 - 1.1 > +++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 - > @@ -18,6 +18,6 @@ > #ifndef _MACHINE_TIMETC_H_ > #define _MACHINE_TIMETC_H_ > > -#define TC_LAST 0 > +#define TC_TB 1 > > #endif /* _MACHINE_TIMETC_H_ */ > Index: sys/arch/macppc/macppc/clock.c > === > RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v > retrieving revision 1.44 > diff -u -p -r1.44 clock.c > --- sys/arch/macppc/macppc/clock.c6 Jul 2020 13:33:08 - 1.44 > +++ sys/arch/macppc/macppc/clock.c9 Jul 2020 21:41:48 - > @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320; > static int32_t ticks_per_intr; > > static struct timecounter tb_timecounter = { > - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 > + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB > }; > > /* calibrate the timecounter frequency for the listed models */ >
Re: userland clock_gettime proof of concept
On Wed, 8 Jul 2020 14:26:02 +0200 (CEST) Mark Kettenis wrote: > > From: Paul Irofti > > Reads OK to me. Please make the adjustments to static functions that > > kettenis@ mentioned in the alpha thread. > > To add to that: > > * TC_LAST isn't needed, so kill that > * tc_get_timecount > > Also in the sparc64 I did an exact copy of the kernel implementation > of the functions to read the counter. I only made them static inline. > That makes it easier to verify that they are indeed identical. Here is the diff for macppc after I drop TC_LAST, recopy usertc.c from amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from /sys/arch/powerpc/include/cpu.h OK to commit? Index: lib/libc/arch/powerpc/gen/usertc.c === RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v retrieving revision 1.1 diff -u -p -r1.1 usertc.c --- lib/libc/arch/powerpc/gen/usertc.c 6 Jul 2020 13:33:05 - 1.1 +++ lib/libc/arch/powerpc/gen/usertc.c 9 Jul 2020 21:41:47 - @@ -1,4 +1,4 @@ -/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ */ +/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */ /* * Copyright (c) 2020 Paul Irofti * @@ -18,4 +18,24 @@ #include #include -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; +static __inline u_int32_t +ppc_mftbl (void) +{ + int ret; + __asm volatile ("mftb %0" : "=r" (ret)); + return ret; +} + +static int +tc_get_timecount(struct timekeep *tk, u_int *tc) +{ + switch (tk->tk_user) { + case TC_TB: + *tc = ppc_mftbl(); + return 0; + } + + return -1; +} + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = tc_get_timecount; Index: sys/arch/macppc/include/timetc.h === RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v retrieving revision 1.1 diff -u -p -r1.1 timetc.h --- sys/arch/macppc/include/timetc.h6 Jul 2020 13:33:07 - 1.1 +++ sys/arch/macppc/include/timetc.h9 Jul 2020 21:41:48 - @@ -18,6 +18,6 @@ #ifndef _MACHINE_TIMETC_H_ #define _MACHINE_TIMETC_H_ -#defineTC_LAST 0 +#defineTC_TB 1 #endif /* _MACHINE_TIMETC_H_ */ Index: sys/arch/macppc/macppc/clock.c === RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v retrieving revision 1.44 diff -u -p -r1.44 clock.c --- sys/arch/macppc/macppc/clock.c 6 Jul 2020 13:33:08 - 1.44 +++ sys/arch/macppc/macppc/clock.c 9 Jul 2020 21:41:48 - @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320; static int32_t ticks_per_intr; static struct timecounter tb_timecounter = { - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB }; /* calibrate the timecounter frequency for the listed models */
Re: userland clock_gettime proof of concept
On Wed, 8 Jul 2020 11:32:09 -0500 Scott Cheloha wrote: > On Wed, Jul 08, 2020 at 09:36:03AM -0600, Theo de Raadt wrote: > > Ugly test program (not showing it) seems to show syncronized clocks on > > powerpc, or at least closely sycronized. It might be one register. > > I guess I'll share mine, then. > > With this program I can consistently catch an offset of -2 to -1 on my > G5. Such a small offset is probably fine, right? I'm curious if it > gets any larger on other machines. /sys/arch/macppc/macppc/cpu.c has code to "Sync timebase" of the cpus. One cpu does h->tb = ppc_mftb() + 10; other does ppc_mttb(h->tb). Your -2 to -1 implies that time went backwards! I don't like this, but time also goes backwards in OpenBSD/amd64 and in other systems. (Microsoft Docs, "Acquiring high-resolution time stamps", says that stamps "from different threads", within a small range of uncertainty, have "ambiguous ordering".)
Re: userland clock_gettime proof of concept
On Wed, Jul 08, 2020 at 09:36:03AM -0600, Theo de Raadt wrote: > Christian Weisgerber wrote: > > > On 2020-06-26, George Koehler wrote: > > > > > Here's macppc again. My macppc isn't using your newest diff but does > > > now need to define TC_TB in . > > > > Crucial question: How confident are we that TB is in sync on > > multiprocessor machines? > > Ugly test program (not showing it) seems to show syncronized clocks on > powerpc, or at least closely sycronized. It might be one register. I guess I'll share mine, then. With this program I can consistently catch an offset of -2 to -1 on my G5. Such a small offset is probably fine, right? I'm curious if it gets any larger on other machines. peanut$ cc -o tb-sync-check tb-sync-check.c -lpthread peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 91661686941 global 91661686943 (-2) peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 91721148999 global 91721149000 (-1) peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 91773944538 global 91773944539 (-1) peanut$ ./tb-sync-check tb-sync-check: T2: regression: local 91839284894 global 91839284895 (-1) peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 91885196234 global 91885196235 (-1) peanut$ ./tb-sync-check tb-sync-check: T2: regression: local 92460898355 global 92460898356 (-1) peanut$ ./tb-sync-check tb-sync-check: T2: regression: local 92553793423 global 92553793424 (-1) peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 92602154422 global 92602154423 (-1) peanut$ ./tb-sync-check tb-sync-check: T1: regression: local 92643366617 global 92643366619 (-2) $ $ cat tb-sync-check.c #include #include #include #include #include #include uint64_t tb_global_max; pthread_mutex_t tb_mtx = PTHREAD_MUTEX_INITIALIZER; volatile sig_atomic_t uninterrupted = 1; void handle_int(int); uint64_t mftb(void); void *tb_sync_test(void *); int main(int argc, char *argv[]) { pthread_t thread[2]; sigset_t empty; int error, i; signal(SIGINT, handle_int); tb_global_max = mftb(); for (i = 0; i < nitems(thread); i++) { error = pthread_create(&thread[i], NULL, tb_sync_test, (void *)(i + 1)); if (error) errc(1, error, "pthread_create"); } sigemptyset(&empty); sigsuspend(&empty); for (i = 0; i < nitems(thread); i++) { error = pthread_join(thread[i], NULL); if (error) errc(1, error, "pthread_join"); } return 0; } void handle_int(int signo) { uninterrupted = 0; } /* * Cribbed from sys/arch/powerpc/include/cpu.h. * * Hopefully it's correct. */ uint64_t mftb(void) { u_long scratch; u_int64_t tb; asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;" " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch)); return tb; } void * tb_sync_test(void *id) { uint64_t tb_local; unsigned int thread_id = (unsigned int)id; while (uninterrupted) { pthread_mutex_lock(&tb_mtx); tb_local = mftb(); /* * Does the local TimeBase lag behind the global maximum? */ if (tb_local < tb_global_max) { errx(1, "T%u: regression: local %llu global %llu (%lld)", thread_id, (unsigned long long)tb_local, (unsigned long long)tb_global_max, (long long)(tb_local - tb_global_max)); } /* * Update the global maximum. */ tb_global_max = tb_local; pthread_mutex_unlock(&tb_mtx); } return NULL; }
Re: userland clock_gettime proof of concept
Christian Weisgerber wrote: > On 2020-06-26, George Koehler wrote: > > > Here's macppc again. My macppc isn't using your newest diff but does > > now need to define TC_TB in . > > Crucial question: How confident are we that TB is in sync on > multiprocessor machines? Ugly test program (not showing it) seems to show syncronized clocks on powerpc, or at least closely sycronized. It might be one register.
Re: userland clock_gettime proof of concept
Christian Weisgerber wrote: > On 2020-06-26, George Koehler wrote: > > > Here's macppc again. My macppc isn't using your newest diff but does > > now need to define TC_TB in . > > Crucial question: How confident are we that TB is in sync on > multiprocessor machines? A few small test programs exposed the problem on alpha for me. This one identified the problem in a glaring fashion. The upper word of the register is per-cpu, and it toggles back and forth upon each nanosleep call, as we return to a different cpu. (Quite obviously the scheduler has zero afinity) A slightly different test program was able to identify occasions when returning to the other cpu would return a clock value which was earlier. A variation of test_time_skew.c, I don't have it handy. Maybe this can be quickly convered for macppc, to test? I don't understand yet how to fix alpha. It looks like we need a mechanism to know what cpu# we are on, and a table for the variation. (The upper word of the alpha clock register assists with profiling it seems, so I don't think we can put a delta there) #include #include #include main() { struct timespec s = {0, 0}; long times[40]; volatile unsigned long t; int i; for (i = 0; i < sizeof times / sizeof times[0]; i++) { __asm volatile("rpcc %0" : "=r" (t)); times[i] = t; nanosleep(&s, NULL); } for (i = 0; i < sizeof times / sizeof times[0]; i++) { t = times[i]; printf("%lx %x\n", (t >> 32) & 0x, t & 0x); } }
Re: userland clock_gettime proof of concept
On 2020-06-26, George Koehler wrote: > Here's macppc again. My macppc isn't using your newest diff but does > now need to define TC_TB in . Crucial question: How confident are we that TB is in sync on multiprocessor machines? -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
> From: Paul Irofti > Date: Wed, 8 Jul 2020 12:05:04 +0300 > > On 2020-06-26 06:22, George Koehler wrote: > > On Mon, 22 Jun 2020 19:12:22 +0300 > > Paul Irofti wrote: > > > >> New iteration: > >> > >>- ps_timekeep should not coredump, pointed by deraadt@ > >>- set ps_timekeep to 0 before user uvm_map for randomization > >>- map timekeep before fixup. confirmed by naddy@ that it fixes NULL init > >>- initialize va. clarified by kettenis@ > > > > Here's macppc again. My macppc isn't using your newest diff but does > > now need to define TC_TB in . > > > > The /sys/arch/powerpc/include/timetc.h in your diff never gets used, > > because there is no #include . On macppc, > >uname -m => macppc and > >uname -p => powerpcare different, > > and #include is /sys/arch/macppc/include/timetc.h. > > I suspect that is /sys/arch/$i/include/timetc.h > > if and only if /sys/arch/$i/compile exists. > > > > 10 days ago, naddy said, "You only need the lower register." That is > > correct, so this diff also stops using mftbu (the higher register). > > Reads OK to me. Please make the adjustments to static functions that > kettenis@ mentioned in the alpha thread. To add to that: * TC_LAST isn't needed, so kill that * tc_get_timecount Also in the sparc64 I did an exact copy of the kernel implementation of the functions to read the counter. I only made them static inline. That makes it easier to verify that they are indeed identical. > > --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 16:42:36 2020 > > +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun 24 16:46:00 2020 > > @@ -18,4 +18,17 @@ > > #include > > #include > > > > -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; > > +int > > +tc_get_timecount(struct timekeep *tk, u_int *tc) > > +{ > > + u_int tb; > > + > > + if (tk->tk_user != TC_TB) > > + return -1; > > + > > + asm volatile("mftb %0" : "=r"(tb)); > > + *tc = tb; > > + return 0; > > +} > > +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc) > > + = tc_get_timecount; > > --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 16:36:03 2020 > > +++ sys/arch/macppc/include/timetc.hWed Jun 24 16:37:47 2020 > > @@ -18,6 +18,7 @@ > > #ifndef _MACHINE_TIMETC_H_ > > #define _MACHINE_TIMETC_H_ > > > > -#defineTC_LAST 0 > > +#defineTC_TB 1 > > +#defineTC_LAST 2 > > > > #endif/* _MACHINE_TIMETC_H_ */ > > --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 2020 > > +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 2020 > > @@ -57,7 +57,7 @@ > > static int32_t ticks_per_intr; > > > > static struct timecounter tb_timecounter = { > > - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 > > + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB > > }; > > > > /* calibrate the timecounter frequency for the listed models */ > > > >
Re: userland clock_gettime proof of concept
On 2020-07-08 01:09, Theo de Raadt wrote: The /sys/arch/powerpc/include/timetc.h in your diff never gets used, because there is no #include . On macppc, I am fixing this issue for all the architectures, just being careful by doing builds first. Thank you for handling this.
Re: userland clock_gettime proof of concept
On 2020-06-26 06:22, George Koehler wrote: On Mon, 22 Jun 2020 19:12:22 +0300 Paul Irofti wrote: New iteration: - ps_timekeep should not coredump, pointed by deraadt@ - set ps_timekeep to 0 before user uvm_map for randomization - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init - initialize va. clarified by kettenis@ Here's macppc again. My macppc isn't using your newest diff but does now need to define TC_TB in . The /sys/arch/powerpc/include/timetc.h in your diff never gets used, because there is no #include . On macppc, uname -m => macppc and uname -p => powerpcare different, and #include is /sys/arch/macppc/include/timetc.h. I suspect that is /sys/arch/$i/include/timetc.h if and only if /sys/arch/$i/compile exists. 10 days ago, naddy said, "You only need the lower register." That is correct, so this diff also stops using mftbu (the higher register). Reads OK to me. Please make the adjustments to static functions that kettenis@ mentioned in the alpha thread. --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 16:42:36 2020 +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun 24 16:46:00 2020 @@ -18,4 +18,17 @@ #include #include -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; +int +tc_get_timecount(struct timekeep *tk, u_int *tc) +{ + u_int tb; + + if (tk->tk_user != TC_TB) + return -1; + + asm volatile("mftb %0" : "=r"(tb)); + *tc = tb; + return 0; +} +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc) + = tc_get_timecount; --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 16:36:03 2020 +++ sys/arch/macppc/include/timetc.hWed Jun 24 16:37:47 2020 @@ -18,6 +18,7 @@ #ifndef _MACHINE_TIMETC_H_ #define _MACHINE_TIMETC_H_ -#define TC_LAST 0 +#defineTC_TB 1 +#defineTC_LAST 2 #endif /* _MACHINE_TIMETC_H_ */ --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 2020 +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 2020 @@ -57,7 +57,7 @@ static int32_t ticks_per_intr; static struct timecounter tb_timecounter = { - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB }; /* calibrate the timecounter frequency for the listed models */
Re: userland clock_gettime proof of concept
> The /sys/arch/powerpc/include/timetc.h in your diff never gets used, > because there is no #include . On macppc, I am fixing this issue for all the architectures, just being careful by doing builds first.
Re: userland clock_gettime proof of concept
Hi George, On Thu, 25 Jun 2020 23:22:36 -0400 George Koehler wrote: > On Mon, 22 Jun 2020 19:12:22 +0300 > Paul Irofti wrote: > > > New iteration: > > > > - ps_timekeep should not coredump, pointed by deraadt@ > > - set ps_timekeep to 0 before user uvm_map for randomization > > - map timekeep before fixup. confirmed by naddy@ that it fixes > > NULL init > > - initialize va. clarified by kettenis@ > > Here's macppc again. My macppc isn't using your newest diff but does > now need to define TC_TB in . > > The /sys/arch/powerpc/include/timetc.h in your diff never gets used, > because there is no #include . On macppc, > uname -m => macppc and > uname -p => powerpcare different, > and #include is /sys/arch/macppc/include/timetc.h. > I suspect that is /sys/arch/$i/include/timetc.h > if and only if /sys/arch/$i/compile exists. > > 10 days ago, naddy said, "You only need the lower register." That is > correct, so this diff also stops using mftbu (the higher register). > > --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 > 16:42:36 2020 +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun > 24 16:46:00 2020 @@ -18,4 +18,17 @@ > #include > #include > > -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; > +int > +tc_get_timecount(struct timekeep *tk, u_int *tc) > +{ > + u_int tb; > + > + if (tk->tk_user != TC_TB) > + return -1; > + > + asm volatile("mftb %0" : "=r"(tb)); > + *tc = tb; > + return 0; > +} > +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc) > + = tc_get_timecount; > --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 > 16:36:03 2020 +++ sys/arch/macppc/include/timetc.hWed Jun 24 > 16:37:47 2020 @@ -18,6 +18,7 @@ > #ifndef _MACHINE_TIMETC_H_ > #define _MACHINE_TIMETC_H_ > > -#define TC_LAST 0 > +#define TC_TB 1 > +#define TC_LAST 2 > > #endif /* _MACHINE_TIMETC_H_ */ > --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 > 2020 +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 > 2020 @@ -57,7 +57,7 @@ > static int32_t ticks_per_intr; > > static struct timecounter tb_timecounter = { > - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 > + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB > }; > > /* calibrate the timecounter frequency for the listed models */ > I've applied that diff against what is committed to cvs already, it works fine on my PowerBook G4 A1138. I'm building some ports in parallel on that machine at the moment, and met no issues so far. Thanks a lot for this one, i can see speed improvements :) Charlène.
Re: userland clock_gettime proof of concept
Sorry for late reaction. At least VirtualBox based virtual machines started to panic with the recent kernel. I reverted your diff and panics stopped. Screenshot attached.
Re: userland clock_gettime proof of concept
> On 5 Jul 2020, at 20:31, Paul Irofti wrote: > > On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: >> >> >> În 3 iulie 2020 17:55:25 EEST, Mark Kettenis a >> scris: Date: Fri, 3 Jul 2020 15:13:22 +0200 From: Robert Nagy On 02/07/20 00:31 +0100, Stuart Henderson wrote: > running on 38 of these, btw. been running with this on all my workstations and laptops and on 3 >>> build servers as well >>> >>> Are the issue that naddy@ saw solved? >>> >>> Did anybody do a *proper* test on anything besides amd64? Especially >>> on architectures where the optimized clock_gettime is *not* available? >> >> Yes and yes. > > So, can we go ahead with this? > Sorry for late reaction. At least VirtualBox based virtual machines started to panic with the recent kernel. I reverted your diff and panics stopped. Screenshot attached.
Re: userland clock_gettime proof of concept
Ah, it was seen. But everyone has too much memory these days. Vitaliy Makkoveev wrote: > Sorry for late reaction. At least VirtualBox based virtual machines > started to panic with the recent kernel. I reverted your diff and panics > stopped. > Screenshot attached.
Re: userland clock_gettime proof of concept
> From: Vitaliy Makkoveev > Date: Tue, 7 Jul 2020 01:34:33 +0300 > > > On 5 Jul 2020, at 20:31, Paul Irofti wrote: > > > > On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: > >> > >> > >> În 3 iulie 2020 17:55:25 EEST, Mark Kettenis a > >> scris: > Date: Fri, 3 Jul 2020 15:13:22 +0200 > From: Robert Nagy > > On 02/07/20 00:31 +0100, Stuart Henderson wrote: > > running on 38 of these, btw. > > been running with this on all my workstations and laptops and on 3 > >>> build > servers as well > >>> > >>> Are the issue that naddy@ saw solved? > >>> > >>> Did anybody do a *proper* test on anything besides amd64? Especially > >>> on architectures where the optimized clock_gettime is *not* available? > >> > >> Yes and yes. > > > > So, can we go ahead with this? > > > > Sorry for late reaction. At least VirtualBox based virtual machines > started to panic with the recent kernel. I reverted your diff and > panics stopped. Should already be fixed.
Re: userland clock_gettime proof of concept
On Sun, Jul 05, 2020 at 08:31:19PM +0300, Paul Irofti wrote: > On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: > > > > ?n 3 iulie 2020 17:55:25 EEST, Mark Kettenis a > > scris: > > >> Date: Fri, 3 Jul 2020 15:13:22 +0200 > > >> From: Robert Nagy > > >> > > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: > > >> > running on 38 of these, btw. > > >> > > >> been running with this on all my workstations and laptops and on 3 > > >build > > >> servers as well > > > > > >Are the issue that naddy@ saw solved? > > > > > >Did anybody do a *proper* test on anything besides amd64? Especially > > >on architectures where the optimized clock_gettime is *not* available? > > > > Yes and yes. > > So, can we go ahead with this? ok cheloha@ I'd like to document how all this works somewhere. I think I'm going to update tc_init.9. Maybe we can add a section for "userspace timecounter drivers" to that page after it is up-to-date with the state of timecounting in the kernel. Also, I sanity-tested your most recent diff on this arm64: OpenBSD 6.7-current (GENERIC.MP) #1: Fri Jul 3 13:04:15 CDT 2020 ssc@kitsch.local:/usr/src/sys/arch/arm64/compile/GENERIC.MP real mem = 3073093632 (2930MB) avail mem = 2946428928 (2809MB) random: good seed from bootblocks mainbus0 at root: ACPI psci0 at mainbus0: PSCI 1.1, SMCCC 1.2 cpu0 at mainbus0 mpidr 0: ARM Cortex-A72 r0p3 cpu0: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu0: 1024KB 64b/line 16-way L2 cache cpu1 at mainbus0 mpidr 1: ARM Cortex-A72 r0p3 cpu1: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu1: 1024KB 64b/line 16-way L2 cache cpu2 at mainbus0 mpidr 2: ARM Cortex-A72 r0p3 cpu2: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu2: 1024KB 64b/line 16-way L2 cache cpu3 at mainbus0 mpidr 3: ARM Cortex-A72 r0p3 cpu3: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu3: 1024KB 64b/line 16-way L2 cache efi0 at mainbus0: UEFI 2.7 efi0: https://github.com/pftf/RPi4 rev 0x1 smbios0 at efi0: SMBIOS 3.3.0 smbios0: vendor https://github.com/pftf/RPi4 version "UEFI Firmware v1.16" date Jun 18 2020 12:20:53 smbios0: Sony UK Raspberry Pi 4 Model B apm0 at mainbus0 ampintc0 at mainbus0 nirq 256, ncpu 4 ipi: 0, 1: "interrupt-controller" agtimer0 at mainbus0: tick rate 54000 KHz acpi0 at mainbus0: ACPI 6.3 acpi0: sleep states acpi0: tables DSDT FACP CSRT DBG2 GTDT APIC PPTT SPCR acpi0: wakeup devices "BCM2849" at acpi0 not configured "BCM2835" at acpi0 not configured "BCM2854" at acpi0 not configured "ACPI0004" at acpi0 not configured xhci0 at acpi0 XHC0 addr 0x6/0x1000 irq 175, xHCI 1.0 usb0 at xhci0: USB revision 3.0 uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1 "ACPI0007" at acpi0 not configured "ACPI0007" at acpi0 not configured "ACPI0007" at acpi0 not configured "ACPI0007" at acpi0 not configured "ACPI0004" at acpi0 not configured "BCM2848" at acpi0 not configured "BCM2850" at acpi0 not configured "BCM2856" at acpi0 not configured "BCM2845" at acpi0 not configured "BCM2841" at acpi0 not configured "BCM2841" at acpi0 not configured "BCM2838" at acpi0 not configured "BCM2839" at acpi0 not configured "BCM2844" at acpi0 not configured pluart0 at acpi0 URT0 addr 0xfe201000/0x1000 irq 153: console "BCM2836" at acpi0 not configured "BCM2EA6" at acpi0 not configured "MSFT8000" at acpi0 not configured "BCM2847" at acpi0 not configured "BCM2855" at acpi0 not configured bse0 at acpi0 ETH0 addr 0xfd58/0x1 irq 189: address dc:a6:32:7f:dd:8e brgphy0 at bse0 phy 1: BCM54210E 10/100/1000baseT PHY, rev. 2 uhub1 at uhub0 port 1 configuration 1 interface 0 "VIA Labs USB2.0 Hub" rev 2.10/4.21 addr 2 umass0 at uhub1 port 1 configuration 1 interface 0 "SanDisk Cruzer Fit" rev 2.10/1.00 addr 3 umass0: using SCSI over Bulk-Only scsibus0 at umass0: 2 targets, initiator 0 sd0 at scsibus0 targ 1 lun 0: removable serial.07815571280726107280 sd0: 59976MB, 512 bytes/sector, 122830848 sectors vscsi0 at root scsibus1 at vscsi0: 256 targets softraid0 at root scsibus2 at softraid0: 256 targets bootfile: sd0a:/bsd boot device: sd0 root on sd0a (0cc470a97f4b5ef4.a) swap on sd0b dump on sd0b WARNING: clock lost 15 days WARNING: CHECK AND RESET THE DATE!
Re: userland clock_gettime proof of concept
> Date: Sun, 5 Jul 2020 20:31:19 +0300 > From: Paul Irofti > > On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: > > > > > > În 3 iulie 2020 17:55:25 EEST, Mark Kettenis a > > scris: > > >> Date: Fri, 3 Jul 2020 15:13:22 +0200 > > >> From: Robert Nagy > > >> > > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: > > >> > running on 38 of these, btw. > > >> > > >> been running with this on all my workstations and laptops and on 3 > > >build > > >> servers as well > > > > > >Are the issue that naddy@ saw solved? > > > > > >Did anybody do a *proper* test on anything besides amd64? Especially > > >on architectures where the optimized clock_gettime is *not* available? > > > > Yes and yes. > > So, can we go ahead with this? I wish your diff wouldn't add thos unneeded NULL and 0 initializers to files that don't need them. But ok kettenis@
Re: userland clock_gettime proof of concept
On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: > > > În 3 iulie 2020 17:55:25 EEST, Mark Kettenis a > scris: > >> Date: Fri, 3 Jul 2020 15:13:22 +0200 > >> From: Robert Nagy > >> > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: > >> > running on 38 of these, btw. > >> > >> been running with this on all my workstations and laptops and on 3 > >build > >> servers as well > > > >Are the issue that naddy@ saw solved? > > > >Did anybody do a *proper* test on anything besides amd64? Especially > >on architectures where the optimized clock_gettime is *not* available? > > Yes and yes. So, can we go ahead with this?
Re: userland clock_gettime proof of concept
On 2020-07-03, Scott Cheloha wrote: > Are we doing powerpc, arm64, and sparc64 separately? hppa can de done as well, but somebody with access to a machine needs to investigate/ensure that the S bit in the processor status register is appropriately set to allow userland access to the Interval Timer register. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Mark Kettenis wrote: > I did not add the LFENCE to tc_get_timecount() itself. We probably > should, but in the past some networking folks complained about slow > timecounters affecting network performance, so I wanted confirmation > that this didn't cause problems. I don't see how a more precise time can cause such a problem. If we are worried about any possibility of negative deltas, which might be put into unsigned variables, we should use lfence so it can't appear to go backwards..
Re: userland clock_gettime proof of concept
În 3 iulie 2020 20:57:52 EEST, Mark Kettenis a scris: >> Date: Fri, 3 Jul 2020 12:42:58 -0500 >> From: Scott Cheloha >> >> On Fri, Jul 03, 2020 at 02:34:20PM +0300, Paul Irofti wrote: >> > On 2020-07-03 00:40, Scott Cheloha wrote: >> > > On Fri, Jun 26, 2020 at 04:53:14PM +0300, Paul Irofti wrote: >> > > > On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: >> > > > > On 22/06/20 19:12 +0300, Paul Irofti wrote: >> > > > > > New iteration: >> > > > > > >> > > > > >- ps_timekeep should not coredump, pointed by deraadt@ >> > > > > >- set ps_timekeep to 0 before user uvm_map for >randomization >> > > > > >- map timekeep before fixup. confirmed by naddy@ that it >fixes NULL init >> > > > > >- initialize va. clarified by kettenis@ >> > > > > > >> > > > > > How's the magical max skew value research going? Do we have >a value yet? >> > > > > > >> > > > > > Paul >> > > > > >> > > > > I think we should pick 100 for now and then we can adjust it >later if needed. >> > > > > >> > > > > Of course this depends on kettenis' lfence diff so that amd >ryzen tsc is sane. >> > > > >> > > > I looked at dmesglog and the reported values are indeed small. >99 was >> > > > the highest on an Atom. I updated the diff to 100. I think we >can adapt >> > > > this as we get more reports (if ever). >> > > > >> > > > OK? >> > > >> > > One thing... >> > > >> > > > diff --git lib/libc/arch/amd64/gen/usertc.c >lib/libc/arch/amd64/gen/usertc.c >> > > > new file mode 100644 >> > > > index 000..56016c8eca1 >> > > > --- /dev/null >> > > > +++ lib/libc/arch/amd64/gen/usertc.c >> > > > @@ -0,0 +1,41 @@ >> > > > +/*$OpenBSD$ */ >> > > > +/* >> > > > + * Copyright (c) 2020 Paul Irofti >> > > > + * >> > > > + * Permission to use, copy, modify, and distribute this >software for any >> > > > + * purpose with or without fee is hereby granted, provided >that the above >> > > > + * copyright notice and this permission notice appear in all >copies. >> > > > + * >> > > > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS >ALL WARRANTIES >> > > > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED >WARRANTIES OF >> > > > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR >BE LIABLE FOR >> > > > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR >ANY DAMAGES >> > > > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, >WHETHER IN AN >> > > > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, >ARISING OUT OF >> > > > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS >SOFTWARE. >> > > > + */ >> > > > + >> > > > +#include >> > > > +#include >> > > > + >> > > > +static inline u_int >> > > > +rdtsc(void) >> > > > +{ >> > > > + uint32_t hi, lo; >> > > > + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); >> > > > + return ((uint64_t)lo)|(((uint64_t)hi)<<32); >> > > > +} >> > > >> > > We need to lfence this. >> > >> > In userland too? Why? >> >> I was under the impression kettenis@ had added an lfence to the >kernel >> TSC's tc_get_timecount(), but I was mistaken. >> >> We can deal with that separately. > >I just committed a diff that adds the LFENCE everywhere where we are >measuring a time interval to do some sort of calibration. I think >that should get the skew down under 100 cycles on all reasonable >amd64 machines. Thank you very much for that, Mark! >I did not add the LFENCE to tc_get_timecount() itself. We probably >should, but in the past some networking folks complained about slow >timecounters affecting network performance, so I wanted confirmation >that this didn't cause problems. Right. I think we should decide that separately from this diff. As it is the diff is well tested. Paul
Re: userland clock_gettime proof of concept
> Date: Fri, 3 Jul 2020 12:42:58 -0500 > From: Scott Cheloha > > On Fri, Jul 03, 2020 at 02:34:20PM +0300, Paul Irofti wrote: > > On 2020-07-03 00:40, Scott Cheloha wrote: > > > On Fri, Jun 26, 2020 at 04:53:14PM +0300, Paul Irofti wrote: > > > > On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: > > > > > On 22/06/20 19:12 +0300, Paul Irofti wrote: > > > > > > New iteration: > > > > > > > > > > > >- ps_timekeep should not coredump, pointed by deraadt@ > > > > > >- set ps_timekeep to 0 before user uvm_map for randomization > > > > > >- map timekeep before fixup. confirmed by naddy@ that it fixes > > > > > > NULL init > > > > > >- initialize va. clarified by kettenis@ > > > > > > > > > > > > How's the magical max skew value research going? Do we have a value > > > > > > yet? > > > > > > > > > > > > Paul > > > > > > > > > > I think we should pick 100 for now and then we can adjust it later if > > > > > needed. > > > > > > > > > > Of course this depends on kettenis' lfence diff so that amd ryzen tsc > > > > > is sane. > > > > > > > > I looked at dmesglog and the reported values are indeed small. 99 was > > > > the highest on an Atom. I updated the diff to 100. I think we can adapt > > > > this as we get more reports (if ever). > > > > > > > > OK? > > > > > > One thing... > > > > > > > diff --git lib/libc/arch/amd64/gen/usertc.c > > > > lib/libc/arch/amd64/gen/usertc.c > > > > new file mode 100644 > > > > index 000..56016c8eca1 > > > > --- /dev/null > > > > +++ lib/libc/arch/amd64/gen/usertc.c > > > > @@ -0,0 +1,41 @@ > > > > +/* $OpenBSD$ */ > > > > +/* > > > > + * Copyright (c) 2020 Paul Irofti > > > > + * > > > > + * Permission to use, copy, modify, and distribute this software for > > > > any > > > > + * purpose with or without fee is hereby granted, provided that the > > > > above > > > > + * copyright notice and this permission notice appear in all copies. > > > > + * > > > > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL > > > > WARRANTIES > > > > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > > > > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE > > > > FOR > > > > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY > > > > DAMAGES > > > > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN > > > > AN > > > > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING > > > > OUT OF > > > > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > > > > + */ > > > > + > > > > +#include > > > > +#include > > > > + > > > > +static inline u_int > > > > +rdtsc(void) > > > > +{ > > > > + uint32_t hi, lo; > > > > + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); > > > > + return ((uint64_t)lo)|(((uint64_t)hi)<<32); > > > > +} > > > > > > We need to lfence this. > > > > In userland too? Why? > > I was under the impression kettenis@ had added an lfence to the kernel > TSC's tc_get_timecount(), but I was mistaken. > > We can deal with that separately. I just committed a diff that adds the LFENCE everywhere where we are measuring a time interval to do some sort of calibration. I think that should get the skew down under 100 cycles on all reasonable amd64 machines. I did not add the LFENCE to tc_get_timecount() itself. We probably should, but in the past some networking folks complained about slow timecounters affecting network performance, so I wanted confirmation that this didn't cause problems.
Re: userland clock_gettime proof of concept
Scott Cheloha wrote: > I was under the impression kettenis@ had added an lfence to the kernel > TSC's tc_get_timecount(), but I was mistaken. > > We can deal with that separately. the question was asked, but I guess not answered definitively I think we should *always* lfence, because in the end the values do get compared against others which are lfenced, and it would be a shame if delta was ever negative.
Re: userland clock_gettime proof of concept
On Fri, Jul 03, 2020 at 02:34:20PM +0300, Paul Irofti wrote: > On 2020-07-03 00:40, Scott Cheloha wrote: > > On Fri, Jun 26, 2020 at 04:53:14PM +0300, Paul Irofti wrote: > > > On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: > > > > On 22/06/20 19:12 +0300, Paul Irofti wrote: > > > > > New iteration: > > > > > > > > > >- ps_timekeep should not coredump, pointed by deraadt@ > > > > >- set ps_timekeep to 0 before user uvm_map for randomization > > > > >- map timekeep before fixup. confirmed by naddy@ that it fixes > > > > > NULL init > > > > >- initialize va. clarified by kettenis@ > > > > > > > > > > How's the magical max skew value research going? Do we have a value > > > > > yet? > > > > > > > > > > Paul > > > > > > > > I think we should pick 100 for now and then we can adjust it later if > > > > needed. > > > > > > > > Of course this depends on kettenis' lfence diff so that amd ryzen tsc > > > > is sane. > > > > > > I looked at dmesglog and the reported values are indeed small. 99 was > > > the highest on an Atom. I updated the diff to 100. I think we can adapt > > > this as we get more reports (if ever). > > > > > > OK? > > > > One thing... > > > > > diff --git lib/libc/arch/amd64/gen/usertc.c > > > lib/libc/arch/amd64/gen/usertc.c > > > new file mode 100644 > > > index 000..56016c8eca1 > > > --- /dev/null > > > +++ lib/libc/arch/amd64/gen/usertc.c > > > @@ -0,0 +1,41 @@ > > > +/* $OpenBSD$ */ > > > +/* > > > + * Copyright (c) 2020 Paul Irofti > > > + * > > > + * Permission to use, copy, modify, and distribute this software for any > > > + * purpose with or without fee is hereby granted, provided that the above > > > + * copyright notice and this permission notice appear in all copies. > > > + * > > > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL > > > WARRANTIES > > > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > > > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE > > > FOR > > > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > > > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > > > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT > > > OF > > > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > > > + */ > > > + > > > +#include > > > +#include > > > + > > > +static inline u_int > > > +rdtsc(void) > > > +{ > > > + uint32_t hi, lo; > > > + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); > > > + return ((uint64_t)lo)|(((uint64_t)hi)<<32); > > > +} > > > > We need to lfence this. > > In userland too? Why? I was under the impression kettenis@ had added an lfence to the kernel TSC's tc_get_timecount(), but I was mistaken. We can deal with that separately. > Anyway. I think this diff should be committed. It is too long since we > have been dancing around it and these sorts of comments can always be > addressed in tree.
Re: userland clock_gettime proof of concept
În 3 iulie 2020 18:45:29 EEST, Scott Cheloha a scris: >On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: >> >> >> ??n 3 iulie 2020 17:55:25 EEST, Mark Kettenis > a scris: >> >> Date: Fri, 3 Jul 2020 15:13:22 +0200 >> >> From: Robert Nagy >> >> >> >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: >> >> > running on 38 of these, btw. >> >> >> >> been running with this on all my workstations and laptops and on 3 >> >build >> >> servers as well >> > >> >Are the issue that naddy@ saw solved? >> > >> >Did anybody do a *proper* test on anything besides amd64? >Especially >> >on architectures where the optimized clock_gettime is *not* >available? >> >> Yes and yes. > >I don't see any userland drivers for anything but amd64 in your diff. > >Are we doing powerpc, arm64, and sparc64 separately? Search the thread. Others have written them, yes. And kettenis was asking about the oppsite: architectures without them.
Re: userland clock_gettime proof of concept
On 2020-07-03, Mark Kettenis wrote: > Are the issue that naddy@ saw solved? Yes, they were understood and fixed. I'll defer to you and Scott regarding the TSC synchronization issues; aside from that I'm okaying the diff. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote: > > > ??n 3 iulie 2020 17:55:25 EEST, Mark Kettenis a > scris: > >> Date: Fri, 3 Jul 2020 15:13:22 +0200 > >> From: Robert Nagy > >> > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: > >> > running on 38 of these, btw. > >> > >> been running with this on all my workstations and laptops and on 3 > >build > >> servers as well > > > >Are the issue that naddy@ saw solved? > > > >Did anybody do a *proper* test on anything besides amd64? Especially > >on architectures where the optimized clock_gettime is *not* available? > > Yes and yes. I don't see any userland drivers for anything but amd64 in your diff. Are we doing powerpc, arm64, and sparc64 separately?
Re: userland clock_gettime proof of concept
În 3 iulie 2020 17:55:25 EEST, Mark Kettenis a scris: >> Date: Fri, 3 Jul 2020 15:13:22 +0200 >> From: Robert Nagy >> >> On 02/07/20 00:31 +0100, Stuart Henderson wrote: >> > running on 38 of these, btw. >> >> been running with this on all my workstations and laptops and on 3 >build >> servers as well > >Are the issue that naddy@ saw solved? > >Did anybody do a *proper* test on anything besides amd64? Especially >on architectures where the optimized clock_gettime is *not* available? Yes and yes.
Re: userland clock_gettime proof of concept
> Date: Fri, 3 Jul 2020 15:13:22 +0200 > From: Robert Nagy > > On 02/07/20 00:31 +0100, Stuart Henderson wrote: > > running on 38 of these, btw. > > been running with this on all my workstations and laptops and on 3 build > servers as well Are the issue that naddy@ saw solved? Did anybody do a *proper* test on anything besides amd64? Especially on architectures where the optimized clock_gettime is *not* available?
Re: userland clock_gettime proof of concept
On 02/07/20 00:31 +0100, Stuart Henderson wrote: > running on 38 of these, btw. been running with this on all my workstations and laptops and on 3 build servers as well
Re: userland clock_gettime proof of concept
On 2020-07-03 00:40, Scott Cheloha wrote: On Fri, Jun 26, 2020 at 04:53:14PM +0300, Paul Irofti wrote: On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: On 22/06/20 19:12 +0300, Paul Irofti wrote: New iteration: - ps_timekeep should not coredump, pointed by deraadt@ - set ps_timekeep to 0 before user uvm_map for randomization - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init - initialize va. clarified by kettenis@ How's the magical max skew value research going? Do we have a value yet? Paul I think we should pick 100 for now and then we can adjust it later if needed. Of course this depends on kettenis' lfence diff so that amd ryzen tsc is sane. I looked at dmesglog and the reported values are indeed small. 99 was the highest on an Atom. I updated the diff to 100. I think we can adapt this as we get more reports (if ever). OK? One thing... diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +static inline u_int +rdtsc(void) +{ + uint32_t hi, lo; + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); + return ((uint64_t)lo)|(((uint64_t)hi)<<32); +} We need to lfence this. In userland too? Why? Anyway. I think this diff should be committed. It is too long since we have been dancing around it and these sorts of comments can always be addressed in tree. Paul
Re: userland clock_gettime proof of concept
On Fri, Jun 26, 2020 at 04:53:14PM +0300, Paul Irofti wrote: > On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: > > On 22/06/20 19:12 +0300, Paul Irofti wrote: > > > New iteration: > > > > > > - ps_timekeep should not coredump, pointed by deraadt@ > > > - set ps_timekeep to 0 before user uvm_map for randomization > > > - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init > > > - initialize va. clarified by kettenis@ > > > > > > How's the magical max skew value research going? Do we have a value yet? > > > > > > Paul > > > > I think we should pick 100 for now and then we can adjust it later if > > needed. > > > > Of course this depends on kettenis' lfence diff so that amd ryzen tsc is > > sane. > > I looked at dmesglog and the reported values are indeed small. 99 was > the highest on an Atom. I updated the diff to 100. I think we can adapt > this as we get more reports (if ever). > > OK? One thing... > diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c > new file mode 100644 > index 000..56016c8eca1 > --- /dev/null > +++ lib/libc/arch/amd64/gen/usertc.c > @@ -0,0 +1,41 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#include > +#include > + > +static inline u_int > +rdtsc(void) > +{ > + uint32_t hi, lo; > + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); > + return ((uint64_t)lo)|(((uint64_t)hi)<<32); > +} We need to lfence this.
Re: userland clock_gettime proof of concept
running on 38 of these, btw. OpenBSD 6.7-current (GENERIC.MP) #0: Sat Jun 27 21:15:58 BST 2020 sthen@...:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 4169592832 (3976MB) avail mem = 4028198912 (3841MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xec170 (83 entries) bios0: vendor Intel Corp. version "WYLPT10H.86A.0044.2016.1214.1710" date 12/14/2016 bios0: Intel Corporation D34010WYK acpi0 at bios0: ACPI 5.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP APIC FPDT FIDT SSDT SSDT MCFG HPET SSDT SSDT DMAR CSRT acpi0: wakeup devices RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP04(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S4) EHC2(S4) XHC_(S4) HDEF(S4) PEG0(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.34 MHz, 06-45-01 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: TSC skew=0 observed drift=0 cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.09 MHz, 06-45-01 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache cpu1: TSC skew=-22 observed drift=0 cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 1 (application processor) cpu2: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.08 MHz, 06-45-01 cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: TSC skew=-8 observed drift=0 cpu2: smt 1, core 0, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.08 MHz, 06-45-01 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu3: 256KB 64b/line 8-way L2 cache cpu3: TSC skew=-22 observed drift=0 cpu3: smt 1, core 1, package 0 ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 40 pins acpimcfg0 at acpi0 acpimcfg0: addr 0xf800, bus 0-63 acpihpet0 at acpi0: 14318179 Hz acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 1 (RP01) acpiprt2 at acpi0: bus 2 (RP04) acpiprt3 at acpi0: bus -1 (PEG0) acpiec0 at acpi0: not present acpicpu0 at acpi0: C2(500@67 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu1 at acpi0: C2(500@67 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu2 at acpi0: C2(500@67 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu3 at acpi0: C2(500@67 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpipwrres0 at acpi0: FN00, resource for FAN0 acpipwrres1 at acpi0: FN01, resource for FAN1 acpipwrres2 at acpi0: FN02, resource for FAN2 acpipwrres3 at acpi0: FN03, resource for FAN3 acpipwrres4 at acpi0: FN04, resource for FAN4 acpitz0 at acpi0: critical temperature is 105 degC acpitz1 at acpi0: critical temperature is 105 degC acpipci0 at acpi0 PCI0: 0x0010 0x0011 0x extent `acpipci0 pcibus' (0x0 - 0xff), flags=0 0x3f - 0xff extent `acpipci0 pciio' (0x0 - 0x), flags=0 0xcf8 - 0xcff 0x1 - 0x extent `acpipci0 pcimem' (0x0 - 0x), flags=0 0x0 - 0x9 0xc - 0xc 0xe8000 - 0xdf1f 0xfeb0 - 0x acpicmos0 at acpi0 "NTN0530"
Re: userland clock_gettime proof of concept
On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote: > On 22/06/20 19:12 +0300, Paul Irofti wrote: > > New iteration: > > > > - ps_timekeep should not coredump, pointed by deraadt@ > > - set ps_timekeep to 0 before user uvm_map for randomization > > - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init > > - initialize va. clarified by kettenis@ > > > > How's the magical max skew value research going? Do we have a value yet? > > > > Paul > > I think we should pick 100 for now and then we can adjust it later if needed. > > Of course this depends on kettenis' lfence diff so that amd ryzen tsc is sane. I looked at dmesglog and the reported values are indeed small. 99 was the highest on an Atom. I updated the diff to 100. I think we can adapt this as we get more reports (if ever). OK? diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission no
Re: userland clock_gettime proof of concept
On Mon, 22 Jun 2020 19:12:22 +0300 Paul Irofti wrote: > New iteration: > > - ps_timekeep should not coredump, pointed by deraadt@ > - set ps_timekeep to 0 before user uvm_map for randomization > - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init > - initialize va. clarified by kettenis@ Here's macppc again. My macppc isn't using your newest diff but does now need to define TC_TB in . The /sys/arch/powerpc/include/timetc.h in your diff never gets used, because there is no #include . On macppc, uname -m => macppc and uname -p => powerpcare different, and #include is /sys/arch/macppc/include/timetc.h. I suspect that is /sys/arch/$i/include/timetc.h if and only if /sys/arch/$i/compile exists. 10 days ago, naddy said, "You only need the lower register." That is correct, so this diff also stops using mftbu (the higher register). --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 16:42:36 2020 +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun 24 16:46:00 2020 @@ -18,4 +18,17 @@ #include #include -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; +int +tc_get_timecount(struct timekeep *tk, u_int *tc) +{ + u_int tb; + + if (tk->tk_user != TC_TB) + return -1; + + asm volatile("mftb %0" : "=r"(tb)); + *tc = tb; + return 0; +} +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc) + = tc_get_timecount; --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 16:36:03 2020 +++ sys/arch/macppc/include/timetc.hWed Jun 24 16:37:47 2020 @@ -18,6 +18,7 @@ #ifndef _MACHINE_TIMETC_H_ #define _MACHINE_TIMETC_H_ -#defineTC_LAST 0 +#defineTC_TB 1 +#defineTC_LAST 2 #endif /* _MACHINE_TIMETC_H_ */ --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 2020 +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 2020 @@ -57,7 +57,7 @@ static int32_t ticks_per_intr; static struct timecounter tb_timecounter = { - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, TC_TB }; /* calibrate the timecounter frequency for the listed models */
Re: userland clock_gettime proof of concept
On 22/06/20 19:12 +0300, Paul Irofti wrote: > New iteration: > > - ps_timekeep should not coredump, pointed by deraadt@ > - set ps_timekeep to 0 before user uvm_map for randomization > - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init > - initialize va. clarified by kettenis@ > > How's the magical max skew value research going? Do we have a value yet? > > Paul I think we should pick 100 for now and then we can adjust it later if needed. Of course this depends on kettenis' lfence diff so that amd ryzen tsc is sane.
Re: userland clock_gettime proof of concept
New iteration: - ps_timekeep should not coredump, pointed by deraadt@ - set ps_timekeep to 0 before user uvm_map for randomization - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init - initialize va. clarified by kettenis@ How's the magical max skew value research going? Do we have a value yet? Paul diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION W
Re: userland clock_gettime proof of concept
Christian Weisgerber wrote: > Christian Weisgerber: > > > If I move > > > > vaddr_t ps_timekeep;/* User pointer to timekeep */ > > > > up into the zeroed area, I get a properly randomized _timekeep in > > userland. > > Also note that exec_sigcode_map() has this > > pr->ps_sigcode = 0; /* no hint */ > uao_reference(e->e_sigobject); > if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_sigcode, round_page(sz), > > I don't know if we want to > * explicitly set ps_timekeep to 0 in exec_timekeep_map(), or > * move it into the zeroed area, which we should also do with ps_sigcode > then. Placing it in the zero'd area probably needs some careful consideration and testing in relationship to MD signal delivery, we would not want tf->tf_pr = (int)p->p_p->ps_sigcode to become a pointer to userland NULL, resulting in a fault, which sends a signal and Looking at ps_sigcode use in the kernel, I also find a special case for it in uvm_should_coredump(). The same "avoid dumping" logic should probably be there for timekeep also.
Re: userland clock_gettime proof of concept
On Mon, Jun 22, 2020 at 09:46:13AM -0600, Theo de Raadt wrote: > Christian Weisgerber wrote: > > > Paul Irofti: > > > > > 683 /* map the process's timekeep page */ > > > 684 if (exec_timekeep_map(pr)) > > > 685 goto free_pack_abort; > > > 686 /* setup new registers and do misc. setup. */ > > > 687 if (pack.ep_emul->e_fixup != NULL) { > > > 688 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) > > > 689 goto free_pack_abort; > > > 690 } > > > > Yes, with this init(8) gets a proper _timekeep instead of 0x0. > > > > For randomization of the userland page... > > > > + if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_timekeep, > > round_page(timekeep_sz), > > > > ... ps_timekeep need to be 0 here. At the moment, it inherits the > > value from the parent process in fork(). > > > > In struct process in sys/proc.h, there is this: > > > > /* The following fields are all zeroed upon creation in process_new. */ > > ... > > /* End area that is zeroed on creation. */ > > > > If I move > > > > vaddr_t ps_timekeep;/* User pointer to timekeep */ > > > > up into the zeroed area, I get a properly randomized _timekeep in > > userland. > > Right. > > > BTW, why is this important? One could say this does not need to > be randomized. It has no secret. But a significant downside occurs > with visible effects. > > If that 1 page is always in the same place, then address-space > randomizated mappings of future objects will not be able to place an > object over that one page. > > The address space is significantly less randomized as soon as it > contains one fixed object. Less randomized in a severe way impacting > security. Fully agree. I am going to send a new diff out with all of these included.
Re: userland clock_gettime proof of concept
On Mon, Jun 22, 2020 at 05:35:48PM +0200, Christian Weisgerber wrote: > Paul Irofti: > > > 683 /* map the process's timekeep page */ > > 684 if (exec_timekeep_map(pr)) > > 685 goto free_pack_abort; > > 686 /* setup new registers and do misc. setup. */ > > 687 if (pack.ep_emul->e_fixup != NULL) { > > 688 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) > > 689 goto free_pack_abort; > > 690 } > > Yes, with this init(8) gets a proper _timekeep instead of 0x0. > > For randomization of the userland page... > > + if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_timekeep, > round_page(timekeep_sz), > > ... ps_timekeep need to be 0 here. At the moment, it inherits the > value from the parent process in fork(). > > In struct process in sys/proc.h, there is this: > > /* The following fields are all zeroed upon creation in process_new. */ > ... > /* End area that is zeroed on creation. */ > > If I move > > vaddr_t ps_timekeep;/* User pointer to timekeep */ > > up into the zeroed area, I get a properly randomized _timekeep in > userland. Nice, I bet the other mapping suffers from the same problem, checking now with what Theo said.
Re: userland clock_gettime proof of concept
Christian Weisgerber wrote: > Paul Irofti: > > > 683 /* map the process's timekeep page */ > > 684 if (exec_timekeep_map(pr)) > > 685 goto free_pack_abort; > > 686 /* setup new registers and do misc. setup. */ > > 687 if (pack.ep_emul->e_fixup != NULL) { > > 688 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) > > 689 goto free_pack_abort; > > 690 } > > Yes, with this init(8) gets a proper _timekeep instead of 0x0. > > For randomization of the userland page... > > + if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_timekeep, > round_page(timekeep_sz), > > ... ps_timekeep need to be 0 here. At the moment, it inherits the > value from the parent process in fork(). > > In struct process in sys/proc.h, there is this: > > /* The following fields are all zeroed upon creation in process_new. */ > ... > /* End area that is zeroed on creation. */ > > If I move > > vaddr_t ps_timekeep;/* User pointer to timekeep */ > > up into the zeroed area, I get a properly randomized _timekeep in > userland. Right. BTW, why is this important? One could say this does not need to be randomized. It has no secret. But a significant downside occurs with visible effects. If that 1 page is always in the same place, then address-space randomizated mappings of future objects will not be able to place an object over that one page. The address space is significantly less randomized as soon as it contains one fixed object. Less randomized in a severe way impacting security.
Re: userland clock_gettime proof of concept
Christian Weisgerber: > If I move > > vaddr_t ps_timekeep;/* User pointer to timekeep */ > > up into the zeroed area, I get a properly randomized _timekeep in > userland. Also note that exec_sigcode_map() has this pr->ps_sigcode = 0; /* no hint */ uao_reference(e->e_sigobject); if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_sigcode, round_page(sz), I don't know if we want to * explicitly set ps_timekeep to 0 in exec_timekeep_map(), or * move it into the zeroed area, which we should also do with ps_sigcode then. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti: > 683 /* map the process's timekeep page */ > 684 if (exec_timekeep_map(pr)) > 685 goto free_pack_abort; > 686 /* setup new registers and do misc. setup. */ > 687 if (pack.ep_emul->e_fixup != NULL) { > 688 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) > 689 goto free_pack_abort; > 690 } Yes, with this init(8) gets a proper _timekeep instead of 0x0. For randomization of the userland page... + if (uvm_map(&pr->ps_vmspace->vm_map, &pr->ps_timekeep, round_page(timekeep_sz), ... ps_timekeep need to be 0 here. At the moment, it inherits the value from the parent process in fork(). In struct process in sys/proc.h, there is this: /* The following fields are all zeroed upon creation in process_new. */ ... /* End area that is zeroed on creation. */ If I move vaddr_t ps_timekeep;/* User pointer to timekeep */ up into the zeroed area, I get a properly randomized _timekeep in userland. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti wrote: > So I don't know why the address is not randomized, but I bet if I print > pr->ps_sigcode somehow from userland, it will be the same. I just checked that. 60 out of 68 processes have a unique UVA for sigtramp, the other 8 processes are fork's without execve.
Re: userland clock_gettime proof of concept
Paul Irofti wrote: > So I don't know why the address is not randomized, but I bet if I print > pr->ps_sigcode somehow from userland, it will be the same. echo kern.allowkmem=1 >> /etc/sysctl.conf reboot Then procmap can be run on all processes, to see what VA the sigtramp and timekeep land at. It should be an independent random VA in every execve'd process.
Re: userland clock_gettime proof of concept
On Sun, Jun 21, 2020 at 05:42:55PM -0600, Theo de Raadt wrote: > Paul Irofti wrote: > > > > > > > > > ??n 22 iunie 2020 01:26:16 EEST, Christian Weisgerber > > a scris: > > >Christian Weisgerber: > > > > > >> I tweaked the patch locally to make _timekeep a visible global > > >> symbol in libc. > > >> > > >> Printing its value has revealed two issues: > > >> > > >> * The timekeep page is mapped to the same address for every process. > > >> It changes across reboots, but once running, it's always the same. > > >> kettenis suggested > > >> - vaddr_t va; > > >> + vaddr_t va = 0; > > >> in exec_timekeep_map(), but that doesn't make a difference. > > > > > >But that's the kernel mapping, and my observation concerns the > > >userland mapping. So based on this, I moved ps_timekeep up into > > >the fields of struct process that are zeroed on creation. > > >With that, _timekeep is always 0 for all processes. :-/ > > > > > > I don't understand what problem you are trying to solve. Is it that > > timekeep is the same? That's because we create only one page and the > > address gets copied on fork. The diff was not designed to have timekeep > > zero'd on every process so it doesn't account for it. > > > And I think you aren't listening. > > He is saying it is at the same VA in *every* userland process. Since most > processes do use this little system call execve, it is implausible for it > to be at the same place, just like it is implausible for the signal tramp > to be same place, or ld.so, or libc. The code we are talking about is only called from inside this little system call execve by exec_timekeep_map() and fixup(). 683 /* setup new registers and do misc. setup. */ 684 if (pack.ep_emul->e_fixup != NULL) { 685 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) 686 goto free_pack_abort; 687 } ... 694 /* map the process's signal trampoline code */ 695 if (exec_sigcode_map(pr, pack.ep_emul)) 696 goto free_pack_abort; 697 /* map the process's timekeep page */ 698 if (exec_timekeep_map(pr)) 699 goto free_pack_abort; The timekeep map code is doing the same thing as the sigcode map. 880 int 881 exec_timekeep_map(struct process *pr) 882 { 883 size_t timekeep_sz = sizeof(struct timekeep); 884 885 /* 886 * Similar to the sigcode object, except that there is a single 887 * timekeep object, and not one per emulation. 888 */ 889 if (timekeep_object == NULL) { The timekeep_object is checked if allocated and if not it does a uvm_map(kernel_map). The timekeep_object is global so the if condition is only true once. Then a second call to uvm_map() sends it to the process space. Fixup is called before this, which I think is wrong now. 863 a->au_id = AUX_openbsd_timekeep; 864 a->au_v = p->p_p->ps_timekeep; 865 a++; It should be map-fixup rather than fixup-map, right? But even reversing the order leads to the same va address. 683 /* map the process's timekeep page */ 684 if (exec_timekeep_map(pr)) 685 goto free_pack_abort; 686 /* setup new registers and do misc. setup. */ 687 if (pack.ep_emul->e_fixup != NULL) { 688 if ((*pack.ep_emul->e_fixup)(p, &pack) != 0) 689 goto free_pack_abort; 690 } So I don't know why the address is not randomized, but I bet if I print pr->ps_sigcode somehow from userland, it will be the same. Paul
Re: userland clock_gettime proof of concept
> Date: Mon, 22 Jun 2020 13:53:25 +0300 > From: Paul Irofti > > > Still uses uint instead of u_int in places. Still has the pointless > > extra NULL and 0 for timecounters in files that are otherwise > > If you don't like uint, then let's fix what's in the tree in amd64 > (which is how uint got used in my diff too). OK? That is correct (matches the type used in . ok kettenis@ > diff --git sys/arch/amd64/amd64/tsc.c sys/arch/amd64/amd64/tsc.c > index 7a1dcb4ad75..25c98180852 100644 > --- sys/arch/amd64/amd64/tsc.c > +++ sys/arch/amd64/amd64/tsc.c > @@ -42,7 +42,7 @@ int64_t tsc_drift_observed; > volatile int64_t tsc_sync_val; > volatile struct cpu_info *tsc_sync_cpu; > > -uint tsc_get_timecount(struct timecounter *tc); > +u_inttsc_get_timecount(struct timecounter *tc); > > #include "lapic.h" > #if NLAPIC > 0 > @@ -207,7 +207,7 @@ cpu_recalibrate_tsc(struct timecounter *tc) > calibrate_tsc_freq(); > } > > -uint > +u_int > tsc_get_timecount(struct timecounter *tc) > { > return rdtsc() + curcpu()->ci_tsc_skew; >
Re: userland clock_gettime proof of concept
> Still uses uint instead of u_int in places. Still has the pointless > extra NULL and 0 for timecounters in files that are otherwise If you don't like uint, then let's fix what's in the tree in amd64 (which is how uint got used in my diff too). OK? diff --git sys/arch/amd64/amd64/tsc.c sys/arch/amd64/amd64/tsc.c index 7a1dcb4ad75..25c98180852 100644 --- sys/arch/amd64/amd64/tsc.c +++ sys/arch/amd64/amd64/tsc.c @@ -42,7 +42,7 @@ int64_t tsc_drift_observed; volatile int64_t tsc_sync_val; volatile struct cpu_info *tsc_sync_cpu; -uint tsc_get_timecount(struct timecounter *tc); +u_int tsc_get_timecount(struct timecounter *tc); #include "lapic.h" #if NLAPIC > 0 @@ -207,7 +207,7 @@ cpu_recalibrate_tsc(struct timecounter *tc) calibrate_tsc_freq(); } -uint +u_int tsc_get_timecount(struct timecounter *tc) { return rdtsc() + curcpu()->ci_tsc_skew;
Re: userland clock_gettime proof of concept
On Mon, Jun 22, 2020 at 01:27:22AM +0200, Mark Kettenis wrote: > > Date: Mon, 22 Jun 2020 02:06:39 +0300 > > From: Paul Irofti > > > > În 22 iunie 2020 00:15:59 EEST, Christian Weisgerber a > > scris: > > >Paul Irofti: > > > > > >[Unrelated, just to mark where we're at] > > >> Right. Just reproduced it here. This moves the check at the top so > > >that > > >> each CPU checks its own skew and disables tc_user if necessary. > > > > > >I tweaked the patch locally to make _timekeep a visible global > > >symbol in libc. > > > > > >Printing its value has revealed two issues: > > > > > >* The timekeep page is mapped to the same address for every process. > > > It changes across reboots, but once running, it's always the same. > > > kettenis suggested > > > - vaddr_t va; > > > + vaddr_t va = 0; > > > in exec_timekeep_map(), but that doesn't make a difference. > > > > The va is set a few lines down the line. No point in > > initialization. This is identical behavior to the emul mapping > > before timekeep. > > Well, uvm_map() picks a virtual address based on the value of va that > is passed in. If it is zero, it picks a random address. If not, it > uses the value as a hint and tries to pick something nearby. So > passing in stack garbage is a bad thing. But uoffset=0 means it is not UVM_UNKNOWN_OFFSET (-1) and we have a non-NULL uobj, so my understanding is that the va address is ignored in this case. So it does not need to be initialized. Right? if (uvm_map(kernel_map, &va, round_page(timekeep_sz), timekeep_object, 0, 0, UVM_MAPFLAG(PROT_READ | PROT_WRITE, PROT_READ | PROT_WRITE, MAP_INHERIT_SHARE, MADV_RANDOM, 0))) { None the less, I added va=0 in my diff. But I think it is pointless. If you disagree, then do you OK the following diff? diff --git sys/kern/kern_exec.c sys/kern/kern_exec.c index 20480c2fc28..2b2b4f15222 100644 --- sys/kern/kern_exec.c +++ sys/kern/kern_exec.c @@ -828,7 +828,7 @@ exec_sigcode_map(struct process *pr, struct emul *e) extern int sigfillsiz; extern u_char sigfill[]; size_t off; - vaddr_t va; + vaddr_t va = 0; int r; e->e_sigobject = uao_create(sz, 0);
Re: userland clock_gettime proof of concept
Paul Irofti wrote: > > > > Ãn 22 iunie 2020 01:26:16 EEST, Christian Weisgerber a > scris: > >Christian Weisgerber: > > > >> I tweaked the patch locally to make _timekeep a visible global > >> symbol in libc. > >> > >> Printing its value has revealed two issues: > >> > >> * The timekeep page is mapped to the same address for every process. > >> It changes across reboots, but once running, it's always the same. > >> kettenis suggested > >> - vaddr_t va; > >> + vaddr_t va = 0; > >> in exec_timekeep_map(), but that doesn't make a difference. > > > >But that's the kernel mapping, and my observation concerns the > >userland mapping. So based on this, I moved ps_timekeep up into > >the fields of struct process that are zeroed on creation. > >With that, _timekeep is always 0 for all processes. :-/ > > > I don't understand what problem you are trying to solve. Is it that timekeep > is the same? That's because we create only one page and the address gets > copied on fork. The diff was not designed to have timekeep zero'd on every > process so it doesn't account for it. And I think you aren't listening. He is saying it is at the same VA in *every* userland process. Since most processes do use this little system call execve, it is implausible for it to be at the same place, just like it is implausible for the signal tramp to be same place, or ld.so, or libc.
Re: userland clock_gettime proof of concept
> Date: Mon, 22 Jun 2020 02:06:39 +0300 > From: Paul Irofti > > În 22 iunie 2020 00:15:59 EEST, Christian Weisgerber a > scris: > >Paul Irofti: > > > >[Unrelated, just to mark where we're at] > >> Right. Just reproduced it here. This moves the check at the top so > >that > >> each CPU checks its own skew and disables tc_user if necessary. > > > >I tweaked the patch locally to make _timekeep a visible global > >symbol in libc. > > > >Printing its value has revealed two issues: > > > >* The timekeep page is mapped to the same address for every process. > > It changes across reboots, but once running, it's always the same. > > kettenis suggested > > - vaddr_t va; > > + vaddr_t va = 0; > > in exec_timekeep_map(), but that doesn't make a difference. > > The va is set a few lines down the line. No point in > initialization. This is identical behavior to the emul mapping > before timekeep. Well, uvm_map() picks a virtual address based on the value of va that is passed in. If it is zero, it picks a random address. If not, it uses the value as a hint and tries to pick something nearby. So passing in stack garbage is a bad thing. > > >* I'm indeed seeing init(8) with _timekeep == NULL. > > Probably because it is the first process? If you want to follow this > read the kernel init bits and the syscall exec bits. Possible. The way process 1 is created is a bit of a hack. Anyway, _timekeep = NULL should not be a problem; the code should fall back on using system calls in that case.
Re: userland clock_gettime proof of concept
În 22 iunie 2020 01:26:16 EEST, Christian Weisgerber a scris: >Christian Weisgerber: > >> I tweaked the patch locally to make _timekeep a visible global >> symbol in libc. >> >> Printing its value has revealed two issues: >> >> * The timekeep page is mapped to the same address for every process. >> It changes across reboots, but once running, it's always the same. >> kettenis suggested >> - vaddr_t va; >> + vaddr_t va = 0; >> in exec_timekeep_map(), but that doesn't make a difference. > >But that's the kernel mapping, and my observation concerns the >userland mapping. So based on this, I moved ps_timekeep up into >the fields of struct process that are zeroed on creation. >With that, _timekeep is always 0 for all processes. :-/ I don't understand what problem you are trying to solve. Is it that timekeep is the same? That's because we create only one page and the address gets copied on fork. The diff was not designed to have timekeep zero'd on every process so it doesn't account for it.
Re: userland clock_gettime proof of concept
În 22 iunie 2020 00:15:59 EEST, Christian Weisgerber a scris: >Paul Irofti: > >[Unrelated, just to mark where we're at] >> Right. Just reproduced it here. This moves the check at the top so >that >> each CPU checks its own skew and disables tc_user if necessary. > >I tweaked the patch locally to make _timekeep a visible global >symbol in libc. > >Printing its value has revealed two issues: > >* The timekeep page is mapped to the same address for every process. > It changes across reboots, but once running, it's always the same. > kettenis suggested > - vaddr_t va; > + vaddr_t va = 0; > in exec_timekeep_map(), but that doesn't make a difference. The va is set a few lines down the line. No point in initialization. This is identical behavior to the emul mapping before timekeep. >* I'm indeed seeing init(8) with _timekeep == NULL. Probably because it is the first process? If you want to follow this read the kernel init bits and the syscall exec bits.
Re: userland clock_gettime proof of concept
Christian Weisgerber: > I tweaked the patch locally to make _timekeep a visible global > symbol in libc. > > Printing its value has revealed two issues: > > * The timekeep page is mapped to the same address for every process. > It changes across reboots, but once running, it's always the same. > kettenis suggested > - vaddr_t va; > + vaddr_t va = 0; > in exec_timekeep_map(), but that doesn't make a difference. But that's the kernel mapping, and my observation concerns the userland mapping. So based on this, I moved ps_timekeep up into the fields of struct process that are zeroed on creation. With that, _timekeep is always 0 for all processes. :-/ -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti: [Unrelated, just to mark where we're at] > Right. Just reproduced it here. This moves the check at the top so that > each CPU checks its own skew and disables tc_user if necessary. I tweaked the patch locally to make _timekeep a visible global symbol in libc. Printing its value has revealed two issues: * The timekeep page is mapped to the same address for every process. It changes across reboots, but once running, it's always the same. kettenis suggested - vaddr_t va; + vaddr_t va = 0; in exec_timekeep_map(), but that doesn't make a difference. * I'm indeed seeing init(8) with _timekeep == NULL. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
On Sun, Jun 21, 2020 at 08:18:57PM +0200, Christian Weisgerber wrote: > Paul Irofti: > > > Can't test right now, but if you enable the TSC_DEBUG in cpu.c or if you > > put a printf in the CPU_INFO_FOREACH you will probably see the correct > > skew values. > > It's worse: CPU_INFO_FOREACH() only sees cpu0. The others aren't > attached yet. Right. Just reproduced it here. This moves the check at the top so that each CPU checks its own skew and disables tc_user if necessary. diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING
Re: userland clock_gettime proof of concept
Paul Irofti: > > b) Revert _timekeep init (breaks naddy@'s machine) > > Robert helped properly track down this issue to a silly null-ref. If that was indeed the problem... > --- lib/libc/dlfcn/init.c > +++ lib/libc/dlfcn/init.c > @@ -105,6 +107,14 @@ _libc_preinit(int argc, char **argv, char **envp, > dl_cb_cb *cb) > phnum = aux->au_v; > break; > #endif /* !PIC */ > + case AUX_openbsd_timekeep: > + if (_tc_get_timecount) { > + _timekeep = (void *)aux->au_v; > + if (_timekeep && > + _timekeep->tk_version != TK_VERSION) > + _timekeep = NULL; > + } > + break; > } > } > ... how could aux->au_v be NULL here? -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti: > Can't test right now, but if you enable the TSC_DEBUG in cpu.c or if you > put a printf in the CPU_INFO_FOREACH you will probably see the correct > skew values. It's worse: CPU_INFO_FOREACH() only sees cpu0. The others aren't attached yet. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
On Sun, Jun 21, 2020 at 05:44:36PM +0200, Christian Weisgerber wrote: > Paul Irofti: > > > This also handles negative skew values that my prevoius diff did not. > > > --- sys/arch/amd64/amd64/tsc.c > > +++ sys/arch/amd64/amd64/tsc.c > > @@ -216,6 +217,8 @@ tsc_get_timecount(struct timecounter *tc) > > void > > tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq) > > { > > + CPU_INFO_ITERATOR cii; > > + > > #ifdef TSC_DEBUG > > printf("%s: TSC skew=%lld observed drift=%lld\n", __func__, > > (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed); > > @@ -244,8 +247,16 @@ tsc_timecounter_init(struct cpu_info *ci, uint64_t > > cpufreq) > > printf("ERROR: %lld cycle TSC drift observed\n", > > (long long)tsc_drift_observed); > > tsc_timecounter.tc_quality = -1000; > > + tsc_timecounter.tc_user = 0; > > tsc_is_invariant = 0; > > } > > + CPU_INFO_FOREACH(cii, ci) { > > + if (ci->ci_tsc_skew < -TSC_SKEW_MAX || > > + ci->ci_tsc_skew > TSC_SKEW_MAX) { > > + tsc_timecounter.tc_user = 0; > > + break; > > + } > > + } > > > > tc_init(&tsc_timecounter); > > } > > If the output order from TSC_DEBUG in dmesg reflects the actual > execution order, then the relative call order is this: > > cpu0 tsc_timecounter_init > cpu1 cpu_start_secondary > cpu1 tsc_timecounter_init > cpu2 cpu_start_secondary > cpu2 tsc_timecounter_init > cpu3 cpu_start_secondary > cpu3 tsc_timecounter_init > > That CPU_INFO_FOREACH() loop would execute in the very first cpu0 > tsc_timecounter_init() call, _before_ the skews of the other CPUs > are determined in the subsequent cpu_start_secondary() calls. > > So, instead, I think the skew check needs to move to the top of > tsc_timecounter_init, where each secondary CPU checks its own skew > value and knocks out tsc_timecounter.tc_user if there is a problem. > > Unless I'm misunderstanding the whole thing. I think the diff is fine as the skew is computed during cpu_hatch which is the first function called after the MP_TRAMPOLINE and before timecounter_init(). Can't test right now, but if you enable the TSC_DEBUG in cpu.c or if you put a printf in the CPU_INFO_FOREACH you will probably see the correct skew values. If you test before I do and you don't see them, please let me know. Thanks, Paul
Re: userland clock_gettime proof of concept
Paul Irofti: > This also handles negative skew values that my prevoius diff did not. > --- sys/arch/amd64/amd64/tsc.c > +++ sys/arch/amd64/amd64/tsc.c > @@ -216,6 +217,8 @@ tsc_get_timecount(struct timecounter *tc) > void > tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq) > { > + CPU_INFO_ITERATOR cii; > + > #ifdef TSC_DEBUG > printf("%s: TSC skew=%lld observed drift=%lld\n", __func__, > (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed); > @@ -244,8 +247,16 @@ tsc_timecounter_init(struct cpu_info *ci, uint64_t > cpufreq) > printf("ERROR: %lld cycle TSC drift observed\n", > (long long)tsc_drift_observed); > tsc_timecounter.tc_quality = -1000; > + tsc_timecounter.tc_user = 0; > tsc_is_invariant = 0; > } > + CPU_INFO_FOREACH(cii, ci) { > + if (ci->ci_tsc_skew < -TSC_SKEW_MAX || > + ci->ci_tsc_skew > TSC_SKEW_MAX) { > + tsc_timecounter.tc_user = 0; > + break; > + } > + } > > tc_init(&tsc_timecounter); > } If the output order from TSC_DEBUG in dmesg reflects the actual execution order, then the relative call order is this: cpu0 tsc_timecounter_init cpu1 cpu_start_secondary cpu1 tsc_timecounter_init cpu2 cpu_start_secondary cpu2 tsc_timecounter_init cpu3 cpu_start_secondary cpu3 tsc_timecounter_init That CPU_INFO_FOREACH() loop would execute in the very first cpu0 tsc_timecounter_init() call, _before_ the skews of the other CPUs are determined in the subsequent cpu_start_secondary() calls. So, instead, I think the skew check needs to move to the top of tsc_timecounter_init, where each secondary CPU checks its own skew value and knocks out tsc_timecounter.tc_user if there is a problem. Unless I'm misunderstanding the whole thing. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
This also handles negative skew values that my prevoius diff did not. For the last coulpe of weeks people told me that this thread is hard to follow sometimes. You can always get the latest changes here where the actual development takes place. (PR's accepted.) https://github.com/pirofti/openbsd-src/tree/vdso Paul diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE
Re: userland clock_gettime proof of concept
> b) Revert _timekeep init (breaks naddy@'s machine) Robert helped properly track down this issue to a silly null-ref. This new diff addresses this and also does not initialize _timekeep as Mark wanted. diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +static inline u_int +rdtsc(void) +{ + uint32
Re: userland clock_gettime proof of concept
Hi, New iteration that addresses the issues raised by Scott and Mark. a) Use sys/time.h defs by adding _LIBC b) Revert _timekeep init (breaks naddy@'s machine) c) Add TSC_SKEW_MAX thresholding when enabling tc_user d) uint->u_int Item c) adds the code needed for what Mark requested. The value is randomly set at 1,000. As I said earlier I won't do the "research" for this number, but I see a couple other people started to look into it and are discussing with Mark. Good. Paul diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..56016c8eca1 --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,41 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER
Re: userland clock_gettime proof of concept
On 2020-06-20, Christian Weisgerber wrote: > I can't get this revision of the diff to work on amd64: > * patch source > * build and install kernel, reboot > * make build > * reboot -> "Process (pid 1) got signal 11" > > I'm at a loss. As part of the "make build", the new libc is installed > and dynamically linked programs should already be using the userland > gettime calls. Clearly this works. So why does init fail on the > next reboot? I can recover by extracting ./sbin/init from a snapshot in the installer. After that, the system comes up fine in multiuser mode. Nothing else appears to be affected, apart from init. For a while, I had a reproducible situation. When you call init(8) as a normal user in multiuser mode, it will just exit with "init: Operation not permitted". Instead it would segfault! I kept tweaking lib/libc/dlfcn/init.c, rebuilding and reinstalling libc.a, rebuilding init, and watching it segfault. None of the debug write(2)s I inserted would produce any output, it seemed to die before ever reaching _libc_preinit(). I finally ktraced it: 12420 ktrace RET ktrace 0 12420 ktrace CALL execve(0x7f7ec412,0x7f7ec298,0x7f7ec2a8) 12420 ktrace NAMI "./obj/init" 12420 ktrace ARGS [0] = "./obj/init" 12420 init RET execve 0 12420 init PSIG SIGSEGV SIG_DFL code SEGV_MAPERR<1> addr=0x0 trapno=6 12420 init NAMI "init.core" There's not even a kbind(2) there. Then I removed the clearly useless debug write()s... and since then I have a hard time reproducing the problem. It doesn't make any sense. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
> From: Christian Weisgerber > Date: Sat, 20 Jun 2020 19:57:06 - (UTC) > > On 2020-06-19, Paul Irofti wrote: > > > I have addressed your comments bellow, except for the CPU skew one. That > > code disables TSC for all CPUs, not just for PRIMARY. Would you like to > > walk and add code for every CPU to check the drift and then disable the > > TSC? It seems a little too much... > > > > [diff] > > I can't get this revision of the diff to work on amd64: > * patch source > * build and install kernel, reboot > * make build > * reboot -> "Process (pid 1) got signal 11" > > I'm at a loss. As part of the "make build", the new libc is installed > and dynamically linked programs should already be using the userland > gettime calls. Clearly this works. So why does init fail on the > next reboot? Do other static binaries work after your make build finishes? Maybe this is because the _timekeep = NULL -> _timekeep change I suggested that Paul added?
Re: userland clock_gettime proof of concept
On 2020-06-19, Paul Irofti wrote: > I have addressed your comments bellow, except for the CPU skew one. That > code disables TSC for all CPUs, not just for PRIMARY. Would you like to > walk and add code for every CPU to check the drift and then disable the > TSC? It seems a little too much... > > [diff] I can't get this revision of the diff to work on amd64: * patch source * build and install kernel, reboot * make build * reboot -> "Process (pid 1) got signal 11" I'm at a loss. As part of the "make build", the new libc is installed and dynamically linked programs should already be using the userland gettime calls. Clearly this works. So why does init fail on the next reboot? -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
> From: Christian Weisgerber > Date: Sat, 20 Jun 2020 01:20:33 - (UTC) > > On 2020-06-19, Mark Kettenis wrote: > > > I'm talking about *skew*, not drift. If there is a significant drift > > you already knock out the TSC. > > > > What's needed is: > > > > 1. A bit of research of what an acceptable skew is. My hypothesis is > >that on many machines with a single socket the TSCs are actually in > >synch. But the way we measure the skew isn't 100% accurate so we > >still get a small skew. If we sample these values on a couple of > >machines across a couple of reboots we can probably tell what the > >uncertainty in the measurement of the skew is and define a cutoff > >based on that. > > So we need amd64 snapshots to enable TSC_DEBUG, maybe a bit prettier > like below, and then reports: > > cpu0: Intel(R) Xeon(R) CPU E3-1225 v3 @ 3.20GHz, 3392.69 MHz, 06-3c-03 > > cpu0: TSC skew=0 observed drift=0 > cpu1: TSC skew=-24 observed drift=0 > cpu2: TSC skew=-27 observed drift=0 > cpu3: TSC skew=-25 observed drift=0 > > cpu0: TSC skew=0 observed drift=0 > cpu1: TSC skew=-1 observed drift=0 > cpu2: TSC skew=0 observed drift=0 > cpu3: TSC skew=25 observed drift=0 > > cpu0: TSC skew=0 observed drift=0 > cpu1: TSC skew=-30 observed drift=0 > cpu2: TSC skew=-39 observed drift=0 > cpu3: TSC skew=-41 observed drift=0 > > cpu0: TSC skew=0 observed drift=0 > cpu1: TSC skew=-31 observed drift=0 > cpu2: TSC skew=-37 observed drift=0 > cpu3: TSC skew=-39 observed drift=0 > > cpu0: TSC skew=0 observed drift=0 > cpu1: TSC skew=-31 observed drift=0 > cpu2: TSC skew=-34 observed drift=0 > cpu3: TSC skew=-23 observed drift=0 Yes, enabling that for a bit would help us get an idea. We probably should have it print the TSC frequency as well though. > Index: sys/arch/amd64/amd64/tsc.c > === > RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v > retrieving revision 1.16 > diff -u -p -r1.16 tsc.c > --- sys/arch/amd64/amd64/tsc.c6 Apr 2020 00:01:08 - 1.16 > +++ sys/arch/amd64/amd64/tsc.c19 Jun 2020 23:49:06 - > @@ -217,7 +217,7 @@ void > tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq) > { > #ifdef TSC_DEBUG > - printf("%s: TSC skew=%lld observed drift=%lld\n", __func__, > + printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname, > (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed); > #endif > > -- > Christian "naddy" Weisgerber na...@mips.inka.de > >
Re: userland clock_gettime proof of concept
On 2020-06-19, Mark Kettenis wrote: > I'm talking about *skew*, not drift. If there is a significant drift > you already knock out the TSC. > > What's needed is: > > 1. A bit of research of what an acceptable skew is. My hypothesis is >that on many machines with a single socket the TSCs are actually in >synch. But the way we measure the skew isn't 100% accurate so we >still get a small skew. If we sample these values on a couple of >machines across a couple of reboots we can probably tell what the >uncertainty in the measurement of the skew is and define a cutoff >based on that. So we need amd64 snapshots to enable TSC_DEBUG, maybe a bit prettier like below, and then reports: cpu0: Intel(R) Xeon(R) CPU E3-1225 v3 @ 3.20GHz, 3392.69 MHz, 06-3c-03 cpu0: TSC skew=0 observed drift=0 cpu1: TSC skew=-24 observed drift=0 cpu2: TSC skew=-27 observed drift=0 cpu3: TSC skew=-25 observed drift=0 cpu0: TSC skew=0 observed drift=0 cpu1: TSC skew=-1 observed drift=0 cpu2: TSC skew=0 observed drift=0 cpu3: TSC skew=25 observed drift=0 cpu0: TSC skew=0 observed drift=0 cpu1: TSC skew=-30 observed drift=0 cpu2: TSC skew=-39 observed drift=0 cpu3: TSC skew=-41 observed drift=0 cpu0: TSC skew=0 observed drift=0 cpu1: TSC skew=-31 observed drift=0 cpu2: TSC skew=-37 observed drift=0 cpu3: TSC skew=-39 observed drift=0 cpu0: TSC skew=0 observed drift=0 cpu1: TSC skew=-31 observed drift=0 cpu2: TSC skew=-34 observed drift=0 cpu3: TSC skew=-23 observed drift=0 Index: sys/arch/amd64/amd64/tsc.c === RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v retrieving revision 1.16 diff -u -p -r1.16 tsc.c --- sys/arch/amd64/amd64/tsc.c 6 Apr 2020 00:01:08 - 1.16 +++ sys/arch/amd64/amd64/tsc.c 19 Jun 2020 23:49:06 - @@ -217,7 +217,7 @@ void tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq) { #ifdef TSC_DEBUG - printf("%s: TSC skew=%lld observed drift=%lld\n", __func__, + printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname, (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed); #endif -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
> On Jun 19, 2020, at 12:40, Paul Irofti wrote: > > [...] > > + > +static inline void > +bintimeaddfrac(const struct bintime *bt, uint64_t x, struct bintime *ct) > +{ > +ct->sec = bt->sec; > +if (bt->frac > bt->frac + x) > +ct->sec++; > +ct->frac = bt->frac + x; > +} > + > +static inline void > +BINTIME_TO_TIMESPEC(const struct bintime *bt, struct timespec *ts) > +{ > +ts->tv_sec = bt->sec; > +ts->tv_nsec = (long)(((uint64_t)10 * (uint32_t)(bt->frac >> 32)) > >> 32); > +} > + > +static inline void > +BINTIME_TO_TIMEVAL(const struct bintime *bt, struct timeval *tv) > +{ > +tv->tv_sec = bt->sec; > +tv->tv_usec = (long)(((uint64_t)100 * (uint32_t)(bt->frac >> 32)) >> > 32); > +} Is it possible to use the definitions in sys/time.h for these? > + > +static inline void > +bintimeadd(const struct bintime *bt, const struct bintime *ct, > +struct bintime *dt) > +{ > +dt->sec = bt->sec + ct->sec; > +if (bt->frac > bt->frac + ct->frac) > +dt->sec++; > +dt->frac = bt->frac + ct->frac; > +} > + > +static inline void > +bintimesub(const struct bintime *bt, const struct bintime *ct, > +struct bintime *dt) > +{ > +dt->sec = bt->sec - ct->sec; > +if (bt->frac < bt->frac - ct->frac) > +dt->sec--; > +dt->frac = bt->frac - ct->frac; > +} Same thing here.
Re: userland clock_gettime proof of concept
On 2020/06/19 20:28, Paul Irofti wrote: > On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: > > I don't expect userland processes to call CLOCK_UPTIME in a loop like > > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > > doesn't have it ;). > > I don't care eitherway about this. But I don't see why we would not have > this functionality if it is easy to offer. Maybe someone can help us > grep the ports tree for this? Stuart? :) I don't have time to run a grep at the moment but have looked at https://codesearch.debian.net/search?q=CLOCK_UPTIME&literal=1 which I think is good enough to show anything important. Most occurrences are constants in languages. There are a couple of small uses in code but not much e.g. https://sources.debian.org/src/rsyslog/8.2004.0-1/runtime/msg.c/?hl=3857#L3857 https://sources.debian.org/src/mactelnet/0.4.4-4/src/mactelnetd.c/?hl=837#L837 The important ones are gettimeofday/CLOCK_MONOTONIC and to a lesser extent CLOCK_REALTIME.
Re: userland clock_gettime proof of concept
În 19 iunie 2020 23:37:28 EEST, Mark Kettenis a scris: >> Date: Fri, 19 Jun 2020 23:16:26 +0300 >> From: Paul Irofti >> >> În 19 iunie 2020 22:49:32 EEST, Mark Kettenis > a scris: >> >> Date: Fri, 19 Jun 2020 20:28:58 +0300 >> >> From: Paul Irofti >> >> >> >> On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: >> >> > > Date: Fri, 19 Jun 2020 14:31:17 +0300 >> >> > > From: Paul Irofti >> >> > > >> >> > > Hi, >> >> > > >> >> > > Here is another iteration of the diff that addresses all >issues >> >raised >> >> > > in the meantime: >> >> > > >> >> > > - Switch tc to uint >> >> > >> >> > The request was to use u_int, like we de in the kernel. The >uint >> >type >> >> > should not be used in OpenBSD code. >> >> > >> >> > > - Check for version at init and switch to machite/timetc.h >defs >> >> > > - Remove tk_nclocks >> >> > > - Switch to single version and ditch minor/major >> >> > > - Do not enable user TSC for large skew values >> >> > > - Add amd64 clocks and use the define in TSC >> >> > > - Include and add machine/timetc.h >> >> > > >> >> > > As we have seen most architectures have support for clocks now >> >and the >> >> > > above addresses Mark's last concerns. >> >> > > >> >> > > Unless other blocking issues arise, this time around I am >looking >> >for >> >> > > OKs to commit. Theo? Mark? >> >> > >> >> > There is one other issue that I wanted to raise. An that is >> >whether >> >> > we really need to implement CLOCK_UPTINME as a userland clock. >If >> >we >> >> > don't do that we can drop tk_naptime from the shared struct. I >> >> > mention this because th_naptime was only recently added to >struct >> >> > timehands and much more an implementation detail than the other >> >fields. >> >> > >> >> > I don't expect userland processes to call CLOCK_UPTIME in a loop >> >like >> >> > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. >Linux >> >> > doesn't have it ;). >> >> >> >> I don't care eitherway about this. But I don't see why we would >not >> >have >> >> this functionality if it is easy to offer. Maybe someone can help >us >> >> grep the ports tree for this? Stuart? :) >> >> >> >> > We're getting there... >> >> >> >> I have addressed your comments bellow, except for the CPU skew >one. >> >That >> >> code disables TSC for all CPUs, not just for PRIMARY. Would you >like >> >to >> >> walk and add code for every CPU to check the drift and then >disable >> >the >> >> TSC? It seems a little too much... >> > >> >Still uses uint instead of u_int in places. >> >> Ok. I will check that again. >> >> > Still has the pointless >> >extra NULL and 0 for timecounters in files that are otherwise >> >> I am not fixing that. If there's a null present before my diff, then >> there can be a 0 afterwards. If anything my diff unifies this. This >> is silly. > >I'll let others judge that. > >> >And regarding the TSC. That issue is a show-stopper. We can >tolerate >> >a small amout of skew, but not a large amount. Because otherwise a >> >multithreaded process might observe time going backwards. >> >> I don't see how this is still an issue with my diff, which is what I >> said last time. I am stopping the TSC when the drift is larger than >> a random value that I defined a year ago. What more is needed? Can >> you describe in more details? > >I'm talking about *skew*, not drift. If there is a significant drift >you already knock out the TSC. > >What's needed is: > >1. A bit of research of what an acceptable skew is. My hypothesis is > that on many machines with a single socket the TSCs are actually in > synch. But the way we measure the skew isn't 100% accurate so we > still get a small skew. If we sample these values on a couple of > machines across a couple of reboots we can probably tell what the > uncertainty in the measurement of the skew is and define a cutoff > based on that. > >2. When the absolute value of ci->ci_tsc_skew is above this cutoff for > any CPU, you set tsc_timecounter.tc_user to zero. I think you can > do that check in tsc_timecounter_init() but I'm not 100% sure. > Ok. I understand. I will not do that. If somebody wants to, please let us know and continue this discussion with Kettenis until he is satisfied. If the consensus is that this is a blocking issue and nobody wants to step up and do it, then consider the diff abandoned. > >> >> diff --git lib/libc/arch/aarch64/gen/Makefile.inc >> >lib/libc/arch/aarch64/gen/Makefile.inc >> >> index a7b1b73f3ef..ee198f5d611 100644 >> >> --- lib/libc/arch/aarch64/gen/Makefile.inc >> >> +++ lib/libc/arch/aarch64/gen/Makefile.inc >> >> @@ -9,4 +9,4 @@ SRCS+=fpgetmask.c fpgetround.c fpgetsticky.c >> >> SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c >> >> SRCS+= fpclassifyl.c >> >> SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c >> >> -SRCS+= signbitl.c >> >> +SRCS+= signbitl.c usertc.c >> >> diff --git lib/libc/arch/aarch64/gen/usertc.c >> >lib/libc/arch/aar
Re: userland clock_gettime proof of concept
> Date: Fri, 19 Jun 2020 23:16:26 +0300 > From: Paul Irofti > > În 19 iunie 2020 22:49:32 EEST, Mark Kettenis a > scris: > >> Date: Fri, 19 Jun 2020 20:28:58 +0300 > >> From: Paul Irofti > >> > >> On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: > >> > > Date: Fri, 19 Jun 2020 14:31:17 +0300 > >> > > From: Paul Irofti > >> > > > >> > > Hi, > >> > > > >> > > Here is another iteration of the diff that addresses all issues > >raised > >> > > in the meantime: > >> > > > >> > > - Switch tc to uint > >> > > >> > The request was to use u_int, like we de in the kernel. The uint > >type > >> > should not be used in OpenBSD code. > >> > > >> > > - Check for version at init and switch to machite/timetc.h defs > >> > > - Remove tk_nclocks > >> > > - Switch to single version and ditch minor/major > >> > > - Do not enable user TSC for large skew values > >> > > - Add amd64 clocks and use the define in TSC > >> > > - Include and add machine/timetc.h > >> > > > >> > > As we have seen most architectures have support for clocks now > >and the > >> > > above addresses Mark's last concerns. > >> > > > >> > > Unless other blocking issues arise, this time around I am looking > >for > >> > > OKs to commit. Theo? Mark? > >> > > >> > There is one other issue that I wanted to raise. An that is > >whether > >> > we really need to implement CLOCK_UPTINME as a userland clock. If > >we > >> > don't do that we can drop tk_naptime from the shared struct. I > >> > mention this because th_naptime was only recently added to struct > >> > timehands and much more an implementation detail than the other > >fields. > >> > > >> > I don't expect userland processes to call CLOCK_UPTIME in a loop > >like > >> > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > >> > doesn't have it ;). > >> > >> I don't care eitherway about this. But I don't see why we would not > >have > >> this functionality if it is easy to offer. Maybe someone can help us > >> grep the ports tree for this? Stuart? :) > >> > >> > We're getting there... > >> > >> I have addressed your comments bellow, except for the CPU skew one. > >That > >> code disables TSC for all CPUs, not just for PRIMARY. Would you like > >to > >> walk and add code for every CPU to check the drift and then disable > >the > >> TSC? It seems a little too much... > > > >Still uses uint instead of u_int in places. > > Ok. I will check that again. > > > Still has the pointless > >extra NULL and 0 for timecounters in files that are otherwise > > I am not fixing that. If there's a null present before my diff, then > there can be a 0 afterwards. If anything my diff unifies this. This > is silly. I'll let others judge that. > >And regarding the TSC. That issue is a show-stopper. We can tolerate > >a small amout of skew, but not a large amount. Because otherwise a > >multithreaded process might observe time going backwards. > > I don't see how this is still an issue with my diff, which is what I > said last time. I am stopping the TSC when the drift is larger than > a random value that I defined a year ago. What more is needed? Can > you describe in more details? I'm talking about *skew*, not drift. If there is a significant drift you already knock out the TSC. What's needed is: 1. A bit of research of what an acceptable skew is. My hypothesis is that on many machines with a single socket the TSCs are actually in synch. But the way we measure the skew isn't 100% accurate so we still get a small skew. If we sample these values on a couple of machines across a couple of reboots we can probably tell what the uncertainty in the measurement of the skew is and define a cutoff based on that. 2. When the absolute value of ci->ci_tsc_skew is above this cutoff for any CPU, you set tsc_timecounter.tc_user to zero. I think you can do that check in tsc_timecounter_init() but I'm not 100% sure. > >> diff --git lib/libc/arch/aarch64/gen/Makefile.inc > >lib/libc/arch/aarch64/gen/Makefile.inc > >> index a7b1b73f3ef..ee198f5d611 100644 > >> --- lib/libc/arch/aarch64/gen/Makefile.inc > >> +++ lib/libc/arch/aarch64/gen/Makefile.inc > >> @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c > >> SRCS+=fpsetmask.c fpsetround.c fpsetsticky.c > >> SRCS+=fpclassifyl.c > >> SRCS+=isfinitel.c isinfl.c isnanl.c isnormall.c > >> -SRCS+=signbitl.c > >> +SRCS+=signbitl.c usertc.c > >> diff --git lib/libc/arch/aarch64/gen/usertc.c > >lib/libc/arch/aarch64/gen/usertc.c > >> new file mode 100644 > >> index 000..6551854a010 > >> --- /dev/null > >> +++ lib/libc/arch/aarch64/gen/usertc.c > >> @@ -0,0 +1,21 @@ > >> +/*$OpenBSD$ */ > >> +/* > >> + * Copyright (c) 2020 Paul Irofti > >> + * > >> + * Permission to use, copy, modify, and distribute this software for > >any > >> + * purpose with or without fee is hereby granted, provided that the > >above > >> + * copyright notice and this
Re: userland clock_gettime proof of concept
În 19 iunie 2020 22:49:32 EEST, Mark Kettenis a scris: >> Date: Fri, 19 Jun 2020 20:28:58 +0300 >> From: Paul Irofti >> >> On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: >> > > Date: Fri, 19 Jun 2020 14:31:17 +0300 >> > > From: Paul Irofti >> > > >> > > Hi, >> > > >> > > Here is another iteration of the diff that addresses all issues >raised >> > > in the meantime: >> > > >> > > - Switch tc to uint >> > >> > The request was to use u_int, like we de in the kernel. The uint >type >> > should not be used in OpenBSD code. >> > >> > > - Check for version at init and switch to machite/timetc.h defs >> > > - Remove tk_nclocks >> > > - Switch to single version and ditch minor/major >> > > - Do not enable user TSC for large skew values >> > > - Add amd64 clocks and use the define in TSC >> > > - Include and add machine/timetc.h >> > > >> > > As we have seen most architectures have support for clocks now >and the >> > > above addresses Mark's last concerns. >> > > >> > > Unless other blocking issues arise, this time around I am looking >for >> > > OKs to commit. Theo? Mark? >> > >> > There is one other issue that I wanted to raise. An that is >whether >> > we really need to implement CLOCK_UPTINME as a userland clock. If >we >> > don't do that we can drop tk_naptime from the shared struct. I >> > mention this because th_naptime was only recently added to struct >> > timehands and much more an implementation detail than the other >fields. >> > >> > I don't expect userland processes to call CLOCK_UPTIME in a loop >like >> > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux >> > doesn't have it ;). >> >> I don't care eitherway about this. But I don't see why we would not >have >> this functionality if it is easy to offer. Maybe someone can help us >> grep the ports tree for this? Stuart? :) >> >> > We're getting there... >> >> I have addressed your comments bellow, except for the CPU skew one. >That >> code disables TSC for all CPUs, not just for PRIMARY. Would you like >to >> walk and add code for every CPU to check the drift and then disable >the >> TSC? It seems a little too much... > >Still uses uint instead of u_int in places. Ok. I will check that again. > Still has the pointless >extra NULL and 0 for timecounters in files that are otherwise I am not fixing that. If there's a null present before my diff, then there can be a 0 afterwards. If anything my diff unifies this. This is silly. >And regarding the TSC. That issue is a show-stopper. We can tolerate >a small amout of skew, but not a large amount. Because otherwise a >multithreaded process might observe time going backwards. I don't see how this is still an issue with my diff, which is what I said last time. I am stopping the TSC when the drift is larger than a random value that I defined a year ago. What more is needed? Can you describe in more details? Thank you, Paul > >> diff --git lib/libc/arch/aarch64/gen/Makefile.inc >lib/libc/arch/aarch64/gen/Makefile.inc >> index a7b1b73f3ef..ee198f5d611 100644 >> --- lib/libc/arch/aarch64/gen/Makefile.inc >> +++ lib/libc/arch/aarch64/gen/Makefile.inc >> @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c >> SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c >> SRCS+= fpclassifyl.c >> SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c >> -SRCS+= signbitl.c >> +SRCS+= signbitl.c usertc.c >> diff --git lib/libc/arch/aarch64/gen/usertc.c >lib/libc/arch/aarch64/gen/usertc.c >> new file mode 100644 >> index 000..6551854a010 >> --- /dev/null >> +++ lib/libc/arch/aarch64/gen/usertc.c >> @@ -0,0 +1,21 @@ >> +/* $OpenBSD$ */ >> +/* >> + * Copyright (c) 2020 Paul Irofti >> + * >> + * Permission to use, copy, modify, and distribute this software for >any >> + * purpose with or without fee is hereby granted, provided that the >above >> + * copyright notice and this permission notice appear in all copies. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL >WARRANTIES >> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF >> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE >LIABLE FOR >> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY >DAMAGES >> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER >IN AN >> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING >OUT OF >> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. >> + */ >> + >> +#include >> +#include >> + >> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; >> diff --git lib/libc/arch/alpha/gen/Makefile.inc >lib/libc/arch/alpha/gen/Makefile.inc >> index a44599d2cab..2a8abd32b61 100644 >> --- lib/libc/arch/alpha/gen/Makefile.inc >> +++ lib/libc/arch/alpha/gen/Makefile.inc >> @@ -3,5 +3,5 @@ >> >> SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S >> SRCS+= flt_rounds.
Re: userland clock_gettime proof of concept
> Date: Fri, 19 Jun 2020 20:28:58 +0300 > From: Paul Irofti > > On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: > > > Date: Fri, 19 Jun 2020 14:31:17 +0300 > > > From: Paul Irofti > > > > > > Hi, > > > > > > Here is another iteration of the diff that addresses all issues raised > > > in the meantime: > > > > > > - Switch tc to uint > > > > The request was to use u_int, like we de in the kernel. The uint type > > should not be used in OpenBSD code. > > > > > - Check for version at init and switch to machite/timetc.h defs > > > - Remove tk_nclocks > > > - Switch to single version and ditch minor/major > > > - Do not enable user TSC for large skew values > > > - Add amd64 clocks and use the define in TSC > > > - Include and add machine/timetc.h > > > > > > As we have seen most architectures have support for clocks now and the > > > above addresses Mark's last concerns. > > > > > > Unless other blocking issues arise, this time around I am looking for > > > OKs to commit. Theo? Mark? > > > > There is one other issue that I wanted to raise. An that is whether > > we really need to implement CLOCK_UPTINME as a userland clock. If we > > don't do that we can drop tk_naptime from the shared struct. I > > mention this because th_naptime was only recently added to struct > > timehands and much more an implementation detail than the other fields. > > > > I don't expect userland processes to call CLOCK_UPTIME in a loop like > > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > > doesn't have it ;). > > I don't care eitherway about this. But I don't see why we would not have > this functionality if it is easy to offer. Maybe someone can help us > grep the ports tree for this? Stuart? :) > > > We're getting there... > > I have addressed your comments bellow, except for the CPU skew one. That > code disables TSC for all CPUs, not just for PRIMARY. Would you like to > walk and add code for every CPU to check the drift and then disable the > TSC? It seems a little too much... Still uses uint instead of u_int in places. Still has the pointless extra NULL and 0 for timecounters in files that are otherwise And regarding the TSC. That issue is a show-stopper. We can tolerate a small amout of skew, but not a large amount. Because otherwise a multithreaded process might observe time going backwards. > diff --git lib/libc/arch/aarch64/gen/Makefile.inc > lib/libc/arch/aarch64/gen/Makefile.inc > index a7b1b73f3ef..ee198f5d611 100644 > --- lib/libc/arch/aarch64/gen/Makefile.inc > +++ lib/libc/arch/aarch64/gen/Makefile.inc > @@ -9,4 +9,4 @@ SRCS+=fpgetmask.c fpgetround.c fpgetsticky.c > SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c > SRCS+= fpclassifyl.c > SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c > -SRCS+= signbitl.c > +SRCS+= signbitl.c usertc.c > diff --git lib/libc/arch/aarch64/gen/usertc.c > lib/libc/arch/aarch64/gen/usertc.c > new file mode 100644 > index 000..6551854a010 > --- /dev/null > +++ lib/libc/arch/aarch64/gen/usertc.c > @@ -0,0 +1,21 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#include > +#include > + > +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; > diff --git lib/libc/arch/alpha/gen/Makefile.inc > lib/libc/arch/alpha/gen/Makefile.inc > index a44599d2cab..2a8abd32b61 100644 > --- lib/libc/arch/alpha/gen/Makefile.inc > +++ lib/libc/arch/alpha/gen/Makefile.inc > @@ -3,5 +3,5 @@ > > SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S > SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c > \ > - fpsetround.c fpsetsticky.c > + fpsetround.c fpsetsticky.c usertc.c > SRCS+= sigsetjmp.S > diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c > new file mode 100644 > index 000..6551854a010 > --- /dev/null > +++ lib/libc/arch/alpha/gen/usertc.c > @@ -0,0 +1,21 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above
Re: userland clock_gettime proof of concept
> From: "Todd C. Miller" > Date: Fri, 19 Jun 2020 12:24:33 -0600 > > On Fri, 19 Jun 2020 18:52:40 +0200, Mark Kettenis wrote: > > > There is one other issue that I wanted to raise. An that is whether > > we really need to implement CLOCK_UPTINME as a userland clock. If we > > don't do that we can drop tk_naptime from the shared struct. I > > mention this because th_naptime was only recently added to struct > > timehands and much more an implementation detail than the other fields. > > > > I don't expect userland processes to call CLOCK_UPTIME in a loop like > > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > > doesn't have it ;). > > That's not entirely true. On Linux, CLOCK_MONOTONIC does not count > time that the system is suspended so it is analogous to our > CLOCK_UPTIME. On Linux CLOCK_BOOTTIME is the clock that counts > time while suspended. On OpenBSD CLOCK_BOOTTIME and CLOCK_MONOTONIC > are the same but that is not true of other systems. Sure, Linux doesn't faithfully implement POSIX in this respect. But most software doesn't actually care about the difference and Linux does not have CLOCK_UPTIME which was my point. Anyway, if the consensus is that we should offer CLOCK_UPTIME that's fine with me. It just means that we have to be a bit more careful tinkering with the timecounter internals in the future. We can always get out of this hole we're digging by disabling the userland optimization. Yes, folks have to realize that is an optimization, and an optimization that you can't rely on. It won't be available on all machines. And it may not be available when you're doing an upgrade of some sort.
Re: userland clock_gettime proof of concept
On Fri, 19 Jun 2020 18:52:40 +0200, Mark Kettenis wrote: > There is one other issue that I wanted to raise. An that is whether > we really need to implement CLOCK_UPTINME as a userland clock. If we > don't do that we can drop tk_naptime from the shared struct. I > mention this because th_naptime was only recently added to struct > timehands and much more an implementation detail than the other fields. > > I don't expect userland processes to call CLOCK_UPTIME in a loop like > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > doesn't have it ;). That's not entirely true. On Linux, CLOCK_MONOTONIC does not count time that the system is suspended so it is analogous to our CLOCK_UPTIME. On Linux CLOCK_BOOTTIME is the clock that counts time while suspended. On OpenBSD CLOCK_BOOTTIME and CLOCK_MONOTONIC are the same but that is not true of other systems. - todd
Re: userland clock_gettime proof of concept
On Fri, Jun 19, 2020 at 06:52:40PM +0200, Mark Kettenis wrote: > > Date: Fri, 19 Jun 2020 14:31:17 +0300 > > From: Paul Irofti > > > > Hi, > > > > Here is another iteration of the diff that addresses all issues raised > > in the meantime: > > > > - Switch tc to uint > > The request was to use u_int, like we de in the kernel. The uint type > should not be used in OpenBSD code. > > > - Check for version at init and switch to machite/timetc.h defs > > - Remove tk_nclocks > > - Switch to single version and ditch minor/major > > - Do not enable user TSC for large skew values > > - Add amd64 clocks and use the define in TSC > > - Include and add machine/timetc.h > > > > As we have seen most architectures have support for clocks now and the > > above addresses Mark's last concerns. > > > > Unless other blocking issues arise, this time around I am looking for > > OKs to commit. Theo? Mark? > > There is one other issue that I wanted to raise. An that is whether > we really need to implement CLOCK_UPTINME as a userland clock. If we > don't do that we can drop tk_naptime from the shared struct. I > mention this because th_naptime was only recently added to struct > timehands and much more an implementation detail than the other fields. > > I don't expect userland processes to call CLOCK_UPTIME in a loop like > they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux > doesn't have it ;). I don't care eitherway about this. But I don't see why we would not have this functionality if it is easy to offer. Maybe someone can help us grep the ports tree for this? Stuart? :) > We're getting there... I have addressed your comments bellow, except for the CPU skew one. That code disables TSC for all CPUs, not just for PRIMARY. Would you like to walk and add code for every CPU to check the drift and then disable the TSC? It seems a little too much... diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..6551854a010 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timek
Re: userland clock_gettime proof of concept
> Date: Fri, 19 Jun 2020 14:31:17 +0300 > From: Paul Irofti > > Hi, > > Here is another iteration of the diff that addresses all issues raised > in the meantime: > > - Switch tc to uint The request was to use u_int, like we de in the kernel. The uint type should not be used in OpenBSD code. > - Check for version at init and switch to machite/timetc.h defs > - Remove tk_nclocks > - Switch to single version and ditch minor/major > - Do not enable user TSC for large skew values > - Add amd64 clocks and use the define in TSC > - Include and add machine/timetc.h > > As we have seen most architectures have support for clocks now and the > above addresses Mark's last concerns. > > Unless other blocking issues arise, this time around I am looking for > OKs to commit. Theo? Mark? There is one other issue that I wanted to raise. An that is whether we really need to implement CLOCK_UPTINME as a userland clock. If we don't do that we can drop tk_naptime from the shared struct. I mention this because th_naptime was only recently added to struct timehands and much more an implementation detail than the other fields. I don't expect userland processes to call CLOCK_UPTIME in a loop like they tend to do do for CLOCK_MONOTONIC and CLOCK_REALTIME. Linux doesn't have it ;). We're getting there... > diff --git lib/libc/arch/aarch64/gen/Makefile.inc > lib/libc/arch/aarch64/gen/Makefile.inc > index a7b1b73f3ef..ee198f5d611 100644 > --- lib/libc/arch/aarch64/gen/Makefile.inc > +++ lib/libc/arch/aarch64/gen/Makefile.inc > @@ -9,4 +9,4 @@ SRCS+=fpgetmask.c fpgetround.c fpgetsticky.c > SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c > SRCS+= fpclassifyl.c > SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c > -SRCS+= signbitl.c > +SRCS+= signbitl.c usertc.c > diff --git lib/libc/arch/aarch64/gen/usertc.c > lib/libc/arch/aarch64/gen/usertc.c > new file mode 100644 > index 000..3bdea089284 > --- /dev/null > +++ lib/libc/arch/aarch64/gen/usertc.c > @@ -0,0 +1,21 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#include > +#include > + > +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; > diff --git lib/libc/arch/alpha/gen/Makefile.inc > lib/libc/arch/alpha/gen/Makefile.inc > index a44599d2cab..2a8abd32b61 100644 > --- lib/libc/arch/alpha/gen/Makefile.inc > +++ lib/libc/arch/alpha/gen/Makefile.inc > @@ -3,5 +3,5 @@ > > SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S > SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c > \ > - fpsetround.c fpsetsticky.c > + fpsetround.c fpsetsticky.c usertc.c > SRCS+= sigsetjmp.S > diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c > new file mode 100644 > index 000..3bdea089284 > --- /dev/null > +++ lib/libc/arch/alpha/gen/usertc.c > @@ -0,0 +1,21 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#include > +#include > + > +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; > diff --git lib/libc/arch/amd64/gen/Makefile.inc > lib/libc/arch/amd64/gen/Makefile.inc > index e995309ed71..f6349e2b974 100644 > --- lib/libc/arch/amd64/gen/Makefile.inc > +++ lib/libc/arch/amd64/gen/Makefile.inc > @@ -2,6 +2,7 @@ > > SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ > sigsetjmp.S > -SRCS+=
Re: userland clock_gettime proof of concept
On Fri, Jun 19, 2020 at 05:20:24PM +0300, Paul Irofti wrote: > Hi Lucas, > > Will reply inline. > > > As a matter of syntax, there are quite some places with functions > > without parameters defined as `f()` instead of `f(void)`. > > Sure. Good catch. > > > > + if (tc == NULL || tk_user < 1 || tk_user > TC_LAST) > > > > Shouldn't you check for >= TC_LAST in here? Otherwise you'll be reading > > and invoking dragons in the following lines. > > > > *Unless*, the semantic meaning of TC_LAST is to indicate the last valid > > index of get_tc[]. In that case, TC_LAST is defined to 3 in amd64, > > instead of 2. > > You are correct. It should be >= TC_LAST. Fixed locally. > > As a note. This bit will be removed when I commit this. Here it is just for > showing how we can support multiple clocks. On amd64 (at least for now) we > will only have TSC support so all the functions above will also go away and > tc_get_timecount will contain the rdtsc() code. > > The sparc64 bits that will follow this commit will have the correct idiom. So the final diff looks like this (w/o the amd64 multiple clocks PoC). diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..3bdea089284 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..3bdea089284 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..fa0d10207d3 --- /dev/null +++ lib/libc/arch/amd64/gen/
Re: userland clock_gettime proof of concept
Hi Lucas, Will reply inline. As a matter of syntax, there are quite some places with functions without parameters defined as `f()` instead of `f(void)`. Sure. Good catch. + if (tc == NULL || tk_user < 1 || tk_user > TC_LAST) Shouldn't you check for >= TC_LAST in here? Otherwise you'll be reading and invoking dragons in the following lines. *Unless*, the semantic meaning of TC_LAST is to indicate the last valid index of get_tc[]. In that case, TC_LAST is defined to 3 in amd64, instead of 2. You are correct. It should be >= TC_LAST. Fixed locally. As a note. This bit will be removed when I commit this. Here it is just for showing how we can support multiple clocks. On amd64 (at least for now) we will only have TSC support so all the functions above will also go away and tc_get_timecount will contain the rdtsc() code. The sparc64 bits that will follow this commit will have the correct idiom.
Re: userland clock_gettime proof of concept
Hi Paul, I'm just a bystander that likes to read diffs. I can't speak about how the diff fits in the kernel, but I can do some basic checks of correctness. As a matter of syntax, there are quite some places with functions without parameters defined as `f()` instead of `f(void)`. Paul Irofti wrote: > diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c > new file mode 100644 > index 000..ee44d61de4b > --- /dev/null > +++ lib/libc/arch/amd64/gen/usertc.c > @@ -0,0 +1,53 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#include > +#include > + > +static uint > +rdtsc() > +{ > + uint32_t hi, lo; > + asm volatile("rdtsc" : "=a"(lo), "=d"(hi)); > + return ((uint64_t)lo)|(((uint64_t)hi)<<32); > +} > + > +static uint > +acpihpet() > +{ > + return rdtsc(); /* JUST TO COMPILE */ > +} > + > +static uint (*const get_tc[])(void) = > +{ > + [TC_TSC] = rdtsc, > + [TC_HPET] = acpihpet, > +}; > + > +int > +tc_get_timecount(struct timekeep *tk, uint *tc) > +{ > + int tk_user = tk->tk_user; > + > + if (tc == NULL || tk_user < 1 || tk_user > TC_LAST) Shouldn't you check for >= TC_LAST in here? Otherwise you'll be reading and invoking dragons in the following lines. *Unless*, the semantic meaning of TC_LAST is to indicate the last valid index of get_tc[]. In that case, TC_LAST is defined to 3 in amd64, instead of 2. > + return -1; > + > + *tc = (*get_tc[tk_user])(); > + return 0; > +} > +int (*const _tc_get_timecount)(struct timekeep *tk, uint *tc) > + = tc_get_timecount; > diff --git sys/arch/amd64/include/timetc.h sys/arch/amd64/include/timetc.h > new file mode 100644 > index 000..72dfc969a76 > --- /dev/null > +++ sys/arch/amd64/include/timetc.h > @@ -0,0 +1,25 @@ > +/* $OpenBSD$ */ > +/* > + * Copyright (c) 2020 Paul Irofti > + * > + * Permission to use, copy, modify, and distribute this software for any > + * purpose with or without fee is hereby granted, provided that the above > + * copyright notice and this permission notice appear in all copies. > + * > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > + */ > + > +#ifndef _MACHINE_TIMETC_H_ > +#define _MACHINE_TIMETC_H_ > + > +#define TC_TSC 1 > +#define TC_HPET 2 > +#define TC_LAST 3 > + > +#endif /* _MACHINE_TIMETC_H_ */
Re: userland clock_gettime proof of concept
Hi, Here is another iteration of the diff that addresses all issues raised in the meantime: - Switch tc to uint - Check for version at init and switch to machite/timetc.h defs - Remove tk_nclocks - Switch to single version and ditch minor/major - Do not enable user TSC for large skew values - Add amd64 clocks and use the define in TSC - Include and add machine/timetc.h As we have seen most architectures have support for clocks now and the above addresses Mark's last concerns. Unless other blocking issues arise, this time around I am looking for OKs to commit. Theo? Mark? Paul diff --git lib/libc/arch/aarch64/gen/Makefile.inc lib/libc/arch/aarch64/gen/Makefile.inc index a7b1b73f3ef..ee198f5d611 100644 --- lib/libc/arch/aarch64/gen/Makefile.inc +++ lib/libc/arch/aarch64/gen/Makefile.inc @@ -9,4 +9,4 @@ SRCS+= fpgetmask.c fpgetround.c fpgetsticky.c SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c SRCS+= fpclassifyl.c SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c -SRCS+= signbitl.c +SRCS+= signbitl.c usertc.c diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c new file mode 100644 index 000..3bdea089284 --- /dev/null +++ lib/libc/arch/aarch64/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; diff --git lib/libc/arch/alpha/gen/Makefile.inc lib/libc/arch/alpha/gen/Makefile.inc index a44599d2cab..2a8abd32b61 100644 --- lib/libc/arch/alpha/gen/Makefile.inc +++ lib/libc/arch/alpha/gen/Makefile.inc @@ -3,5 +3,5 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \ - fpsetround.c fpsetsticky.c + fpsetround.c fpsetsticky.c usertc.c SRCS+= sigsetjmp.S diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c new file mode 100644 index 000..3bdea089284 --- /dev/null +++ lib/libc/arch/alpha/gen/usertc.c @@ -0,0 +1,21 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include + +int (*const _tc_get_timecount)(struct timekeep *, uint *) = NULL; diff --git lib/libc/arch/amd64/gen/Makefile.inc lib/libc/arch/amd64/gen/Makefile.inc index e995309ed71..f6349e2b974 100644 --- lib/libc/arch/amd64/gen/Makefile.inc +++ lib/libc/arch/amd64/gen/Makefile.inc @@ -2,6 +2,7 @@ SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \ sigsetjmp.S -SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c +SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \ + usertc.c SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \ fpsetround.S fpsetsticky.S diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c new file mode 100644 index 000..ee44d61de4b --- /dev/null +++ lib/libc/arch/amd64/gen/usertc.c @@ -0,0 +1,53 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2020 Paul Irofti + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE A
Re: userland clock_gettime proof of concept
George Koehler: > --- lib/libc/arch/powerpc/gen/usertc.c.before Sat Jun 13 21:28:50 2020 > +++ lib/libc/arch/powerpc/gen/usertc.cSat Jun 13 21:38:52 2020 > @@ -18,4 +18,19 @@ > #include > #include > > -int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) = NULL; > +int > +tc_get_timecount(struct timekeep *tk, uint64_t *tc) > +{ > + uint64_t tb; > + uint32_t scratch; > + > + if (tk->tk_user != 1) > + return -1; > + > + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;" > + " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch)); You only need the lower register. Compare the kernel timecounter: u_int tb_get_timecount(struct timecounter *tc) { return ppc_mftbl(); } As I mentioned before, the declaration of the timecounter value as uint64_t is confusing and should be changed. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
This is macppc, goes on top of your diff from Thu, 11 Jun 2020 16:27:03 +0300. Do integrate this small diff into your big diff. If you need to change the macppc code but can't build it, then do the change, and allow me or someone else to build it. I'm gkoehler@. macppc is a simple arch: there is only 1 timecounter, where tk->tk_user == 1; and we will probably never add another timecounter. The pointer _tc_get_timecount seems unnecessary, because macppc never sets the pointer to NULL. The function tc_get_timecount might work as a static inline function, but then it would need to be in a header (libc/arch/powerpc/usertc.h?) and not in usertc.c. I don't check tk->tk_count. If the count is 2 or 3, then we can ignore it, because we only handle tk->tk_user == 1. If the count is 0, then the kernel should not give us a timekeep page with tk->tk_user == 1. I don't know that tk->tk_count needs to exist. I don't check tc == NULL (but you do in amd64). The only caller tc_delta() in libc/sys/microtime.c never passes NULL. In my last mail (31 May), I wanted to #include and call ppc_mftb(). I change my mind. It is more difficult to change the kernel headers to expose ppc_mftb() outside _KERNEL, and easier to copy the __asm, so I copy the __asm.--George --- lib/libc/arch/powerpc/gen/usertc.c.before Sat Jun 13 21:28:50 2020 +++ lib/libc/arch/powerpc/gen/usertc.c Sat Jun 13 21:38:52 2020 @@ -18,4 +18,19 @@ #include #include -int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) = NULL; +int +tc_get_timecount(struct timekeep *tk, uint64_t *tc) +{ + uint64_t tb; + uint32_t scratch; + + if (tk->tk_user != 1) + return -1; + + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;" + " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch)); + *tc = tb; + return 0; +} +int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) + = tc_get_timecount; --- sys/arch/macppc/macppc/machdep.c.before Sat Jun 13 18:07:27 2020 +++ sys/arch/macppc/macppc/machdep.cSat Jun 13 18:07:38 2020 @@ -351,7 +351,7 @@ } /* timekeep number of user accesible clocks */ -int tk_nclocks = 0; +int tk_nclocks = 1; /* * safepri is a safe priority for sleep to set for a spin-wait --- sys/arch/macppc/macppc/clock.c.before Sat Jun 13 18:39:58 2020 +++ sys/arch/macppc/macppc/clock.c Sat Jun 13 18:40:10 2020 @@ -57,7 +57,7 @@ static int32_t ticks_per_intr; static struct timecounter tb_timecounter = { - tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 0 + tb_get_timecount, NULL, 0x7fff, 0, "tb", 0, NULL, 1 }; /* calibrate the timecounter frequency for the listed models */
Re: userland clock_gettime proof of concept
> From: Paul Irofti > Date: Fri, 12 Jun 2020 12:30:22 +0300 > > On 12.06.2020 10:48, Robert Nagy wrote: > > On 11/06/20 20:10 +0200, Mark Kettenis wrote: > >>> Date: Thu, 11 Jun 2020 19:38:48 +0200 > >>> From: Christian Weisgerber > >>> > >>> Theo de Raadt: > >>> > The diff is growing complexity to support a future which wouldn't > exist if attempts at *supporting all* architectures received priority. > >>> > >>> Adding support for more archs is very simple, since you just need > >>> to copy the corresponding get_timecounter function from the kernel. > >>> > >>> Here's arm64. I'm running a kernel and libc with this. > >>> > >>> I can also provide alpha, powerpc, and sparc64, but I don't have > >>> such machines. > >> > >> Hope you didn't spend too much time on that, because I already > >> mentioned that I had arm64 working earlier in the thread. > >> > >> I've just fired up one of my sparc64 machines such that I can check > >> how well the approach works for an architecture with two exported > >> timecounters. > > > > Then please share the patches so that it can be integrated into the > > main diff so that when the time comes it can go in at one shot. > > > > Also it would help to avoid duplicate work. > > I will respond to deraadt@'s question about where are the clocks that I > mentioned we already have support for and to your message also: > > - there are diffs in this thread providing diffs for those archs > - I did not integrate them in the big diff because I did not want to > get into an argument about who commits what (some people care about > their commits number apparently) > > I can integrate those clocks easily of course if I am allowed to commit > them too at the end, if not, I don't want to get involved in that as I > will probably forget some and drama will happen. > > So from kettenis's comments I gather we have: amd64, macppc, sparc64 and > arm64? I now have a working implementation on sparc64, where I can switch between %tick and %sys_tick while a process doing repeated gettimeofday() calls keeps running. However, on this machine both counters are synchronized so that isn't the most significant test of this functionality.
Re: userland clock_gettime proof of concept
On 12.06.2020 10:48, Robert Nagy wrote: On 11/06/20 20:10 +0200, Mark Kettenis wrote: Date: Thu, 11 Jun 2020 19:38:48 +0200 From: Christian Weisgerber Theo de Raadt: The diff is growing complexity to support a future which wouldn't exist if attempts at *supporting all* architectures received priority. Adding support for more archs is very simple, since you just need to copy the corresponding get_timecounter function from the kernel. Here's arm64. I'm running a kernel and libc with this. I can also provide alpha, powerpc, and sparc64, but I don't have such machines. Hope you didn't spend too much time on that, because I already mentioned that I had arm64 working earlier in the thread. I've just fired up one of my sparc64 machines such that I can check how well the approach works for an architecture with two exported timecounters. Then please share the patches so that it can be integrated into the main diff so that when the time comes it can go in at one shot. Also it would help to avoid duplicate work. I will respond to deraadt@'s question about where are the clocks that I mentioned we already have support for and to your message also: - there are diffs in this thread providing diffs for those archs - I did not integrate them in the big diff because I did not want to get into an argument about who commits what (some people care about their commits number apparently) I can integrate those clocks easily of course if I am allowed to commit them too at the end, if not, I don't want to get involved in that as I will probably forget some and drama will happen. So from kettenis's comments I gather we have: amd64, macppc, sparc64 and arm64?
Re: userland clock_gettime proof of concept
On 11/06/20 20:10 +0200, Mark Kettenis wrote: > > Date: Thu, 11 Jun 2020 19:38:48 +0200 > > From: Christian Weisgerber > > > > Theo de Raadt: > > > > > The diff is growing complexity to support a future which wouldn't > > > exist if attempts at *supporting all* architectures received priority. > > > > Adding support for more archs is very simple, since you just need > > to copy the corresponding get_timecounter function from the kernel. > > > > Here's arm64. I'm running a kernel and libc with this. > > > > I can also provide alpha, powerpc, and sparc64, but I don't have > > such machines. > > Hope you didn't spend too much time on that, because I already > mentioned that I had arm64 working earlier in the thread. > > I've just fired up one of my sparc64 machines such that I can check > how well the approach works for an architecture with two exported > timecounters. Then please share the patches so that it can be integrated into the main diff so that when the time comes it can go in at one shot. Also it would help to avoid duplicate work.
Re: userland clock_gettime proof of concept
Mark Kettenis wrote: > suggestion to have an MD header file actually helps. With that file > we can easily avoid the function pointer and it becomes easier adding > new timecounters in the kernel (which happens from time to time). When does this happen? If it happens, what's the fuss? The new code handles the new timecounter. Versioning tends to be important for removing support, or changing it in a way that looks like a removal. Additions can be indicated by the timecounter # being higher than before.
Re: userland clock_gettime proof of concept
> Date: Thu, 11 Jun 2020 19:38:48 +0200 > From: Christian Weisgerber > > Theo de Raadt: > > > The diff is growing complexity to support a future which wouldn't > > exist if attempts at *supporting all* architectures received priority. > > Adding support for more archs is very simple, since you just need > to copy the corresponding get_timecounter function from the kernel. > > Here's arm64. I'm running a kernel and libc with this. > > I can also provide alpha, powerpc, and sparc64, but I don't have > such machines. Hope you didn't spend too much time on that, because I already mentioned that I had arm64 working earlier in the thread. I've just fired up one of my sparc64 machines such that I can check how well the approach works for an architecture with two exported timecounters. As for decreasing the complexity of the diff, I think naddy's suggestion to have an MD header file actually helps. With that file we can easily avoid the function pointer and it becomes easier adding new timecounters in the kernel (which happens from time to time). Further opportunities to decrease the complexity: 1. Drop support for CLOCK_UPTIME and CLOCK_BOOTTIME. 2. Avoid using bintime. That would decouple the libc code more from the kernel timecounter implementation details. The downside of this is that we can't just copy and paste kernel code like we do now so this may not be a genuine benefit. I consider that mostly polishing though, although the polishing has to be done before this hits the tree. The real question that still need to be answered are: 1. Are we happy with an implementation that can only support clock/architectures that have a clock that can be accessed through a register or special instruction? 2. What are we going to do with the TSC on amd64 with regards to them not being synched up properly. For me the answers are: 1. Yes. 2. Not expose the TSC to userland if there is a significant skew. Cheers, Mark > --- ./lib/libc/arch/aarch64/gen/usertc.c.orig Thu Jun 11 19:07:39 2020 > +++ ./lib/libc/arch/aarch64/gen/usertc.c Thu Jun 11 19:08:01 2020 > @@ -18,4 +18,32 @@ > #include > #include > > -int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) = NULL; > +static uint64_t > +readcnt64(void) > +{ > + uint64_t val0, val1; > + > + /* > + * Work around Cortex-A73 errata 858921, where there is a > + * one-cycle window where the read might return the old value > + * for the low 32 bits and the new value for the high 32 bits > + * upon roll-over of the low 32 bits. > + */ > + __asm volatile("isb" : : : "memory"); > + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val0)); > + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val1)); > + return ((val0 ^ val1) & 0x1ULL) ? val0 : val1; > +} > + > +int > +tc_get_timecount(struct timekeep *tk, uint64_t *tc) > +{ > + if (tc == NULL || tk->tk_user != 1) > + return -1; > + > + *tc = readcnt64(); > + return 0; > +} > + > +int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) > + = tc_get_timecount; > --- ./sys/arch/arm64/arm64/machdep.c.orig Thu Jun 11 17:46:54 2020 > +++ ./sys/arch/arm64/arm64/machdep.c Thu Jun 11 17:46:59 2020 > @@ -91,7 +91,7 @@ > charmachine[] = MACHINE;/* from */ > > /* timekeep number of user accesible clocks */ > -int tk_nclocks = 0; > +int tk_nclocks = 1; > > int safepri = 0; > > --- ./sys/arch/arm64/dev/agtimer.c.orig Thu Jun 11 17:47:23 2020 > +++ ./sys/arch/arm64/dev/agtimer.cThu Jun 11 17:47:27 2020 > @@ -43,7 +43,7 @@ > u_int agtimer_get_timecount(struct timecounter *); > > static struct timecounter agtimer_timecounter = { > - agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 0 > + agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 1 > }; > > struct agtimer_pcpu_softc { > -- > Christian "naddy" Weisgerber na...@mips.inka.de > >
Re: userland clock_gettime proof of concept
Theo de Raadt: > The diff is growing complexity to support a future which wouldn't > exist if attempts at *supporting all* architectures received priority. Adding support for more archs is very simple, since you just need to copy the corresponding get_timecounter function from the kernel. Here's arm64. I'm running a kernel and libc with this. I can also provide alpha, powerpc, and sparc64, but I don't have such machines. --- ./lib/libc/arch/aarch64/gen/usertc.c.orig Thu Jun 11 19:07:39 2020 +++ ./lib/libc/arch/aarch64/gen/usertc.cThu Jun 11 19:08:01 2020 @@ -18,4 +18,32 @@ #include #include -int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) = NULL; +static uint64_t +readcnt64(void) +{ + uint64_t val0, val1; + + /* +* Work around Cortex-A73 errata 858921, where there is a +* one-cycle window where the read might return the old value +* for the low 32 bits and the new value for the high 32 bits +* upon roll-over of the low 32 bits. +*/ + __asm volatile("isb" : : : "memory"); + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val0)); + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val1)); + return ((val0 ^ val1) & 0x1ULL) ? val0 : val1; +} + +int +tc_get_timecount(struct timekeep *tk, uint64_t *tc) +{ + if (tc == NULL || tk->tk_user != 1) + return -1; + + *tc = readcnt64(); + return 0; +} + +int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) + = tc_get_timecount; --- ./sys/arch/arm64/arm64/machdep.c.orig Thu Jun 11 17:46:54 2020 +++ ./sys/arch/arm64/arm64/machdep.cThu Jun 11 17:46:59 2020 @@ -91,7 +91,7 @@ charmachine[] = MACHINE;/* from */ /* timekeep number of user accesible clocks */ -int tk_nclocks = 0; +int tk_nclocks = 1; int safepri = 0; --- ./sys/arch/arm64/dev/agtimer.c.orig Thu Jun 11 17:47:23 2020 +++ ./sys/arch/arm64/dev/agtimer.c Thu Jun 11 17:47:27 2020 @@ -43,7 +43,7 @@ u_int agtimer_get_timecount(struct timecounter *); static struct timecounter agtimer_timecounter = { - agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 0 + agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 1 }; struct agtimer_pcpu_softc { -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Mark Kettenis wrote: > > Date: Thu, 11 Jun 2020 17:29:44 +0200 > > From: Christian Weisgerber > > > > Paul Irofti: > > > > > > Paul, that tk_nclocks addition isn't useful. You need to do the > > > > bounds checking against the number of clocks you have implemented in > > > > libc. How many clocks the kernel has implemented doesn't matter. > > > > > > What you are saying is that we could be in a situation where the kernel > > > might expose 3 clocks but we only have 2 entries in libc? Why would we get > > > to that point? When someone changes the clock in the kernel, that means it > > > is also changed in libc. I don't think we can decouple the two parts. > > > Right? > > > > But we do: make kernel; install kernel; reboot; make build. > > > > To cross from nclocks to nclocks+1, you need to run the new nclocks+1 > > kernel with an nclocks userland. > > > > I keep coming back to the idea that we need an > > header with > > > > #define TC_FOO 1 > > #define TC_BAR 2 > > #define TC_NUM 3/* or TC_LAST or whatever */ > > > > Mark may have a better idea how to name this. > > I think is fine. Architectures without exportable > timecounters could just do > > #define TC_LAST 0 > > (I do think TC_LAST is a bit better for this reason). What is the cause of all this complex versioning? We'll have architectures which will use this, and they have one clock. We'll have architectures which can't export clocks and won't use this. We'll have architectures with 1 or 2 clocks, and need to make a choice. If we can't make a choice, the old code is used. So the safe choice always remains. The complexity of this proposal is so people can build through the ABI break if the structure changes, but I argue, it should *not get changed ever* because it should be properly designed for *all needs* from the getgo. Once an architecture supports this method, why does it ever need to change? If this solution was MD, we wouldn't even be having this discussion about versioning. amd64 would switch over, and the work would be done. And if something needs a tweak, we would handle it as regular ABI break, something we understand. Per-architecture. But it is being built MI. So the only reason I see for this complexity, is if the MI design didn't anticipate the needs of some architecture. And then adding support, means the MI structure has to change. So this argument is because the diff *does not try to support all architectures yet* and *fails to ancicipate their needs* The diff is growing complexity to support a future which wouldn't exist if attempts at *supporting all* architectures received priority. Then, the structure would be correct on the day it gets commited. Instead, deckchairs are being staged.
Re: userland clock_gettime proof of concept
> Date: Thu, 11 Jun 2020 17:29:44 +0200 > From: Christian Weisgerber > > Paul Irofti: > > > > Paul, that tk_nclocks addition isn't useful. You need to do the > > > bounds checking against the number of clocks you have implemented in > > > libc. How many clocks the kernel has implemented doesn't matter. > > > > What you are saying is that we could be in a situation where the kernel > > might expose 3 clocks but we only have 2 entries in libc? Why would we get > > to that point? When someone changes the clock in the kernel, that means it > > is also changed in libc. I don't think we can decouple the two parts. Right? > > But we do: make kernel; install kernel; reboot; make build. > > To cross from nclocks to nclocks+1, you need to run the new nclocks+1 > kernel with an nclocks userland. > > I keep coming back to the idea that we need an > header with > > #define TC_FOO 1 > #define TC_BAR 2 > #define TC_NUM 3/* or TC_LAST or whatever */ > > Mark may have a better idea how to name this. I think is fine. Architectures without exportable timecounters could just do #define TC_LAST 0 (I do think TC_LAST is a bit better for this reason).
Re: userland clock_gettime proof of concept
Paul Irofti: > > Paul, that tk_nclocks addition isn't useful. You need to do the > > bounds checking against the number of clocks you have implemented in > > libc. How many clocks the kernel has implemented doesn't matter. > > What you are saying is that we could be in a situation where the kernel > might expose 3 clocks but we only have 2 entries in libc? Why would we get > to that point? When someone changes the clock in the kernel, that means it > is also changed in libc. I don't think we can decouple the two parts. Right? But we do: make kernel; install kernel; reboot; make build. To cross from nclocks to nclocks+1, you need to run the new nclocks+1 kernel with an nclocks userland. I keep coming back to the idea that we need an header with #define TC_FOO 1 #define TC_BAR 2 #define TC_NUM 3/* or TC_LAST or whatever */ Mark may have a better idea how to name this. -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti: [lib/libc/arch/*/gen/usertc.c] > +int (*const _tc_get_timecount)(struct timekeep *, uint64_t *) = NULL; [lib/libc/sys/microtime.c] > +static inline int > +tc_delta(struct timekeep *tk, u_int *delta) > +{ > + uint64_t tc; > + if (delta == NULL || _tc_get_timecount(tk, &tc)) This use of uint64_t for the timecounter return value confused me. All the kernel code uses u_int as the return value of the timecounter_get functions. The userland code should do the same (or uint32_t or whatever the preferred alias is). A central idea is that we can just copy the MD timecounter_get functions from the kernel to the userland code. They all return only 32 bits. (If the hardware register has more bits, like TSC or TICK, the value is truncated.) -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: userland clock_gettime proof of concept
Paul Irofti wrote: > > I want to see this diff support 3-4 architectures before commit. > > Sure. Whenever you feel confident. As I said numerous times now here, > nobody is pressuring this with a commit. > > I think we support already 3 architectures: amd64, macppc, sparc64 > (kettenis?). I don't see it.
Re: userland clock_gettime proof of concept
Paul Irofti wrote: > On Thu, Jun 11, 2020 at 02:05:44PM +0200, Marc Espie wrote: > > On Thu, Jun 11, 2020 at 01:13:07PM +0300, Paul Irofti wrote: > > > On 2020-06-11 02:46, Christian Weisgerber wrote: > > > > Paul Irofti: > > > > > > > > > This iteration of the diff adds bounds checking for tk_user and moves > > > > > the usertc.c stub to every arch in libc as recommanded by deraadt@. > > > > > It also fixes a gettimeofday issue reported by cheloha@ and tb@. > > > > > > > > Additionally, it changes struct timekeep in an incompatible way. ;-) > > > > A userland built before the addition of tk_nclocks is very unhappy > > > > with a kernel built afterwards. There is no way to compile across > > > > this. You have to (U)pgrade from boot media to install a > > > > ftp.openbsd.org > > > > userland, and then you can re-compile with the new diff. > > > > > > I have not seen this problem and have not built a snapshot to update or go > > > back. What do you mean by very unhappy? Can you show me the exact steps > > > you > > > have done? > > > > > > > Should we already bump major while the diff matures? > > > > > > I am not a fan of this. I don't like bumping something before it is > > > actually > > > used. It is like an errata before a release. > > > > So what if we end at version 200 ? > > > > we've got a full uint32_t for crying out loud, you're not going to run out > > of numbers. > > > > Besides, it's something that's entirely invisible to users, even more so > > than library major/minors. > > This is not about the range available to us. > > If I bump then I will have to also add checks for the revision. > Otherwise what is the point of the bump? And then what? Keep old and > new code around for both revisions? yes, because the "old code" remains. It's the system calls we use today.