Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/13/2018 09:35 PM, Andy Lutomirski wrote:
>> Somewhat special, yes, but not overly so, and not in the
>> type-polymorphic sense.  We can't give direct access of the vDSO
>> implementation to applications because the kernel does not know about
>> the userspace errno variable.  We do that for time on x86_64, where
>> applications call into the vDSO directly, bypassing glibc completely
>> after binding.
>
> If the vDSO adds special helpers for CLOCK_MONOTONIC and
> CLOCK_REALTIME, I think we can reasonably safely promise that they
> never fail.  (seccomp can obviously break that promise if there’s no
> TSC, but I think that seccomp users who do that get to keep both
> pieces.)

I agree, I thought about the same thing.  We already do not return
EFAULT for invalid pointers, for obvious reasons.  And if the clock ID
is fixed, the EINVAL error is impossible.

That would shave off a few nanoseconds more if the calling convention is
identical to what glibc exposes to applications.

If the vDSO is not available or the symbol is missing, we can provide an
implementation based on the current clock_gettime in glibc.

Thanks,
Florian
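A minimal sketch of the errno point made in this message, under stated assumptions: `raw_gettime_fn`, `fake_vdso_gettime`, `wrapped_gettime` and `monotonic_now` are hypothetical names invented for illustration, not glibc or kernel symbols. The stand-in entry point follows the kernel-style convention of returning a negative errno value, and the wrapper translates that into the -1/errno convention applications expect. A NULL pointer models the "vDSO not available or symbol missing" case and falls back to the ordinary clock_gettime.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical kernel-style entry point: returns 0 on success or a
 * negative errno value on failure, and never touches errno itself. */
typedef int (*raw_gettime_fn)(clockid_t, struct timespec *);

/* Stand-in for a vDSO symbol (illustration only): knows two clocks. */
static int fake_vdso_gettime(clockid_t clk, struct timespec *ts)
{
    if (clk != CLOCK_REALTIME && clk != CLOCK_MONOTONIC)
        return -EINVAL;
    ts->tv_sec = 1;
    ts->tv_nsec = 0;
    return 0;
}

/* libc-style wrapper: map the raw result onto the -1/errno convention.
 * A NULL fn models "no vDSO or symbol missing" and uses clock_gettime. */
static int wrapped_gettime(raw_gettime_fn fn, clockid_t clk, struct timespec *ts)
{
    int ret;

    if (fn == NULL)
        return clock_gettime(clk, ts);
    ret = fn(clk, ts);
    if (ret < 0) {
        errno = -ret;
        return -1;
    }
    return 0;
}

/* With the clock ID fixed, the EINVAL path is unreachable, so a
 * dedicated helper has no error to report and can return void. */
static void monotonic_now(raw_gettime_fn fn, struct timespec *ts)
{
    int ret = wrapped_gettime(fn, CLOCK_MONOTONIC, ts);

    assert(ret == 0);
    (void)ret;
}
```

The void-returning helper is the shape that could, in principle, be exposed to applications directly, because no errno translation is left to do.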
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
> On Sep 13, 2018, at 12:07 PM, Florian Weimer wrote:
>
> On 09/13/2018 05:22 PM, Andy Lutomirski wrote:
>>> On Sep 13, 2018, at 1:07 AM, Florian Weimer wrote:
>>>
>>> On 09/12/2018 07:11 PM, Andy Lutomirski wrote:
>>>>> The multiplexer interfaces need much more surgery and talking
>>>>> about futex, we'd need to sit down with quite some people and
>>>>> identify the things they actually care about before just
>>>>> splitting it up and keeping the existing overloaded trainwreck
>>>>> the same.
>>>>
>>>> There’s also the issue of how much the speedup matters.  For
>>>> futex, maybe a better interface saves 3ns, but a futex syscall is
>>>> hundreds of ns.  clock_gettime() is called at high frequency and
>>>> can be ~25ns.  Saving a few ns is a bigger deal.
>>>
>>> My concern is that the userspace system call wrappers currently do
>>> not know how many arguments the individual operations take and what
>>> types the arguments have (hence the “type-polymorphic” nature I
>>> mentioned).  This could be a problem for on-stack argument passing
>>> (where you might read values beyond the end of the stack, and glibc
>>> avoids that most of the time by having enough cruft on the stack),
>>> and for architectures which pass pointers and integers in different
>>> registers (like some m68k ABIs do for the return value).
>>
>> Isn’t clock_gettime already special because of the vDSO entry point,
>> though?
>
> Somewhat special, yes, but not overly so, and not in the
> type-polymorphic sense.  We can't give direct access of the vDSO
> implementation to applications because the kernel does not know about
> the userspace errno variable.  We do that for time on x86_64, where
> applications call into the vDSO directly, bypassing glibc completely
> after binding.

If the vDSO adds special helpers for CLOCK_MONOTONIC and CLOCK_REALTIME,
I think we can reasonably safely promise that they never fail.  (seccomp
can obviously break that promise if there’s no TSC, but I think that
seccomp users who do that get to keep both pieces.)
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/13/2018 05:22 PM, Andy Lutomirski wrote:
>> On Sep 13, 2018, at 1:07 AM, Florian Weimer wrote:
>>
>> On 09/12/2018 07:11 PM, Andy Lutomirski wrote:
>>>> The multiplexer interfaces need much more surgery and talking
>>>> about futex, we'd need to sit down with quite some people and
>>>> identify the things they actually care about before just splitting
>>>> it up and keeping the existing overloaded trainwreck the same.
>>>
>>> There’s also the issue of how much the speedup matters.  For futex,
>>> maybe a better interface saves 3ns, but a futex syscall is hundreds
>>> of ns.  clock_gettime() is called at high frequency and can be
>>> ~25ns.  Saving a few ns is a bigger deal.
>>
>> My concern is that the userspace system call wrappers currently do
>> not know how many arguments the individual operations take and what
>> types the arguments have (hence the “type-polymorphic” nature I
>> mentioned).  This could be a problem for on-stack argument passing
>> (where you might read values beyond the end of the stack, and glibc
>> avoids that most of the time by having enough cruft on the stack),
>> and for architectures which pass pointers and integers in different
>> registers (like some m68k ABIs do for the return value).
>
> Isn’t clock_gettime already special because of the vDSO entry point,
> though?

Somewhat special, yes, but not overly so, and not in the
type-polymorphic sense.  We can't give direct access of the vDSO
implementation to applications because the kernel does not know about
the userspace errno variable.  We do that for time on x86_64, where
applications call into the vDSO directly, bypassing glibc completely
after binding.

I suspect most Linux libcs that know about the vDSO at all have generic
vsyscall support, just like they have generic support for plain system
calls.

Thanks,
Florian
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
> On Sep 13, 2018, at 1:07 AM, Florian Weimer wrote:
>
> On 09/12/2018 07:11 PM, Andy Lutomirski wrote:
>>> The multiplexer interfaces need much more surgery and talking about
>>> futex, we'd need to sit down with quite some people and identify
>>> the things they actually care about before just splitting it up and
>>> keeping the existing overloaded trainwreck the same.
>>
>> There’s also the issue of how much the speedup matters.  For futex,
>> maybe a better interface saves 3ns, but a futex syscall is hundreds
>> of ns.  clock_gettime() is called at high frequency and can be
>> ~25ns.  Saving a few ns is a bigger deal.
>
> My concern is that the userspace system call wrappers currently do not
> know how many arguments the individual operations take and what types
> the arguments have (hence the “type-polymorphic” nature I mentioned).
> This could be a problem for on-stack argument passing (where you might
> read values beyond the end of the stack, and glibc avoids that most of
> the time by having enough cruft on the stack), and for architectures
> which pass pointers and integers in different registers (like some
> m68k ABIs do for the return value).

Isn’t clock_gettime already special because of the vDSO entry point,
though?
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/12/2018 07:11 PM, Andy Lutomirski wrote:
>> The multiplexer interfaces need much more surgery and talking about
>> futex, we'd need to sit down with quite some people and identify the
>> things they actually care about before just splitting it up and
>> keeping the existing overloaded trainwreck the same.
>
> There’s also the issue of how much the speedup matters.  For futex,
> maybe a better interface saves 3ns, but a futex syscall is hundreds of
> ns.  clock_gettime() is called at high frequency and can be ~25ns.
> Saving a few ns is a bigger deal.

My concern is that the userspace system call wrappers currently do not
know how many arguments the individual operations take and what types
the arguments have (hence the “type-polymorphic” nature I mentioned).
This could be a problem for on-stack argument passing (where you might
read values beyond the end of the stack, and glibc avoids that most of
the time by having enough cruft on the stack), and for architectures
which pass pointers and integers in different registers (like some m68k
ABIs do for the return value).

Thanks,
Florian
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
> On Sep 12, 2018, at 7:29 AM, Thomas Gleixner wrote:
>
>> On Wed, 12 Sep 2018, Florian Weimer wrote:
>>> On 09/12/2018 04:17 PM, Thomas Gleixner wrote:
>>>> On Wed, 12 Sep 2018, Florian Weimer wrote:
>>>> Does this mean glibc can keep using a single vDSO entrypoint, the
>>>> one we have today?
>>>
>>> We have no intention to change that.
>>
>> Okay, I was wondering because Andy seemed to have proposed just that.
>>
>>> But we surely could provide separate entry points as an extra to
>>> avoid a bunch of conditionals.
>>
>> We could adjust to that, but the benefit would be long-term because
>> it's an ABI change for glibc, and they tend to take a long time to
>> propagate.
>>
>> But I must say that clock_gettime is an odd place to start.  I would
>> have expected any of the type-polymorphic multiplexer interfaces
>> (fcntl, ioctl, ptrace, futex) to be a more natural starting point. 8-)
>
> Well, the starting point of this was to provide clock_tai support in
> the vdso.  clock_gettime() in the vdso vs. the real syscall is a
> factor of 10 in speed.  clock_gettime() is a pretty hot function in
> some workloads.
>
> Andy then noticed that some conditionals could be avoided entirely by
> using a different entry point and offered one along with a 10%
> speedup.  We don't have to go there, we can.
>
> The multiplexer interfaces need much more surgery and talking about
> futex, we'd need to sit down with quite some people and identify the
> things they actually care about before just splitting it up and
> keeping the existing overloaded trainwreck the same.

There’s also the issue of how much the speedup matters.  For futex,
maybe a better interface saves 3ns, but a futex syscall is hundreds of
ns.  clock_gettime() is called at high frequency and can be ~25ns.
Saving a few ns is a bigger deal.
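Per-call figures like the ~25ns mentioned above can be estimated roughly from userspace. The following is only a sketch, not the method used by anyone in this thread: `ts_to_ns` and `avg_gettime_ns` are hypothetical helper names, and the result includes loop overhead, so it is an upper bound on the cost of a single call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static uint64_t ts_to_ns(const struct timespec *ts)
{
    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    return (uint64_t)ts->tv_sec * 1000000000ull + (uint64_t)ts->tv_nsec;
}

/* Average wall-clock cost, in ns, of one clock_gettime(clk) call over
 * `iters` back-to-back calls.  Returns a negative value on failure. */
static double avg_gettime_ns(clockid_t clk, unsigned long iters)
{
    struct timespec start, end, scratch;
    unsigned long i;

    if (iters == 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (i = 0; i < iters; i++) {
        if (clock_gettime(clk, &scratch) != 0)
            return -1.0;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    return (double)(ts_to_ns(&end) - ts_to_ns(&start)) / (double)iters;
}
```

Calling `avg_gettime_ns(CLOCK_MONOTONIC, 1000000)` a few times and taking the minimum gives a steadier number; the value depends on the clocksource, the CPU and the compiler, as the measurements elsewhere in the thread show.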
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On Wed, 12 Sep 2018, Florian Weimer wrote:
> On 09/12/2018 04:17 PM, Thomas Gleixner wrote:
> > On Wed, 12 Sep 2018, Florian Weimer wrote:
> > > Does this mean glibc can keep using a single vDSO entrypoint, the
> > > one we have today?
> >
> > We have no intention to change that.
>
> Okay, I was wondering because Andy seemed to have proposed just that.
>
> > But we surely could provide separate entry points as an extra to
> > avoid a bunch of conditionals.
>
> We could adjust to that, but the benefit would be long-term because
> it's an ABI change for glibc, and they tend to take a long time to
> propagate.
>
> But I must say that clock_gettime is an odd place to start.  I would
> have expected any of the type-polymorphic multiplexer interfaces
> (fcntl, ioctl, ptrace, futex) to be a more natural starting point. 8-)

Well, the starting point of this was to provide clock_tai support in the
vdso.  clock_gettime() in the vdso vs. the real syscall is a factor of
10 in speed.  clock_gettime() is a pretty hot function in some
workloads.

Andy then noticed that some conditionals could be avoided entirely by
using a different entry point and offered one along with a 10% speedup.
We don't have to go there, we can.

The multiplexer interfaces need much more surgery and talking about
futex, we'd need to sit down with quite some people and identify the
things they actually care about before just splitting it up and keeping
the existing overloaded trainwreck the same.

Thanks,

	tglx
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/12/2018 04:17 PM, Thomas Gleixner wrote:
> On Wed, 12 Sep 2018, Florian Weimer wrote:
>> On 09/09/2018 10:05 PM, Thomas Gleixner wrote:
>>> See the patch below.  It's integrating TAI without slowing down
>>> everything and it definitely does not result in indirect calls.
>>>
>>> On a HSW it slows down clock_gettime() by ~0.5ns.  On a SKL I get a
>>> speedup by ~0.5ns.  On an AMD Epyc server it's 1.2ns speedup.  So it
>>> somehow depends on the uarch and I also observed compiler version
>>> dependent variations.
>>
>> Does this mean glibc can keep using a single vDSO entrypoint, the one
>> we have today?
>
> We have no intention to change that.

Okay, I was wondering because Andy seemed to have proposed just that.

> But we surely could provide separate entry points as an extra to avoid
> a bunch of conditionals.

We could adjust to that, but the benefit would be long-term because it's
an ABI change for glibc, and they tend to take a long time to propagate.

But I must say that clock_gettime is an odd place to start.  I would
have expected any of the type-polymorphic multiplexer interfaces (fcntl,
ioctl, ptrace, futex) to be a more natural starting point. 8-)

Thanks,
Florian
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On Wed, 12 Sep 2018, Florian Weimer wrote:
> On 09/09/2018 10:05 PM, Thomas Gleixner wrote:
> > See the patch below.  It's integrating TAI without slowing down
> > everything and it definitely does not result in indirect calls.
> >
> > On a HSW it slows down clock_gettime() by ~0.5ns.  On a SKL I get a
> > speedup by ~0.5ns.  On an AMD Epyc server it's 1.2ns speedup.  So it
> > somehow depends on the uarch and I also observed compiler version
> > dependent variations.
>
> Does this mean glibc can keep using a single vDSO entrypoint, the one
> we have today?

We have no intention to change that.

But we surely could provide separate entry points as an extra to avoid a
bunch of conditionals.

Thanks,

	tglx
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/09/2018 10:05 PM, Thomas Gleixner wrote:
> See the patch below.  It's integrating TAI without slowing down
> everything and it definitely does not result in indirect calls.
>
> On a HSW it slows down clock_gettime() by ~0.5ns.  On a SKL I get a
> speedup by ~0.5ns.  On an AMD Epyc server it's 1.2ns speedup.  So it
> somehow depends on the uarch and I also observed compiler version
> dependent variations.

Does this mean glibc can keep using a single vDSO entrypoint, the one we
have today?

Thanks,
Florian
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On Sun, 9 Sep 2018, Thomas Gleixner wrote:
>  #ifdef BUILD_VDSO32_64
>  typedef u64 gtod_long_t;
>  #else
>  typedef unsigned long gtod_long_t;
>  #endif
> +
> +struct vgtod_ts {
> +	gtod_long_t	sec;

and actually this wants to become u64 unconditionally as we need to
provide the full seconds even on 32bit for the upcoming y2038 support.
We still have to truncate it for the current 32bit interface, but the
core code can be made ready now.

Thanks,

	tglx
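The "store 64 bits, truncate at the interface" idea can be sketched as follows. Everything here is an assumption made for illustration: `struct base_ts` only mirrors the shape of the quoted `vgtod_ts`, `struct timespec32` is a stand-in for a legacy 32-bit timespec, and `fill_timespec32` is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical core storage: seconds are 64-bit on every architecture. */
struct base_ts {
    uint64_t sec;
    uint64_t nsec;
};

/* Stand-in for a legacy 32-bit timespec (illustration only). */
struct timespec32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Copy the core value into the legacy layout.  Only the low 32 bits of
 * the seconds survive; the narrowing to int32_t is implementation-defined
 * in C for values above INT32_MAX (GCC and Clang wrap modulo 2^32). */
static void fill_timespec32(const struct base_ts *base, struct timespec32 *out)
{
    assert(base->nsec < 1000000000u);
    out->tv_sec = (int32_t)(uint32_t)base->sec;
    out->tv_nsec = (int32_t)base->nsec;
}
```

The core value stays exact past 2038; only the copy handed to the old interface loses the upper bits, which is the truncation the message refers to.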
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On Fri, 31 Aug 2018, Andy Lutomirski wrote:

> (Hi, Florian!)
>
> On Fri, Aug 31, 2018 at 6:59 PM, Matt Rickard wrote:
> > Process clock_gettime(CLOCK_TAI) in vDSO.
> > This makes the call about as fast as CLOCK_REALTIME and CLOCK_MONOTONIC:
> >
> >   nanoseconds
> >   before after clockname
> >   ------ ----- ---------
> >      233    87 CLOCK_TAI
> >       96    93 CLOCK_REALTIME
> >       88    87 CLOCK_MONOTONIC
>
> Are you sure you did this right?  With the clocksource set to TSC
> (which is the only reasonable choice unless KVM has seriously cleaned
> up its act), with retpolines enabled, I get 24ns for CLOCK_MONOTONIC
> without your patch and 32ns with your patch.  And there is indeed a
> retpoline in the disassembled output:
>
>   e5:   e8 07 00 00 00          callq  f1 <__vdso_clock_gettime+0x31>
>   ea:   f3 90                   pause
>   ec:   0f ae e8                lfence
>   ef:   eb f9                   jmp    ea <__vdso_clock_gettime+0x2a>
>   f1:   48 89 04 24             mov    %rax,(%rsp)
>   f5:   c3                      retq
>
> You're probably going to have to set -fno-jump-tables or do something
> clever like adding a whole array of (seconds, nsec) in gtod and
> indexing that array by the clock id.

See the patch below.  It's integrating TAI without slowing down
everything and it definitely does not result in indirect calls.

On a HSW it slows down clock_gettime() by ~0.5ns.  On a SKL I get a
speedup by ~0.5ns.  On an AMD Epyc server it's 1.2ns speedup.  So it
somehow depends on the uarch and I also observed compiler version
dependent variations.

Thanks,

	tglx

--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -203,39 +203,18 @@ notrace static inline u64 vgetsns(int *m
 	return v * gtod->mult;
 }
 
-/* Code size doesn't matter (vdso is 4k anyway) and this is faster. */
-notrace static int __always_inline do_realtime(struct timespec *ts)
+notrace static int __always_inline do_hres(struct timespec *ts, clockid_t clk)
 {
-	unsigned long seq;
-	u64 ns;
+	struct vgtod_ts *base = &gtod->basetime[clk & VGTOD_HRES_MASK];
+	unsigned int seq;
 	int mode;
-
-	do {
-		seq = gtod_read_begin(gtod);
-		mode = gtod->vclock_mode;
-		ts->tv_sec = gtod->wall_time_sec;
-		ns = gtod->wall_time_snsec;
-		ns += vgetsns(&mode);
-		ns >>= gtod->shift;
-	} while (unlikely(gtod_read_retry(gtod, seq)));
-
-	ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
-	ts->tv_nsec = ns;
-
-	return mode;
-}
-
-notrace static int __always_inline do_monotonic(struct timespec *ts)
-{
-	unsigned long seq;
 	u64 ns;
-	int mode;
 
 	do {
 		seq = gtod_read_begin(gtod);
 		mode = gtod->vclock_mode;
-		ts->tv_sec = gtod->monotonic_time_sec;
-		ns = gtod->monotonic_time_snsec;
+		ts->tv_sec = base->sec;
+		ns = base->nsec;
 		ns += vgetsns(&mode);
 		ns >>= gtod->shift;
 	} while (unlikely(gtod_read_retry(gtod, seq)));
@@ -246,58 +225,50 @@ notrace static int __always_inline do_mo
 	return mode;
 }
 
-notrace static void do_realtime_coarse(struct timespec *ts)
+notrace static void do_coarse(struct timespec *ts, clockid_t clk)
 {
+	struct vgtod_ts *base = &gtod->basetime[clk];
 	unsigned long seq;
-	do {
-		seq = gtod_read_begin(gtod);
-		ts->tv_sec = gtod->wall_time_coarse_sec;
-		ts->tv_nsec = gtod->wall_time_coarse_nsec;
-	} while (unlikely(gtod_read_retry(gtod, seq)));
-}
 
-notrace static void do_monotonic_coarse(struct timespec *ts)
-{
-	unsigned long seq;
 	do {
 		seq = gtod_read_begin(gtod);
-		ts->tv_sec = gtod->monotonic_time_coarse_sec;
-		ts->tv_nsec = gtod->monotonic_time_coarse_nsec;
+		ts->tv_sec = base->sec;
+		ts->tv_nsec = base->nsec;
 	} while (unlikely(gtod_read_retry(gtod, seq)));
 }
 
 notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
 {
-	switch (clock) {
-	case CLOCK_REALTIME:
-		if (do_realtime(ts) == VCLOCK_NONE)
-			goto fallback;
-		break;
-	case CLOCK_MONOTONIC:
-		if (do_monotonic(ts) == VCLOCK_NONE)
-			goto fallback;
-		break;
-	case CLOCK_REALTIME_COARSE:
-		do_realtime_coarse(ts);
-		break;
-	case CLOCK_MONOTONIC_COARSE:
-		do_monotonic_coarse(ts);
-		break;
-	default:
-		goto fallback;
-	}
+	unsigned int msk;
 
-	return 0;
-fallback:
+	/* Sort out negative (CPU/FD) and invalid clocks */
+	if (unlikely((unsigned int) clock >= MAX_CLOCKS))
+		return vdso_fallback_gettime(clock, ts);
+
+	/*
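The array-by-clock-id idea that Andy suggests and the patch builds on can be illustrated in isolation. This is a simplified userspace sketch under stated assumptions, not the kernel code: `basetime`, `HRES_CLOCKS` and `lookup_base` are hypothetical names, the clock IDs are written out as local enum constants (matching the Linux values 0, 1 and 11, with 16 as the upper bound), and the sequence-counter retry loop from the patch is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the Linux clock IDs, to keep the sketch portable. */
enum { CLK_REALTIME = 0, CLK_MONOTONIC = 1, CLK_TAI = 11, CLK_MAX = 16 };

struct base_ts {
    uint64_t sec;
    uint64_t nsec;
};

/* Hypothetical shared data: one base time slot per clock ID. */
static struct base_ts basetime[CLK_MAX];

/* Bit i set means clock i is served straight from the table. */
#define HRES_CLOCKS ((1u << CLK_REALTIME) | (1u << CLK_MONOTONIC) | (1u << CLK_TAI))

/* The mask is a single 32-bit word, so every clock ID must fit in it. */
static_assert(CLK_MAX <= 32, "clock mask must fit in one 32-bit word");

/* Returns 0 and fills *out on a table hit, or -1 when the caller has to
 * fall back to the real system call. */
static int lookup_base(int clk, struct base_ts *out)
{
    /* One unsigned compare rejects negative and out-of-range IDs. */
    if ((unsigned int)clk >= CLK_MAX)
        return -1;
    /* A bit test replaces the switch statement, so the compiler has no
     * reason to emit a jump table (and hence no indirect branch). */
    if (!((1u << clk) & HRES_CLOCKS))
        return -1;
    *out = basetime[clk];
    return 0;
}
```

Adding another clock then means setting one more bit and filling one more slot, rather than adding a case label that might change how the compiler lowers the dispatch.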
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On Sat, Sep 1, 2018 at 2:33 AM, Florian Weimer wrote:
> On 09/01/2018 05:39 AM, Andy Lutomirski wrote:
>>
>> Florian, do you think
>> glibc would be willing to add some magic to turn
>> clock_gettime(CLOCK_MONOTONIC, t) into
>> __vdso_clock_gettime_monotonic(t) when CLOCK_MONOTONIC is a constant?
>
>
> What's the goal here? Turn the indirect call/conditional jump/indirect call
> sequence into a single indirect call, purely for performance reasons?

Almost. It's to bypass some of the branches in __vdso_clock_gettime(),
which is supposed to be very fast. AFAIK most user code that uses
clock_gettime() passes a constant for the first argument, and we can
squeeze out some performance by optimizing that case.

The indirect branches internal to the vDSO are a separate issue and
should be solved separately.

(It's really too bad that x86 doesn't have a 64-bit call instruction.
If it did, then the PLT could get rewritten at dynamic link time to
avoid indirect calls entirely, and presumably glibc could use the same
technique to call into the vDSO without indirect calls.)
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
On 09/01/2018 05:39 AM, Andy Lutomirski wrote:
> Florian, do you think
> glibc would be willing to add some magic to turn
> clock_gettime(CLOCK_MONOTONIC, t) into
> __vdso_clock_gettime_monotonic(t) when CLOCK_MONOTONIC is a constant?

What's the goal here? Turn the indirect call/conditional jump/indirect
call sequence into a single indirect call, purely for performance
reasons?

Thanks,
Florian
Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
(Hi, Florian!)

On Fri, Aug 31, 2018 at 6:59 PM, Matt Rickard wrote:
> Process clock_gettime(CLOCK_TAI) in vDSO.
> This makes the call about as fast as CLOCK_REALTIME and CLOCK_MONOTONIC:
>
>   nanoseconds
>   before  after  clockname
>   ------  -----  ---------
>      233     87  CLOCK_TAI
>       96     93  CLOCK_REALTIME
>       88     87  CLOCK_MONOTONIC

Are you sure you did this right? With the clocksource set to TSC
(which is the only reasonable choice unless KVM has seriously cleaned
up its act), with retpolines enabled, I get 24ns for CLOCK_MONOTONIC
without your patch and 32ns with your patch. And there is indeed a
retpoline in the disassembled output:

  e5:	e8 07 00 00 00       	callq  f1 <__vdso_clock_gettime+0x31>
  ea:	f3 90                	pause
  ec:	0f ae e8             	lfence
  ef:	eb f9                	jmp    ea <__vdso_clock_gettime+0x2a>
  f1:	48 89 04 24          	mov    %rax,(%rsp)
  f5:	c3                   	retq

You're probably going to have to set -fno-jump-tables or do something
clever like adding a whole array of (seconds, nsec) in gtod and
indexing that array by the clock id.

Meanwhile, I wrote the following trivial patch to add a
__vdso_clock_gettime_monotonic export. It runs in 21ns, and I suspect
that the speedup is even a bit bigger when cache-cold because it avoids
some branches. What do you all think? Florian, do you think
glibc would be willing to add some magic to turn
clock_gettime(CLOCK_MONOTONIC, t) into
__vdso_clock_gettime_monotonic(t) when CLOCK_MONOTONIC is a constant?

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 91ed1bb2a3bb..4f22e9cb97a5 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -319,6 +319,14 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
 int clock_gettime(clockid_t, struct timespec *)
 	__attribute__((weak, alias("__vdso_clock_gettime")));
 
+notrace int __vdso_clock_gettime_monotonic(struct timespec *ts)
+{
+	if (likely(do_monotonic(ts) != VCLOCK_NONE))
+		return 0;
+
+	return vdso_fallback_gettime(CLOCK_MONOTONIC, ts);
+}
+
 notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
 {
 	if (likely(tv != NULL)) {
diff --git a/arch/x86/entry/vdso/vdso.lds.S b/arch/x86/entry/vdso/vdso.lds.S
index d3a2dce4cfa9..28e23cbc02c9 100644
--- a/arch/x86/entry/vdso/vdso.lds.S
+++ b/arch/x86/entry/vdso/vdso.lds.S
@@ -15,6 +15,11 @@
  * This controls what userland symbols we export from the vDSO.
  */
 VERSION {
+	LINUX_4.19 {
+	global:
+		__vdso_clock_gettime_monotonic;
+	};
+
 	LINUX_2.6 {
 	global:
 		clock_gettime;
[PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO
Process clock_gettime(CLOCK_TAI) in vDSO.
This makes the call about as fast as CLOCK_REALTIME and CLOCK_MONOTONIC:

  nanoseconds
  before  after  clockname
  ------  -----  ---------
     233     87  CLOCK_TAI
      96     93  CLOCK_REALTIME
      88     87  CLOCK_MONOTONIC

Signed-off-by: Matt Rickard
---
 arch/x86/entry/vdso/vclock_gettime.c    | 25 +++++++++++++++++++++++++
 arch/x86/entry/vsyscall/vsyscall_gtod.c |  2 ++
 arch/x86/include/asm/vgtod.h            |  1 +
 3 files changed, 28 insertions(+)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index f19856d95c60..91ed1bb2a3bb 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -246,6 +246,27 @@ notrace static int __always_inline do_monotonic(struct timespec *ts)
 	return mode;
 }
 
+notrace static int __always_inline do_tai(struct timespec *ts)
+{
+	unsigned long seq;
+	u64 ns;
+	int mode;
+
+	do {
+		seq = gtod_read_begin(gtod);
+		mode = gtod->vclock_mode;
+		ts->tv_sec = gtod->tai_time_sec;
+		ns = gtod->wall_time_snsec;
+		ns += vgetsns(&mode);
+		ns >>= gtod->shift;
+	} while (unlikely(gtod_read_retry(gtod, seq)));
+
+	ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
+	ts->tv_nsec = ns;
+
+	return mode;
+}
+
 notrace static void do_realtime_coarse(struct timespec *ts)
 {
 	unsigned long seq;
@@ -277,6 +298,10 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
 		if (do_monotonic(ts) == VCLOCK_NONE)
 			goto fallback;
 		break;
+	case CLOCK_TAI:
+		if (do_tai(ts) == VCLOCK_NONE)
+			goto fallback;
+		break;
 	case CLOCK_REALTIME_COARSE:
 		do_realtime_coarse(ts);
 		break;
diff --git a/arch/x86/entry/vsyscall/vsyscall_gtod.c b/arch/x86/entry/vsyscall/vsyscall_gtod.c
index e1216dd95c04..d61392fe17f6 100644
--- a/arch/x86/entry/vsyscall/vsyscall_gtod.c
+++ b/arch/x86/entry/vsyscall/vsyscall_gtod.c
@@ -53,6 +53,8 @@ void update_vsyscall(struct timekeeper *tk)
 	vdata->monotonic_time_snsec	= tk->tkr_mono.xtime_nsec
 					+ ((u64)tk->wall_to_monotonic.tv_nsec
 						<< tk->tkr_mono.shift);
+	vdata->tai_time_sec		= tk->xtime_sec
+					+ tk->tai_offset;
 	while (vdata->monotonic_time_snsec >=
 					(((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
 		vdata->monotonic_time_snsec -=
diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index fb856c9f0449..adc9f7b20b9c 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -32,6 +32,7 @@ struct vsyscall_gtod_data {
 	gtod_long_t	wall_time_coarse_nsec;
 	gtod_long_t	monotonic_time_coarse_sec;
 	gtod_long_t	monotonic_time_coarse_nsec;
+	gtod_long_t	tai_time_sec;
 
 	int		tz_minuteswest;
 	int		tz_dsttime;