Re: TIMESPEC_TO_NSEC(), futex(2) & __thrsleep(2)

Scott Cheloha Mon, 06 Jan 2020 15:22:21 -0800

Sorry for the delay.  I needed to think on this.

On Fri, Jan 03, 2020 at 11:15:20AM +0100, Martin Pieuchot wrote:
> On 02/01/20(Thu) 14:29, Scott Cheloha wrote:
> > > On Jan 2, 2020, at 9:02 AM, Martin Pieuchot <[email protected]> wrote:
> > > 
> > > On 01/01/20(Wed) 22:13, Scott Cheloha wrote:
> > >>> On Dec 31, 2019, at 9:35 AM, Martin Pieuchot <[email protected]> wrote:
> > >>> 
> > >>> I'd like to stop converting the given timespec to ticks and instead use
> > >>> nanoseconds.  This is part of the ongoing effort to reduce the use of
> > >>> `hz' through the kernel.
> > >>> 
> > >>> Since I don't know C I'd appreciate any pointer about the checks that
> > >>> should be added to TIMESPEC_TO_NSEC().
> > >>> 
> > >>> Then the conversions to {t,rw}sleep_nsec(9) become trivial, diff below.
> > >> 
> > >> We can't do this until timeouts have a tickless interface.  Otherwise
> > >> your timeouts will return early.  That's why I was saving the sys/kern
> > >> conversions until after resolving that issue.
> > > 
> > > I don't understand, can you elaborate?
> > 
> > Timeouts are scheduled against the current value of "ticks".  Any time that
> > has elapsed since the current tick began is unaccounted for.  You need to
> > add a tick to your sleep to account for it.  tstohz(9) does this.  We don't
> > do it automatically for the *sleep_nsec(9) interfaces because that would
> > have complicated the conversions we're doing and probably broken callers
> > before we were ready to break them.
> 
> I question the argument that would complicate the conversions.  Isn't it
> just a margin of error that is given by the precision of the conversion?


It depends on the sensitivity of the subsystem.  Some timeouts are
"fat" and nobody will notice if they are ~10ms "late".  Others are
razor thin and the system will be palpably slower for an interactive
user.  Most are somewhere in between.

But the risk is there.  So I don't want to change the rounding yet.
I only want to focus on getting finer-grained timeout information to
the timeout layer.  The more detail we can get to the timeout layer
the better it can make decisions.  So in this sweep of conversions I'm
only focused on getting the math right.
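
As an illustration of what "getting the math right" involves for a
timespec-to-nanoseconds conversion, here is a minimal userland sketch.
It is not the kernel's TIMESPEC_TO_NSEC(); the function name
timespec_to_nsec_sat() is hypothetical, and returning 0 for malformed
input is an arbitrary choice made only for this example.  The point it
shows is saturating at UINT64_MAX instead of letting the multiplication
wrap.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: convert a timespec to nanoseconds, saturating
 * at UINT64_MAX if the result does not fit in 64 bits.
 */
static uint64_t
timespec_to_nsec_sat(const struct timespec *ts)
{
	/* Arbitrary choice for this sketch: malformed input yields 0. */
	if (ts->tv_sec < 0 || ts->tv_nsec < 0 || ts->tv_nsec >= 1000000000L)
		return 0;

	/* sec * 1e9 + nsec overflows iff sec > (MAX - nsec) / 1e9. */
	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;

	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}
```

A caller that passes the result to a sleep routine then gets "sleep as
long as possible" for absurdly large timeouts rather than a short,
wrapped value.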

> Here it is 1 tick, generally ~10ms.  So either the code works with a
> sleep of 10ms less or more.  Generally it doesn't matter.  Now for
> userland facing programs you shouldn't wake up 10ms earlier.
> 
> I don't understand why the rounding precision is different between the
> two interfaces.  We have an interface that adds one tick and one that
> doesn't.

It's different because no other callers using tsleep(9) cared about
the loss of up to 10ms.  So tstohz(9)/tvtohz(9) were added as bandaids.
They actually work pretty well, all things considered.
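
To make the early-return problem concrete, here is a small userland
sketch of the arithmetic.  It is not the tstohz(9) implementation; the
names and the 100 Hz tick rate (10ms per tick) are assumptions chosen
for the example.  It compares a conversion that only rounds up to whole
ticks with one that also adds a tick to cover the part of the current
tick that has already elapsed.

```c
#include <stdint.h>

/* Assumed for illustration: 100 ticks per second, i.e. 10ms per tick. */
#define HYPOTHETICAL_HZ	100
#define NSEC_PER_TICK	(1000000000ULL / HYPOTHETICAL_HZ)

/* Round a duration up to whole ticks; ignores the current tick. */
static uint64_t
nsec_to_ticks_roundup(uint64_t nsecs)
{
	return nsecs / NSEC_PER_TICK + (nsecs % NSEC_PER_TICK != 0);
}

/* As above, plus one tick for the partly elapsed current tick. */
static uint64_t
nsec_to_ticks_padded(uint64_t nsecs)
{
	return nsec_to_ticks_roundup(nsecs) + 1;
}

/*
 * Real time until wakeup if the sleep starts `elapsed` nanoseconds
 * into the current tick (elapsed < NSEC_PER_TICK) and the timeout
 * fires on the tick boundary `nticks` ticks later.
 */
static uint64_t
time_until_wakeup_nsec(uint64_t nticks, uint64_t elapsed)
{
	return nticks * NSEC_PER_TICK - elapsed;
}
```

With these assumptions, a 25ms request made 9ms into the current tick
sleeps only 21ms when rounded up to 3 ticks, but 31ms when padded to 4
ticks.  The padded version never returns early, at the cost of being
up to one tick late.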

> Both choices are imprecise and should disappear if/when the
> guts of the sleep are modified to be tickless (whatever that means).
> In the meantime I'd suggest we keep the same behavior between the two
> interfaces so we can move forward with the other part of the problem:
> the conversion.

I agree.  I don't think we should change the rounding behavior (yet)
while we still have many trickier conversions ahead of us.

In this context, tickless timeouts are timeouts that preserve the
granularity of the timeout in a constant quantum, e.g. nanoseconds.
Said another way, timeouts are tickless if they expire at an absolute
time on one of the system clocks, e.g. "ten seconds and 400
nanoseconds after boot" or "Jan 10 2020 16:30 UTC".
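
A rough sketch of that idea, assuming time is kept as a 64-bit count of
nanoseconds since boot on a monotonic clock.  The helper names are
hypothetical and this is not the timeout(9) interface; it only shows a
relative timeout being turned into an absolute expiry time that is
then compared against the clock.

```c
#include <stdint.h>

/*
 * Hypothetical helper: absolute expiry time in nanoseconds since boot,
 * given the current clock reading and a relative timeout.
 */
static uint64_t
deadline_from_now(uint64_t now_nsec, uint64_t timeout_nsec)
{
	/* Saturate instead of wrapping if the sum would overflow. */
	if (timeout_nsec > UINT64_MAX - now_nsec)
		return UINT64_MAX;
	return now_nsec + timeout_nsec;
}

/* A timeout has expired once the clock reaches its deadline. */
static int
deadline_expired(uint64_t deadline_nsec, uint64_t now_nsec)
{
	return now_nsec >= deadline_nsec;
}
```

For example, a 400ns timeout armed exactly ten seconds after boot gets
the deadline 10000000400, i.e. "ten seconds and 400 nanoseconds after
boot", with no dependence on where the tick boundaries fall.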

> It is like doing a refactoring while introducing a behavior change that
> prevents us from finishing the refactoring because the change depends on
> the internals...  Am I going in circles?

You're right, there is a chicken-and-egg problem in this work.  Some
timeouts work currently on the tick-based backend but may break when
we change to a tickless backend.  But they may also break if we
preemptively convert them to use a real unit of time before we change
the backend.

The sys/kern syscall timeouts are one such group.  The trickiest ones
in sys/kern are the per-process itimers.

There might be other groups.

> PS: What about architectures that won't go tickless?  How are we going
> to deal with the conversions if there's more than one API?

If we can get the tickless timeout(9) backend working on the major
architectures then it will be in use on all architectures.  There
won't be two APIs.

If you're asking about architectures that we fail to implement a
dynamic hardclock(9) for, I don't know.  I think they'll just continue
to have a static hardclock(9).
