Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-22 Thread Jilles Tjoelker
On Wed, Dec 19, 2012 at 11:00:06AM +, Poul-Henning Kamp wrote:
 
 In message 50d192e8.3020...@freebsd.org, Alexander Motin writes:

 Linux uses 32.32 format in their eventtimers code.

 (And that is no accident, I know who they got the number from :-)

 But if at some point we want to be able to 
 handle absolute wall time, [...]

 Then you have other problems, including but not limited to clock
 being stepped, leap-seconds, suspend/resume and frequency stability.

 If you want to support callouts of the type (At 14:00 UTC tomorrow)
 (disregarding the time-zone issue), you need to catch all significant
 changes to our UTC estimate and recalibrate your callout based on
 that.

 It is not obvious that we have applications for such an API that
 warrant the complexity.

 Either way, such a facility should be layered on top of the callout
 facility, which should always run in elapsed time[1] with no attention
 paid to what NTPD might do to the UTC estimate.

POSIX specifies functions that assume such a facility exists, although
applications may not care much if we implement them incorrectly.

For example, pthread_mutex_timedlock() and sem_timedwait() shall time
out when the CLOCK_REALTIME clock reaches the given value, and
pthread_cond_timedwait() and clock_nanosleep() (with TIMER_ABSTIME)
shall time out when the specified clock reaches the given value.

 So summary: 32.32 is the right format.

 Poul-Henning

 [1] Notice that elapsed time needs a firm definition with respect
 to suspend/resume, and that this decision has big implications
 for the API use and code duplication.

 I think it prudent to specify a flag to callouts, to tell what
 should happen on suspend/resume, something like:

   SR_CANCEL   /* Cancel the callout on S/R */
   /* no flag* /* Toll this callout only when system is running */
   SR_IGNORE   /* Toll suspended time from callout */

 If you get this right, callouts from device drivers will just DTRT,
 if you get it wrong, all device drivers will need boilerplate code
 to handle S/R

Userland could get access to this via CLOCK_REALTIME vs CLOCK_MONOTONIC
vs CLOCK_UPTIME.

-- 
Jilles Tjoelker
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-22 Thread Poul-Henning Kamp

In message 2012125025.ga46...@stack.nl, Jilles Tjoelker writes:

 Either way, such a facility should be layered on top of the callout
 facility, which should always run in elapsed time[1] with no attention
 paid to what NTPD might do to the UTC estimate.

POSIX specifies functions that assume such a facility exists, although
applications may not care much if we implement them incorrectly.

It should still be implemented op top of callouts, not as part of:
it is an entirely different thing to try to do right.

 I think it prudent to specify a flag to callouts, to tell what
 should happen on suspend/resume, something like:

  SR_CANCEL   /* Cancel the callout on S/R */
  /* no flag* /* Toll this callout only when system is running */
  SR_IGNORE   /* Toll suspended time from callout */

 If you get this right, callouts from device drivers will just DTRT,
 if you get it wrong, all device drivers will need boilerplate code
 to handle S/R

Userland could get access to this via CLOCK_REALTIME vs CLOCK_MONOTONIC
vs CLOCK_UPTIME.

I have _no_ idea what you are trying to say here...

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Alexander Motin

On 19.12.2012 01:37, Ian Lepore wrote:

On Wed, 2012-12-19 at 00:29 +0100, Luigi Rizzo wrote:

On Tue, Dec 18, 2012 at 04:27:45PM -0700, Ian Lepore wrote:

On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:

[top posting for readability;
in summary we were discussing the new callout API trying to avoid
an explosion of methods and arguments while at the same time
supporting the old API and the new one]
(I am also Cc-ing phk as he might have better insight
on the topic).

I think the patch you propose is a step in the right direction,
but i still remain concerned by having to pass two bintimes
(by reference, but they should really go by value)
and one 'ticks' value to all these functions.

I am also dubious that we need a full 128 bits to specify
the 'precision': there would be absolutely no loss of functionality
if we decided to specify the precision in powers of 2, so a precision
'k' (signed) means 2^k seconds. This way 8 bits are enough to
represent any precision we want.


...

I'm not so sure about the 2^k precision.  You speak of seconds, but I
would be worrying about sub-second precision in my work.  It would
typical to want a 500uS timeout but be willing to late by up to 250uS if


i said k is signed so negative values represent fractions of a
second. 2^-128 is pretty short :)


Ahh, I missed that.  Good enough then!  Hmmm, if that ideas survives
further review, then could precision be encoded in 8 bits of the flags,
eliminating another parm?


Those who tracked the branch could see that actually was our first 
approach to handle precision. Unfortunately, it appeared not very 
convenient when you need to convert relative precision in percents or 
fraction of interval to the absolute precision in power of 2. We were 
worried that using some ffsll() for it can be inconvenient or expensive. 
But since we are now talking about passing relative bintime as an 
argument, that may be more viable option. I'll make another try.


Thanks for the input. Pity it didn't happen couple of months ago.

--
Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 1355873265.1198.183.ca...@revolution.hippie.lan, Ian Lepore writes
:
On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:

I'm not so sure about the 2^k precision.  You speak of seconds, but I
would be worrying about sub-second precision in my work.

It is a bad idea, and it is physically pointless, given the stabilities
of the timebases available for computers in general.

Please just take my word as a time-nut, and use a 32.32 binary format
in seconds (see previous email) and be done with it.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Davide Italiano
On Wed, Dec 19, 2012 at 1:54 AM, Poul-Henning Kamp p...@phk.freebsd.dk wrote:
 
 In message 1355873265.1198.183.ca...@revolution.hippie.lan, Ian Lepore 
 writes
 :
On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:

I'm not so sure about the 2^k precision.  You speak of seconds, but I
would be worrying about sub-second precision in my work.

 It is a bad idea, and it is physically pointless, given the stabilities
 of the timebases available for computers in general.

 Please just take my word as a time-nut, and use a 32.32 binary format
 in seconds (see previous email) and be done with it.

 --
 Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
 p...@freebsd.org | TCP/IP since RFC 956
 FreeBSD committer   | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by incompetence.

Right now -- the precision is specified in 'bintime', which is a binary number.
It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
the specific platform.
I do not really think it worth to create another structure for
handling time (e.g. struct bintime32), as it will lead to code
duplication for all the basic conversion/math operation. On the other
hand, 32.32 may not be enough in the long future.
What do you think about that?

Thanks,

Davide
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Alexander Motin

On 19.12.2012 12:03, Davide Italiano wrote:

On Wed, Dec 19, 2012 at 1:54 AM, Poul-Henning Kamp p...@phk.freebsd.dk wrote:


In message 1355873265.1198.183.ca...@revolution.hippie.lan, Ian Lepore writes
:

On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:



I'm not so sure about the 2^k precision.  You speak of seconds, but I
would be worrying about sub-second precision in my work.


It is a bad idea, and it is physically pointless, given the stabilities
of the timebases available for computers in general.

Please just take my word as a time-nut, and use a 32.32 binary format
in seconds (see previous email) and be done with it.


Right now -- the precision is specified in 'bintime', which is a binary number.
It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
the specific platform.
I do not really think it worth to create another structure for
handling time (e.g. struct bintime32), as it will lead to code
duplication for all the basic conversion/math operation. On the other
hand, 32.32 may not be enough in the long future.
What do you think about that?


Linux uses 32.32 format in their eventtimers code. Respecting that now 
we have no any timer hardware with frequency about 4GHz, that precision 
is probably sufficient. But if at some point we want to be able to 
handle absolute wall time, then 32bit integer part may be not enough. 
Then we return to the question: how many different data types do we 
want to see in one subsystem? Sure, using single 64bit value would be 
much easier then struct bintime from many perspectives, but what's about 
the edge cases?


--
Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message CACYV=-eg542ihm9kfujpvczzra4tqepebva8rzt1yohncgf...@mail.gmail.com
, Davide Italiano writes:

Right now -- the precision is specified in 'bintime', which is a binary number.
It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
the specific platform.

And that is way overkill for specifying a callout, at best your clock
has short term stabilities approaching 1e-8, but likely as bad as 1e-6.

(The reason why bintime is important for timekeeping is that we
accumulate timeintervals approx 1e3 times a second, so the rounding
error has to be much smaller than the short term stability in order
to not dominate)

I do not really think it worth to create another structure for
handling time (e.g. struct bintime32), as it will lead to code

No, that was exactly my point:  It should be an integer so that
comparisons and arithmetic is trivial.   A 32.32 format fits
nicely into a int64_t which is readily available in the language.

As I said in my previous email:


typedef dur_t   int64_t;/* signed for bug catching */
#define DURSEC  ((dur_t)1  32)
#define DURMIN  (DURSEC * 60)
#define DURMSEC (DURSEC / 1000)
#define DURUSEC (DURSEC / 1000)
#define DURNSEC (DURSEC / 100)

(Bikeshed the names at your convenience)

Then you can say

callout_foo(34 * DURSEC)
callout_foo(2400 * DURMSEC)
or 
callout_foo(500 * DURNSEC)

With this format you can specify callouts 68 years into the future
with quarter nanosecond resolution, and you can trivially and
efficiently compare dur_t's with
if (d1  d2)

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 50d192e8.3020...@freebsd.org, Alexander Motin writes:

Linux uses 32.32 format in their eventtimers code.

(And that is no accident, I know who they got the number from :-)

But if at some point we want to be able to 
handle absolute wall time, [...]

Then you have other problems, including but not limited to clock
being stepped, leap-seconds, suspend/resume and frequency stability.

If you want to support callouts of the type (At 14:00 UTC tomorrow)
(disregarding the time-zone issue), you need to catch all significant
changes to our UTC estimate and recalibrate your callout based on
that.

It is not obvious that we have applications for such an API that
warrant the complexity.

Either way, such a facility should be layered on top of the callout
facility, which should always run in elapsed time[1] with no attention
paid to what NTPD might do to the UTC estimate.

So summary: 32.32 is the right format.

Poul-Henning

[1] Notice that elapsed time needs a firm definition with respect
to suspend/resume, and that this decision has big implications
for the API use and code duplication.

I think it prudent to specify a flag to callouts, to tell what
should happen on suspend/resume, something like:

SR_CANCEL   /* Cancel the callout on S/R */
/* no flag* /* Toll this callout only when system is running */
SR_IGNORE   /* Toll suspended time from callout */

If you get this right, callouts from device drivers will just DTRT,
if you get it wrong, all device drivers will need boilerplate code
to handle S/R

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Bruce Evans

On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:



In message CACYV=-eg542ihm9kfujpvczzra4tqepebva8rzt1yohncgf...@mail.gmail.com
, Davide Italiano writes:


Right now -- the precision is specified in 'bintime', which is a binary number.
It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
the specific platform.


And that is way overkill for specifying a callout, at best your clock
has short term stabilities approaching 1e-8, but likely as bad as 1e-6.


So you always agreed with me that bintimes are unsuitable for almost
everything, and especially unsuitable for timeouts? :-)


(The reason why bintime is important for timekeeping is that we
accumulate timeintervals approx 1e3 times a second, so the rounding
error has to be much smaller than the short term stability in order
to not dominate)


bintimes are not unsuitable for timekeeping, but they a painful to use
for other APIs.  You have to either put bintimes in layers in the other
APIs, or convert them to a more suitable format, and there is a problem
placing the conversion at points where it is efficient.  This thread
seems to be mostly about putting the conversion in wrong places.  My
original objection was about using bintimes for almost everything at
the implementation level.


I do not really think it worth to create another structure for
handling time (e.g. struct bintime32), as it will lead to code


No, that was exactly my point:  It should be an integer so that
comparisons and arithmetic is trivial.   A 32.32 format fits
nicely into a int64_t which is readily available in the language.


I would have tried a 32 bit format with a variable named 'ticks'.
Something like:
- ticks = 0.  Same meaning as now.  No changes in ABIs or APIs to use
  this.  The tick period would be constant but for virtual ticks and
  not too small.  hz = 1000 now makes the period too small, and not a
  power of 2.  So make the period 1/128 second.  This gives a 1.24.7
  binary format.  2**24 seconds is 194 days.
- ticks  0.  The 31 value bits are now a cookie (descriptor) referring
  to a bintime or whatever.  This case should rarely be used.  I don't
  like it that a tickless kernel, which is needed mainly for power
  saving, has expanded into complications to support short timeouts
  which should rarely be used.


As I said in my previous email:

   typedef dur_t   int64_t;/* signed for bug catching */
   #define DURSEC  ((dur_t)1  32)
   #define DURMIN  (DURSEC * 60)
   #define DURMSEC (DURSEC / 1000)
   #define DURUSEC (DURSEC / 1000)
   #define DURNSEC (DURSEC / 100)

(Bikeshed the names at your convenience)

Then you can say

callout_foo(34 * DURSEC)
callout_foo(2400 * DURMSEC)
or
callout_foo(500 * DURNSEC)


Constructing the cookie for my special case would not be so easy.


With this format you can specify callouts 68 years into the future
with quarter nanosecond resolution, and you can trivially and
efficiently compare dur_t's with
if (d1  d2)


This would make a better general format than timevals, timespecs and
of course bintimes :-).  It is a bit wasteful for timeouts since
its extremes are rarely used.  Malicious and broken callers can
still cause overflow at 68 years, so you have to check for it and
handle it.  The limit of 194 days is just as good for timeouts.

Bruce
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 20121219221518.e1...@besplex.bde.org, Bruce Evans writes:

 With this format you can specify callouts 68 years into the future
 with quarter nanosecond resolution, and you can trivially and
 efficiently compare dur_t's with
  if (d1  d2)

This would make a better general format than timevals, timespecs and
of course bintimes :-).

Except that for absolute timescales, we're running out of the 32 bits
integer part.

Bintimes is a necessary superset of the 32.32 which tries to work
around the necessary but missing int96_t or int128_t[1].

Poul-Henning


[1] A good addition to C would be a general multi-word integer type
where you could ask for any int%d_t or uint%d_t you cared for, and
have the compiler DTRT.  In difference from using a multiword-library,
this would still give these types their natural integer behaviour.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Davide Italiano
On Wed, Dec 19, 2012 at 4:18 AM, Bruce Evans b...@optusnet.com.au wrote:
 On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:

 
 In message
 CACYV=-eg542ihm9kfujpvczzra4tqepebva8rzt1yohncgf...@mail.gmail.com
 , Davide Italiano writes:

 Right now -- the precision is specified in 'bintime', which is a binary
 number.
 It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
 the specific platform.


 And that is way overkill for specifying a callout, at best your clock
 has short term stabilities approaching 1e-8, but likely as bad as 1e-6.


 So you always agreed with me that bintimes are unsuitable for almost
 everything, and especially unsuitable for timeouts? :-)


 (The reason why bintime is important for timekeeping is that we
 accumulate timeintervals approx 1e3 times a second, so the rounding
 error has to be much smaller than the short term stability in order
 to not dominate)


 bintimes are not unsuitable for timekeeping, but they a painful to use
 for other APIs.  You have to either put bintimes in layers in the other
 APIs, or convert them to a more suitable format, and there is a problem
 placing the conversion at points where it is efficient.  This thread
 seems to be mostly about putting the conversion in wrong places.  My
 original objection was about using bintimes for almost everything at
 the implementation level.


 I do not really think it worth to create another structure for
 handling time (e.g. struct bintime32), as it will lead to code


 No, that was exactly my point:  It should be an integer so that
 comparisons and arithmetic is trivial.   A 32.32 format fits
 nicely into a int64_t which is readily available in the language.


 I would have tried a 32 bit format with a variable named 'ticks'.
 Something like:
 - ticks = 0.  Same meaning as now.  No changes in ABIs or APIs to use
   this.  The tick period would be constant but for virtual ticks and
   not too small.  hz = 1000 now makes the period too small, and not a
   power of 2.  So make the period 1/128 second.  This gives a 1.24.7
   binary format.  2**24 seconds is 194 days.
 - ticks  0.  The 31 value bits are now a cookie (descriptor) referring
   to a bintime or whatever.  This case should rarely be used.  I don't
   like it that a tickless kernel, which is needed mainly for power
   saving, has expanded into complications to support short timeouts
   which should rarely be used.


Bruce, I don't really agree with this.
The data addressed by cookie should be still stored somewhere, and KBI
will result broken. This, indeed, is not real problem as long as
current calloutng code heavily breaks KBI, but if that was your point,
I don't see how your proposed change could help.


 As I said in my previous email:

typedef dur_t   int64_t;/* signed for bug catching */
#define DURSEC  ((dur_t)1  32)
#define DURMIN  (DURSEC * 60)
#define DURMSEC (DURSEC / 1000)
#define DURUSEC (DURSEC / 1000)
#define DURNSEC (DURSEC / 100)

 (Bikeshed the names at your convenience)

 Then you can say

 callout_foo(34 * DURSEC)
 callout_foo(2400 * DURMSEC)
 or
 callout_foo(500 * DURNSEC)


 Constructing the cookie for my special case would not be so easy.


 With this format you can specify callouts 68 years into the future
 with quarter nanosecond resolution, and you can trivially and
 efficiently compare dur_t's with
 if (d1  d2)


 This would make a better general format than timevals, timespecs and
 of course bintimes :-).  It is a bit wasteful for timeouts since
 its extremes are rarely used.  Malicious and broken callers can
 still cause overflow at 68 years, so you have to check for it and
 handle it.  The limit of 194 days is just as good for timeouts.

 Bruce

I think the phk's proposal  is better. About your overflow objection,
I think is really unlikely to happen, but better safe than sorry.

Thanks

Davide
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 20121220005706.i1...@besplex.bde.org, Bruce Evans writes:
On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:

 Except that for absolute timescales, we're running out of the 32 bits
 integer part.

Except 32 bit time_t works until 2106 if it is unsigned.

That's sort of not an option.

The real problem was that time_t was not defined as a floating
point number.

 [1] A good addition to C would be a general multi-word integer type
 where you could ask for any int%d_t or uint%d_t you cared for, and
 have the compiler DTRT.  In difference from using a multiword-library,
 this would still give these types their natural integer behaviour.

That would be convenient, but bad for efficiency if it were actually
used much.

You can say that about anything but CPU-native operations, and I doubt
it would be as inefficient as struct bintime, which does not have access
to the carry bit.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Bruce Evans

On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:



In message 20121219221518.e1...@besplex.bde.org, Bruce Evans writes:


With this format you can specify callouts 68 years into the future
with quarter nanosecond resolution, and you can trivially and
efficiently compare dur_t's with
if (d1  d2)


This would make a better general format than timevals, timespecs and
of course bintimes :-).


Except that for absolute timescales, we're running out of the 32 bits
integer part.


Except 32 bit time_t works until 2106 if it is unsigned.


Bintimes is a necessary superset of the 32.32 which tries to work
around the necessary but missing int96_t or int128_t[1].

[1] A good addition to C would be a general multi-word integer type
where you could ask for any int%d_t or uint%d_t you cared for, and
have the compiler DTRT.  In difference from using a multiword-library,
this would still give these types their natural integer behaviour.


That would be convenient, but bad for efficiency if it were actually
used much.

Bruce
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Luigi Rizzo
On Wed, Dec 19, 2012 at 10:51:48AM +, Poul-Henning Kamp wrote:
 
 In message 
 CACYV=-eg542ihm9kfujpvczzra4tqepebva8rzt1yohncgf...@mail.gmail.com
 , Davide Italiano writes:
 
 Right now -- the precision is specified in 'bintime', which is a binary 
 number.
 It's not 32.32, it's 32.64 or 64.64 depending on the size of time_t in
 the specific platform.
 
 And that is way overkill for specifying a callout, at best your clock
 has short term stabilities approaching 1e-8, but likely as bad as 1e-6.
 
 (The reason why bintime is important for timekeeping is that we
 accumulate timeintervals approx 1e3 times a second, so the rounding
 error has to be much smaller than the short term stability in order
 to not dominate)
 
 I do not really think it worth to create another structure for
 handling time (e.g. struct bintime32), as it will lead to code
 
 No, that was exactly my point:  It should be an integer so that
 comparisons and arithmetic is trivial.   A 32.32 format fits
 nicely into a int64_t which is readily available in the language.
 
 As I said in my previous email:
 
 
 typedef dur_t   int64_t;/* signed for bug catching */
 #define DURSEC  ((dur_t)1  32)
 #define DURMIN  (DURSEC * 60)
 #define DURMSEC (DURSEC / 1000)
 #define DURUSEC (DURSEC / 1000)
 #define DURNSEC (DURSEC / 100)
 
 (Bikeshed the names at your convenience)
 
 Then you can say
 
   callout_foo(34 * DURSEC)
   callout_foo(2400 * DURMSEC)
 or 
   callout_foo(500 * DURNSEC)

only thing, we must be careful with the parentheses

For instance, in your macro, DURNSEC evaluates to 0 and so
does any multiple of it.
We should define them as

#define DURNSEC DURSEC / 100
...

so DURNSEC is still 0 and 500*DURNSEC gives 214

I am curious that Bruce did not mention this :)

(btw the typedef is swapped, should be typedef int64_t dur_t)

cheers
luigi
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 20121219150809.ga98...@onelab2.iet.unipi.it, Luigi Rizzo writes:

 typedef dur_t   int64_t;/* signed for bug catching */
 #define DURSEC  ((dur_t)1  32)
 #define DURMIN  (DURSEC * 60)
 #define DURMSEC (DURSEC / 1000)
 #define DURUSEC (DURSEC / 1000)
 #define DURNSEC (DURSEC / 100)


only thing, we must be careful with the parentheses

Actually, it's more impportant to be careful with zeros, if you
adjust the above to the correct number of zeros, DURNSEC is 4,
which is within seven percent of the correct value.

(btw the typedef is swapped, should be typedef int64_t dur_t)

Yes, I'm trying to find out of people even listen to me :-)


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Alexander Motin

On 19.12.2012 16:20, Bruce Evans wrote:

On Wed, 19 Dec 2012, Davide Italiano wrote:


On Wed, Dec 19, 2012 at 4:18 AM, Bruce Evans b...@optusnet.com.au
wrote:



I would have tried a 32 bit format with a variable named 'ticks'.
Something like:
- ticks = 0.  Same meaning as now.  No changes in ABIs or APIs to use
  this.  The tick period would be constant but for virtual ticks and
  not too small.  hz = 1000 now makes the period too small, and not a
  power of 2.  So make the period 1/128 second.  This gives a 1.24.7
  binary format.  2**24 seconds is 194 days.
- ticks  0.  The 31 value bits are now a cookie (descriptor) referring
  to a bintime or whatever.  This case should rarely be used.  I don't
  like it that a tickless kernel, which is needed mainly for power
  saving, has expanded into complications to support short timeouts
  which should rarely be used.


Bruce, I don't really agree with this.
The data addressed by cookie should be still stored somewhere, and KBI
will result broken. This, indeed, is not real problem as long as
current calloutng code heavily breaks KBI, but if that was your point,
I don't see how your proposed change could help.


In the old API, it is an error to pass ticks  0, so only broken old
callers are affected.  Of course, if there are any then it would be
hard to detect their garbage cookies.

Anywy, it's too later to change to this, and maybe also to a 32.32
format.


It would be late to change this after committing. I would definitely 
like it to be done earlier to not redo all the tests, but I think we 
could convert callout and eventtimers code to 32.32 format in several 
days. The only two questions are: do we really want it (won't there be 
any reasons to regret about it) and how do we want it to look?


For the first question my personal showstopper since eventtimers 
creation always was the wish to keep consistency. But benefits of 32.32 
format are clear, and if there are more votes for it, let's consider. If 
now it will be decided that full range will never be useful for callout 
subsystem, then it is obviously not needed for eventtimers also.


About the second question, what do you think about such prototypes:

typedef int64_t sbintime_t

static __inline sbintime_t
bintime_shrink(struct bintime *bt)
{}

static __inline struct bintime
bintime_expand(sbintime_t sbt)
{}

...

int
callout_reset_bt(struct callout *, sbintime_t sbt, sbintime_t pr,
void (*fn)(void *), void *arg, int flags);

, where pr used only for absolute precision, and flags includes: direct 
execution, absolute/relative time in argument, relative precision in 
case of relative sbt, flag for aligning to hardclock() to emulate legacy 
behavior, and potentially flags for reaction on suspend/resume.


Another option is to move absolute precision also to flags, using log2() 
representation, as we tried and as was proposed before. With possibility 
to use precise relative time there will be few cases requiring absolute 
value of precision, that should depend on period. Then there will be no 
extra arguments in the most usual cases.


Wrapper for existing API compatibility will look just like this:

#define callout_reset(c, ticks, fn, arg)\
callout_reset_bt(c, ticks2sbintime(ticks), -1,  \
(fn), (arg), C_HARDCLOCK)

--
Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Davide Italiano
dropping phk _AT_ onelab2 _DOT_ something from CC as long as it
doesn't seem a valid mail address and I'm annoyed mails bounce back.

On Wed, Dec 19, 2012 at 6:20 AM, Bruce Evans b...@optusnet.com.au wrote:
 On Wed, 19 Dec 2012, Davide Italiano wrote:

 On Wed, Dec 19, 2012 at 4:18 AM, Bruce Evans b...@optusnet.com.au wrote:


 I would have tried a 32 bit format with a variable named 'ticks'.
 Something like:
 - ticks = 0.  Same meaning as now.  No changes in ABIs or APIs to use
   this.  The tick period would be constant but for virtual ticks and
   not too small.  hz = 1000 now makes the period too small, and not a
   power of 2.  So make the period 1/128 second.  This gives a 1.24.7
   binary format.  2**24 seconds is 194 days.
 - ticks  0.  The 31 value bits are now a cookie (descriptor) referring
   to a bintime or whatever.  This case should rarely be used.  I don't
   like it that a tickless kernel, which is needed mainly for power
   saving, has expanded into complications to support short timeouts
   which should rarely be used.


 Bruce, I don't really agree with this.
 The data addressed by cookie should be still stored somewhere, and KBI
 will result broken. This, indeed, is not real problem as long as
 current calloutng code heavily breaks KBI, but if that was your point,
 I don't see how your proposed change could help.


 In the old API, it is an error to pass ticks  0, so only broken old
 callers are affected.  Of course, if there are any then it would be
 hard to detect their garbage cookies.

 Anywy, it's too later to change to this, and maybe also to a 32.32
 format.

 [32.32 format]

It's not too late. What I'd like to do, right now people got
interested in the problem is agreeing on the interface used.
Following this thread, as I've already discussed to mav@, we would
like to decide what of the two is better:
- specify precision as additional argument (as we're doing right now)
- use 'flags' argument
If we allow time argument to be relative and not absolute, as
suggested by luigi@, we can easily use relative precision where we had
to use ffl() before.



 This would make a better general format than timevals, timespecs and
 of course bintimes :-).  It is a bit wasteful for timeouts since
 its extremes are rarely used.  Malicious and broken callers can
 still cause overflow at 68 years, so you have to check for it and
 handle it.  The limit of 194 days is just as good for timeouts.


 I think the phk's proposal  is better. About your overflow objection,
 I think is really unlikely to happen, but better safe than sorry.


 It's very easy for applications to cause kernel overflow using valid
 syscall args like tv_sec = TIME_T_MAX for a relative time in
 nanosleep().  Adding TIME_T_MAX to the current time in seconds overflow
 for all current times except for the first second after the Epoch.
 There is no difference between the overflow for 32-bit and 64-bit
 time_t's for this.  This is now mostly handled so that the behaviour is
 harmless although wrong.  E.g., the timeout might become negative,
 and then since it is not a cookie it is silently replaced by a timeout
 of 1 tick.  In nanosleep(), IIRC there are further overflows that result
 in returning early instead of retrying the 1-tick timeouts endlessly.

 Bruce

Thanks,

Davide
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Bruce Evans

On Wed, 19 Dec 2012, Davide Italiano wrote:


On Wed, Dec 19, 2012 at 4:18 AM, Bruce Evans b...@optusnet.com.au wrote:



I would have tried a 32 bit format with a variable named 'ticks'.
Something like:
- ticks = 0.  Same meaning as now.  No changes in ABIs or APIs to use
  this.  The tick period would be constant but for virtual ticks and
  not too small.  hz = 1000 now makes the period too small, and not a
  power of 2.  So make the period 1/128 second.  This gives a 1.24.7
  binary format.  2**24 seconds is 194 days.
- ticks  0.  The 31 value bits are now a cookie (descriptor) referring
  to a bintime or whatever.  This case should rarely be used.  I don't
  like it that a tickless kernel, which is needed mainly for power
  saving, has expanded into complications to support short timeouts
  which should rarely be used.


Bruce, I don't really agree with this.
The data addressed by cookie should be still stored somewhere, and KBI
will result broken. This, indeed, is not real problem as long as
current calloutng code heavily breaks KBI, but if that was your point,
I don't see how your proposed change could help.


In the old API, it is an error to pass ticks  0, so only broken old
callers are affected.  Of course, if there are any then it would be
hard to detect their garbage cookies.

Anywy, it's too later to change to this, and maybe also to a 32.32
format.

[32.32 format]

This would make a better general format than timevals, timespecs and
of course bintimes :-).  It is a bit wasteful for timeouts since
its extremes are rarely used.  Malicious and broken callers can
still cause overflow at 68 years, so you have to check for it and
handle it.  The limit of 194 days is just as good for timeouts.


I think the phk's proposal  is better. About your overflow objection,
I think is really unlikely to happen, but better safe than sorry.


It's very easy for applications to cause kernel overflow using valid
syscall args like tv_sec = TIME_T_MAX for a relative time in
nanosleep().  Adding TIME_T_MAX to the current time in seconds overflow
for all current times except for the first second after the Epoch.
There is no difference between the overflow for 32-bit and 64-bit
time_t's for this.  This is now mostly handled so that the behaviour is
harmless although wrong.  E.g., the timeout might become negative,
and then since it is not a cookie it is silently replaced by a timeout
of 1 tick.  In nanosleep(), IIRC there are further overflows that result
in returning early instead of retrying the 1-tick timeouts endlessly.

Bruce
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Bruce Evans

On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:



In message 20121220005706.i1...@besplex.bde.org, Bruce Evans writes:

On Wed, 19 Dec 2012, Poul-Henning Kamp wrote:



Except that for absolute timescales, we're running out of the 32 bits
integer part.


Except 32 bit time_t works until 2106 if it is unsigned.


That's sort of not an option.


I think it is.  It is just probably not necessary since 32-bit systems
will go away before 2038.


The real problem was that time_t was not defined as a floating
point number.


That would be convenient too, but bad for efficiency on some systems.
Kernels might not be able to use it, and then would have to use an
alternative representation, which they should have done all along.


[1] A good addition to C would be a general multi-word integer type
where you could ask for any int%d_t or uint%d_t you cared for, and
have the compiler DTRT.  In difference from using a multiword-library,
this would still give these types their natural integer behaviour.


That would be convenient, but bad for efficiency if it were actually
used much.


You can say that about anything but CPU-native operations, and I doubt
it would be as inefficient as struct bintime, which does not have access
to the carry bit.


Yes, I would say that about non-native.  It goes against the spirit of C.

OTOH, compilers are getting closer to giving full access to the carry
bit.  I just checked what clang does in a home-made 128-bit add function:

% static void __noinline
% uadd(struct u *xup, struct u *yup)
% {
%   unsigned long long t;
% 
% 	t = xup-w[0] + yup-w[0];

%   if (t  xup-w[0])
%   xup-w[1]++;
%   xup-w[0] = t;
%   xup-w[1] += yup-w[1];
% }
% 
% 	.align	16, 0x90

%   .type   uadd,@function
% uadd:   # @uadd
%   .cfi_startproc
% # BB#0: # %entry
%   movq(%rdi), %rcx
%   movq8(%rdi), %rax
%   addq(%rsi), %rcx

gcc generates an additional cmpq instruction here.

%   jae .LBB2_2

clang uses the carry bit set by the first addition to avoid the comparison,
but still branches.

% # BB#1: # %if.then
%   incq%rax
%   movq%rax, 8(%rdi)

This adds 1 explicitly instead of using adcq, but this is the slow path.

% .LBB2_2:# %if.end
%   movq%rcx, (%rdi)
%   addq8(%rsi), %rax

This is as efficient as possible except for the extra branch, and the
branch is almost perfectly predictable.

%   movq%rax, 8(%rdi)
%   ret
% .Ltmp22:
%   .size   uadd, .Ltmp22-uadd
%   .cfi_endproc

Bruce
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Poul-Henning Kamp

In message 50d1e0d8.9070...@freebsd.org, Alexander Motin writes:

It would be late to change this after committing. I would definitely 
like it to be done earlier to not redo all the tests, but I think we 
could convert callout and eventtimers code to 32.32 format in several 
days. The only two questions are: do we really want it (won't there be 
any reasons to regret about it) and how do we want it to look?

As much as it pains me to raise this point, we would regret it
if we did not use 32.32, because Linux already went that way.

As much as there is to be said for doing things right, we should
also try to avoid pointless incompatibilities which will make it
needlessly hard for people to move code, particular device drivers
forth and back.


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-19 Thread Bruce Evans

I finally remembered to remove the .it phk :-).

On Wed, 19 Dec 2012, Luigi Rizzo wrote:


On Wed, Dec 19, 2012 at 10:51:48AM +, Poul-Henning Kamp wrote:

...
As I said in my previous email:


typedef dur_t   int64_t;/* signed for bug catching */
#define DURSEC  ((dur_t)1  32)
#define DURMIN  (DURSEC * 60)
#define DURMSEC (DURSEC / 1000)
#define DURUSEC (DURSEC / 1000)
#define DURNSEC (DURSEC / 100)

(Bikeshed the names at your convenience)

Then you can say

callout_foo(34 * DURSEC)
callout_foo(2400 * DURMSEC)
or
callout_foo(500 * DURNSEC)


only thing, we must be careful with the parentheses

For instance, in your macro, DURNSEC evaluates to 0 and so
does any multiple of it.
We should define them as

#define DURNSEC DURSEC / 100
...

so DURNSEC is still 0 and 500*DURNSEC gives 214

I am curious that Bruce did not mention this :)


Er, he was careful.  DURNSEC gives 4, not 0.  This is not very accurate,
but probably good enough.

Your version without parentheses is not so careful and depends on
a magic order of operations and no overflow from this.  E.g.:

500*DURNSEC = 500*DURSEC / 10 = 500*((dur_t)1  32) / 10

This is very accurate and happens not to overflow.  But 5 seconds represented
a little strangely in nanoseconds would overflow:

50*DURNSEC = 50*((dur_t)1  32) / 10

So would 5 billion times DURSEC, but 5 billion seconds is more unreasobable
than 5 billion nanoseconds and the format just can't represent that.



(btw the typedef is swapped, should be typedef int64_t dur_t)


Didn't notice this.

Bruce
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Luigi Rizzo
On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:
 Hi.
 
  I would instead do the following:
 
 I also don't very like the wide API and want to hear fresh ideas, but 
 approaches to time measurement there are too different to do what you 
 are proposing.  Main problem is that while ticks value is relative, 
 bintime is absolute. It is not easy to make conversion between them fast 
 and precise. I've managed to do it, but the only function that does it 
 now is _callout_reset_on(). All other functions are just passing values 
 down. I am not sure I want to duplicate that code in each place, though 
 doing it at least for for callout may be a good idea.

I am afraid the above is not convincing.

Most/all of the APIs i mentioned still have the conversion from
ticks to bintime, and the code in your patch is just
building multiple parallel paths (one for each of the
three versions of the same function) to some final
piece of code where the conversion takes place.

The problem is that all of this goes through a set of obfuscating
macros and the end result is horrible.

To be clear, i believe the work you have been doing on cleaning up
callout is great, i am just saying that this is the time to look
at the code from a few steps away and clean up all those design
decisions that perhaps were made in a haste to make things work.

I will give you another example to show how convoluted
is the code now:

cv_timedwait() and cv_timedwait_sig() now have three
versions each (plain, bt, flags).

These six are remapped through macros to two functions, _cv_timedwait()
and _cv_timedwait_sig(),  with a possible bug (cv_timedwait_bt()
maps to _cv_timedwait_sig() )

These two _cv_timedwait*() take both ticks and bintimes,
and contain this sequence:

+   if (bt == NULL)
+   sleepq_set_timeout_flags(cvp, timo, flags);
+   else
+   sleepq_set_timeout_bt(cvp, bt, precision);

Guess what, both sleepq_* are macros that remap to the same
_sleepq_set_timeout(...) . So the above if (bt == NULL) is useless.

But then if you dig into _sleepq_set_timeout() you'll see

+   if (bt == NULL)
+   callout_reset_flags_on(td-td_slpcallout, timo,
+   sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
+   else
+   callout_reset_bt_on(td-td_slpcallout, bt, precision,
+   sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);

and again both callout_reset*() are again remapped through
macros to _callout_reset_on(), so another useless if (bt == NULL)
And in the end you have the conversion from ticks to bintime.

So basically the code path for cv_timedwait() has those
two useless switches and one useless extra argument,
and the conversion from ticks to bintime is down
deep down in _callout_reset_on() where it can only
be resolved at runtime, whereas by doing the conversion
at the beginning the decision could have been made at compile
time.

So I believe my proposal would give large simplifications in
the code and lead to a much cleaner implementation of what
you have designed:

1. acknowledge the fact that the only representation of time
   that callouts use internally is a bintime+precision, define one
   single function (instead of two or three or six) that implements
   the blessed API, and implement the others with macros or
   inline functions doing the appropriate conversions;

2. specifically, the *_flags() variant has no reason to exist.
   It can be implemented through the *_bt() variant, and
   being a new function the only places where you introduce it
   require manual modifications so you can directly invoke
   the new function.

Again, please take this as constructive criticism, as i really
like the work you have been doing and appreciate the time and
effort you are putting on it

cheers
luigi

 
 Creating sets of three functions I had three different goals:
  - callout_reset() -- it is legacy variant required to keep API 
 compatibility;
  - callout_reset_flags() -- it is for cases where custom precision 
 specification needs to be added to the existing code, or where direct 
 callout execution is needed. Conversion to bintime would additionally 
 complicate consumer code, that I would try to avoid.
  - callout_reset_bt() -- API for the new code, which needs high 
 precision and doesn't mind to operate bintime. Now there is only three 
 such places in kernel now, and I don't think there will be much more.
 
 Respectively, these three options are replicated to other APIs where 
 time intervals are used.
 
 PS: Please keep me in CC.
 
 -- 
 Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Luigi Rizzo
On Tue, Dec 18, 2012 at 06:36:43PM +0100, Luigi Rizzo wrote:
 On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:
...
 So I believe my proposal would give large simplifications in
 the code and lead to a much cleaner implementation of what
 you have designed:
 
 1. acknowledge the fact that the only representation of time
that callouts use internally is a bintime+precision, define one
single function (instead of two or three or six) that implements
the blessed API, and implement the others with macros or
inline functions doing the appropriate conversions;
 
 2. specifically, the *_flags() variant has no reason to exist.
It can be implemented through the *_bt() variant, and
being a new function the only places where you introduce it
require manual modifications so you can directly invoke
the new function.

to clarify: i am not sure if now the *_bt() variant takes flags too,
but my point is that the main API function should take all
supported arguments (including flags) and others should
simply be regarded as simplified versions. More or less what
we have for sockets, with send() and sendmsg() and friend

cheers
luigi
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Alexander Motin

On 18.12.2012 19:36, Luigi Rizzo wrote:

On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:

I would instead do the following:


I also don't very like the wide API and want to hear fresh ideas, but
approaches to time measurement there are too different to do what you
are proposing.  Main problem is that while ticks value is relative,
bintime is absolute. It is not easy to make conversion between them fast
and precise. I've managed to do it, but the only function that does it
now is _callout_reset_on(). All other functions are just passing values
down. I am not sure I want to duplicate that code in each place, though
doing it at least for for callout may be a good idea.


I am afraid the above is not convincing.

Most/all of the APIs i mentioned still have the conversion from
ticks to bintime, and the code in your patch is just
building multiple parallel paths (one for each of the
three versions of the same function) to some final
piece of code where the conversion takes place.

The problem is that all of this goes through a set of obfuscating
macros and the end result is horrible.

To be clear, i believe the work you have been doing on cleaning up
callout is great, i am just saying that this is the time to look
at the code from a few steps away and clean up all those design
decisions that perhaps were made in a haste to make things work.

I will give you another example to show how convoluted
is the code now:

cv_timedwait() and cv_timedwait_sig() now have three
versions each (plain, bt, flags).

These six are remapped through macros to two functions, _cv_timedwait()
and _cv_timedwait_sig(),  with a possible bug (cv_timedwait_bt()
maps to _cv_timedwait_sig() )

These two _cv_timedwait*() take both ticks and bintimes,
and contain this sequence:

+   if (bt == NULL)
+   sleepq_set_timeout_flags(cvp, timo, flags);
+   else
+   sleepq_set_timeout_bt(cvp, bt, precision);

Guess what, both sleepq_* are macros that remap to the same
_sleepq_set_timeout(...) . So the above if (bt == NULL) is useless.

But then if you dig into _sleepq_set_timeout() you'll see

+   if (bt == NULL)
+   callout_reset_flags_on(td-td_slpcallout, timo,
+   sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
+   else
+   callout_reset_bt_on(td-td_slpcallout, bt, precision,
+   sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);

and again both callout_reset*() are again remapped through
macros to _callout_reset_on(), so another useless if (bt == NULL)
And in the end you have the conversion from ticks to bintime.

So basically the code path for cv_timedwait() has those
two useless switches and one useless extra argument,
and the conversion from ticks to bintime is down
deep down in _callout_reset_on() where it can only
be resolved at runtime, whereas by doing the conversion
at the beginning the decision could have been made at compile
time.

So I believe my proposal would give large simplifications in
the code and lead to a much cleaner implementation of what
you have designed:

1. acknowledge the fact that the only representation of time
that callouts use internally is a bintime+precision, define one
single function (instead of two or three or six) that implements
the blessed API, and implement the others with macros or
inline functions doing the appropriate conversions;

2. specifically, the *_flags() variant has no reason to exist.
It can be implemented through the *_bt() variant, and
being a new function the only places where you introduce it
require manual modifications so you can directly invoke
the new function.

Again, please take this as constructive criticism, as i really
like the work you have been doing and appreciate the time and
effort you are putting on it


Your words about useless cascaded ifs touched me. Actually I've looked 
on _callout_reset_bt_on() yesterday, thinking about moving tick to bt 
conversion to separate function or wrapper. The only thing we would save 
in such case is single integer argument (ticks), as all others (bt, 
prec, flags) are used in the new world order. From the other side, to 
make the conversion process really effective and correct, I've used 
quite specific way of obtaining time, that may change in the future. I 
would not really like it to be inlined in every consumer function and 
become an ABI. So I see two possible ways: make that conversion a 
separate non-inline function (that will require two temporary variables 
to return results and will consume some time on call/return), or make 
callout_reset_bt_on() to have extra ticks argument, allowing to use it 
in one or another way without external ifs and macros. In last case all 
_bt functions in other APIs will also obtain ticks, bt, pr and flags 
arguments. Actually flags there could be used to specify time scale 
(monotonic or wall) and time base (relative or absolute), if we 

Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Alexander Motin

On 18.12.2012 20:03, Alexander Motin wrote:

On 18.12.2012 19:36, Luigi Rizzo wrote:

On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:

I would instead do the following:


I also don't very like the wide API and want to hear fresh ideas, but
approaches to time measurement there are too different to do what you
are proposing.  Main problem is that while ticks value is relative,
bintime is absolute. It is not easy to make conversion between them fast
and precise. I've managed to do it, but the only function that does it
now is _callout_reset_on(). All other functions are just passing values
down. I am not sure I want to duplicate that code in each place, though
doing it at least for for callout may be a good idea.


I am afraid the above is not convincing.

Most/all of the APIs i mentioned still have the conversion from
ticks to bintime, and the code in your patch is just
building multiple parallel paths (one for each of the
three versions of the same function) to some final
piece of code where the conversion takes place.

The problem is that all of this goes through a set of obfuscating
macros and the end result is horrible.

To be clear, i believe the work you have been doing on cleaning up
callout is great, i am just saying that this is the time to look
at the code from a few steps away and clean up all those design
decisions that perhaps were made in a haste to make things work.

I will give you another example to show how convoluted
is the code now:

cv_timedwait() and cv_timedwait_sig() now have three
versions each (plain, bt, flags).

These six are remapped through macros to two functions, _cv_timedwait()
and _cv_timedwait_sig(),  with a possible bug (cv_timedwait_bt()
maps to _cv_timedwait_sig() )

These two _cv_timedwait*() take both ticks and bintimes,
and contain this sequence:

+if (bt == NULL)
+sleepq_set_timeout_flags(cvp, timo, flags);
+else
+sleepq_set_timeout_bt(cvp, bt, precision);

Guess what, both sleepq_* are macros that remap to the same
_sleepq_set_timeout(...) . So the above if (bt == NULL) is useless.

But then if you dig into _sleepq_set_timeout() you'll see

+if (bt == NULL)
+callout_reset_flags_on(td-td_slpcallout, timo,
+sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
+else
+callout_reset_bt_on(td-td_slpcallout, bt, precision,
+sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);

and again both callout_reset*() are again remapped through
macros to _callout_reset_on(), so another useless if (bt == NULL)
And in the end you have the conversion from ticks to bintime.

So basically the code path for cv_timedwait() has those
two useless switches and one useless extra argument,
and the conversion from ticks to bintime is down
deep down in _callout_reset_on() where it can only
be resolved at runtime, whereas by doing the conversion
at the beginning the decision could have been made at compile
time.

So I believe my proposal would give large simplifications in
the code and lead to a much cleaner implementation of what
you have designed:

1. acknowledge the fact that the only representation of time
that callouts use internally is a bintime+precision, define one
single function (instead of two or three or six) that implements
the blessed API, and implement the others with macros or
inline functions doing the appropriate conversions;

2. specifically, the *_flags() variant has no reason to exist.
It can be implemented through the *_bt() variant, and
being a new function the only places where you introduce it
require manual modifications so you can directly invoke
the new function.

Again, please take this as constructive criticism, as i really
like the work you have been doing and appreciate the time and
effort you are putting on it


Your words about useless cascaded ifs touched me. Actually I've looked
on _callout_reset_bt_on() yesterday, thinking about moving tick to bt
conversion to separate function or wrapper. The only thing we would save
in such case is single integer argument (ticks), as all others (bt,
prec, flags) are used in the new world order. From the other side, to
make the conversion process really effective and correct, I've used
quite specific way of obtaining time, that may change in the future. I
would not really like it to be inlined in every consumer function and
become an ABI. So I see two possible ways: make that conversion a
separate non-inline function (that will require two temporary variables
to return results and will consume some time on call/return), or make
callout_reset_bt_on() to have extra ticks argument, allowing to use it
in one or another way without external ifs and macros. In last case all
_bt functions in other APIs will also obtain ticks, bt, pr and flags
arguments. Actually flags there could be used to specify time scale
(monotonic or wall) and time base (relative or absolute), if we decide
to implement all 

Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Luigi Rizzo
[top posting for readability;
in summary we were discussing the new callout API trying to avoid
an explosion of methods and arguments while at the same time
supporting the old API and the new one]
(I am also Cc-ing phk as he might have better insight
on the topic).

I think the patch you propose is a step in the right direction,
but i still remain concerned by having to pass two bintimes
(by reference, but they should really go by value)
and one 'ticks' value to all these functions.

I am also dubious that we need a full 128 bits to specify
the 'precision': there would be absolutely no loss of functionality
if we decided to specify the precision in powers of 2, so a precision
'k' (signed) means 2^k seconds. This way 8 bits are enough to
represent any precision we want.

The key difference between 'ticks' and bintimes (and the main
difficulty in the conversion) is that ticks are relative and bintimes
are interpreted as absolute. This could be easily solved by using
a flag to specify if the 'bt' argument is absolute or relative, and
passing the argument by value.

So now the flags could contain C_DIRECT_EXEC, C_BT_IS_RELATIVE, the
precision, and another 64 or 128 bit field contains the bintime.

How does this look ?

cheers
luigi

 On 18.12.2012 20:03, Alexander Motin wrote:
 On 18.12.2012 19:36, Luigi Rizzo wrote:
 On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:
 I would instead do the following:
 
 I also don't very like the wide API and want to hear fresh ideas, but
 approaches to time measurement there are too different to do what you
 are proposing.  Main problem is that while ticks value is relative,
 bintime is absolute. It is not easy to make conversion between them fast
 and precise. I've managed to do it, but the only function that does it
 now is _callout_reset_on(). All other functions are just passing values
 down. I am not sure I want to duplicate that code in each place, though
 doing it at least for for callout may be a good idea.
 
 I am afraid the above is not convincing.
 
 Most/all of the APIs i mentioned still have the conversion from
 ticks to bintime, and the code in your patch is just
 building multiple parallel paths (one for each of the
 three versions of the same function) to some final
 piece of code where the conversion takes place.
 
 The problem is that all of this goes through a set of obfuscating
 macros and the end result is horrible.
 
 To be clear, i believe the work you have been doing on cleaning up
 callout is great, i am just saying that this is the time to look
 at the code from a few steps away and clean up all those design
 decisions that perhaps were made in a haste to make things work.
 
 I will give you another example to show how convoluted
 is the code now:
 
 cv_timedwait() and cv_timedwait_sig() now have three
 versions each (plain, bt, flags).
 
 These six are remapped through macros to two functions, _cv_timedwait()
 and _cv_timedwait_sig(),  with a possible bug (cv_timedwait_bt()
 maps to _cv_timedwait_sig() )
 
 These two _cv_timedwait*() take both ticks and bintimes,
 and contain this sequence:
 
 +if (bt == NULL)
 +sleepq_set_timeout_flags(cvp, timo, flags);
 +else
 +sleepq_set_timeout_bt(cvp, bt, precision);
 
 Guess what, both sleepq_* are macros that remap to the same
 _sleepq_set_timeout(...) . So the above if (bt == NULL) is useless.
 
 But then if you dig into _sleepq_set_timeout() you'll see
 
 +if (bt == NULL)
 +callout_reset_flags_on(td-td_slpcallout, timo,
 +sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
 +else
 +callout_reset_bt_on(td-td_slpcallout, bt, precision,
 +sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
 
 and again both callout_reset*() are again remapped through
 macros to _callout_reset_on(), so another useless if (bt == NULL)
 And in the end you have the conversion from ticks to bintime.
 
 So basically the code path for cv_timedwait() has those
 two useless switches and one useless extra argument,
 and the conversion from ticks to bintime is down
 deep down in _callout_reset_on() where it can only
 be resolved at runtime, whereas by doing the conversion
 at the beginning the decision could have been made at compile
 time.
 
 So I believe my proposal would give large simplifications in
 the code and lead to a much cleaner implementation of what
 you have designed:
 
 1. acknowledge the fact that the only representation of time
 that callouts use internally is a bintime+precision, define one
 single function (instead of two or three or six) that implements
 the blessed API, and implement the others with macros or
 inline functions doing the appropriate conversions;
 
 2. specifically, the *_flags() variant has no reason to exist.
 It can be implemented through the *_bt() variant, and
 being a new function the only places where you introduce it
 require manual modifications so you can 

Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Ian Lepore
On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:
 [top posting for readability;
 in summary we were discussing the new callout API trying to avoid
 an explosion of methods and arguments while at the same time
 supporting the old API and the new one]
 (I am also Cc-ing phk as he might have better insight
 on the topic).
 
 I think the patch you propose is a step in the right direction,
 but i still remain concerned by having to pass two bintimes
 (by reference, but they should really go by value)
 and one 'ticks' value to all these functions.
 
 I am also dubious that we need a full 128 bits to specify
 the 'precision': there would be absolutely no loss of functionality
 if we decided to specify the precision in powers of 2, so a precision
 'k' (signed) means 2^k seconds. This way 8 bits are enough to
 represent any precision we want.
 
 The key difference between 'ticks' and bintimes (and the main
 difficulty in the conversion) is that ticks are relative and bintimes
 are interpreted as absolute. This could be easily solved by using
 a flag to specify if the 'bt' argument is absolute or relative, and
 passing the argument by value.
 
 So now the flags could contain C_DIRECT_EXEC, C_BT_IS_RELATIVE, the
 precision, and another 64 or 128 bit field contains the bintime.
 
 How does this look ?
 
 cheers
 luigi
 

I tend to agree that the bintime should be passed by value instead of
reference.  That would allow an inline tickstobintime() that converts
relative ticks to an absolute bintime returned by value and passed right
along in one tidy line/clump of code without any temporary variables
cluttering things up.  

While the 1980s C programmer in me still wants to avoid returning
complex objects by value, the reality is that modern compilers tend to
generate really nice code for such constructs, usually without any
copying of the return value at all.

I'm not so sure about the 2^k precision.  You speak of seconds, but I
would be worrying about sub-second precision in my work.  It would
typical to want a 500uS timeout but be willing to late by up to 250uS if
that helped scheduling and performance.  Also, my idea of precision
would virtually always be I'm willing to be late by this much, but
never early by any amount.

To reinforce the point of that last paragraph... the way I'm looking at
these changes has nothing to do with power saving (I've never owned a
battery-operated computer, probably never will) and everything to do
with performance and being able to sleep accurately for less than a
tick.

-- Ian

  On 18.12.2012 20:03, Alexander Motin wrote:
  On 18.12.2012 19:36, Luigi Rizzo wrote:
  On Mon, Dec 17, 2012 at 11:03:53PM +0200, Alexander Motin wrote:
  I would instead do the following:
  
  I also don't very like the wide API and want to hear fresh ideas, but
  approaches to time measurement there are too different to do what you
  are proposing.  Main problem is that while ticks value is relative,
  bintime is absolute. It is not easy to make conversion between them fast
  and precise. I've managed to do it, but the only function that does it
  now is _callout_reset_on(). All other functions are just passing values
  down. I am not sure I want to duplicate that code in each place, though
  doing it at least for for callout may be a good idea.
  
  I am afraid the above is not convincing.
  
  Most/all of the APIs i mentioned still have the conversion from
  ticks to bintime, and the code in your patch is just
  building multiple parallel paths (one for each of the
  three versions of the same function) to some final
  piece of code where the conversion takes place.
  
  The problem is that all of this goes through a set of obfuscating
  macros and the end result is horrible.
  
  To be clear, i believe the work you have been doing on cleaning up
  callout is great, i am just saying that this is the time to look
  at the code from a few steps away and clean up all those design
  decisions that perhaps were made in a haste to make things work.
  
  I will give you another example to show how convoluted
  is the code now:
  
  cv_timedwait() and cv_timedwait_sig() now have three
  versions each (plain, bt, flags).
  
  These six are remapped through macros to two functions, _cv_timedwait()
  and _cv_timedwait_sig(),  with a possible bug (cv_timedwait_bt()
  maps to _cv_timedwait_sig() )
  
  These two _cv_timedwait*() take both ticks and bintimes,
  and contain this sequence:
  
  +if (bt == NULL)
  +sleepq_set_timeout_flags(cvp, timo, flags);
  +else
  +sleepq_set_timeout_bt(cvp, bt, precision);
  
  Guess what, both sleepq_* are macros that remap to the same
  _sleepq_set_timeout(...) . So the above if (bt == NULL) is useless.
  
  But then if you dig into _sleepq_set_timeout() you'll see
  
  +if (bt == NULL)
  +callout_reset_flags_on(td-td_slpcallout, timo,
  +sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
  +else
  +

Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Luigi Rizzo
On Tue, Dec 18, 2012 at 04:27:45PM -0700, Ian Lepore wrote:
 On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:
  [top posting for readability;
  in summary we were discussing the new callout API trying to avoid
  an explosion of methods and arguments while at the same time
  supporting the old API and the new one]
  (I am also Cc-ing phk as he might have better insight
  on the topic).
  
  I think the patch you propose is a step in the right direction,
  but i still remain concerned by having to pass two bintimes
  (by reference, but they should really go by value)
  and one 'ticks' value to all these functions.
  
  I am also dubious that we need a full 128 bits to specify
  the 'precision': there would be absolutely no loss of functionality
  if we decided to specify the precision in powers of 2, so a precision
  'k' (signed) means 2^k seconds. This way 8 bits are enough to
  represent any precision we want.

...
 I'm not so sure about the 2^k precision.  You speak of seconds, but I
 would be worrying about sub-second precision in my work.  It would
 typical to want a 500uS timeout but be willing to late by up to 250uS if

i said k is signed so negative values represent fractions of a
second. 2^-128 is pretty short :)

cheers
luigi
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Ian Lepore
On Wed, 2012-12-19 at 00:29 +0100, Luigi Rizzo wrote:
 On Tue, Dec 18, 2012 at 04:27:45PM -0700, Ian Lepore wrote:
  On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:
   [top posting for readability;
   in summary we were discussing the new callout API trying to avoid
   an explosion of methods and arguments while at the same time
   supporting the old API and the new one]
   (I am also Cc-ing phk as he might have better insight
   on the topic).
   
   I think the patch you propose is a step in the right direction,
   but i still remain concerned by having to pass two bintimes
   (by reference, but they should really go by value)
   and one 'ticks' value to all these functions.
   
   I am also dubious that we need a full 128 bits to specify
   the 'precision': there would be absolutely no loss of functionality
   if we decided to specify the precision in powers of 2, so a precision
   'k' (signed) means 2^k seconds. This way 8 bits are enough to
   represent any precision we want.
 
 ...
  I'm not so sure about the 2^k precision.  You speak of seconds, but I
  would be worrying about sub-second precision in my work.  It would
  typical to want a 500uS timeout but be willing to late by up to 250uS if
 
 i said k is signed so negative values represent fractions of a
 second. 2^-128 is pretty short :)
 
 cheers
 luigi

Ahh, I missed that.  Good enough then!  Hmmm, if that ideas survives
further review, then could precision be encoded in 8 bits of the flags,
eliminating another parm?

-- Ian


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-18 Thread Luigi Rizzo
On Tue, Dec 18, 2012 at 04:37:10PM -0700, Ian Lepore wrote:
 On Wed, 2012-12-19 at 00:29 +0100, Luigi Rizzo wrote:
  On Tue, Dec 18, 2012 at 04:27:45PM -0700, Ian Lepore wrote:
   On Tue, 2012-12-18 at 23:58 +0100, Luigi Rizzo wrote:
[top posting for readability;
in summary we were discussing the new callout API trying to avoid
an explosion of methods and arguments while at the same time
supporting the old API and the new one]
(I am also Cc-ing phk as he might have better insight
on the topic).

I think the patch you propose is a step in the right direction,
but i still remain concerned by having to pass two bintimes
(by reference, but they should really go by value)
and one 'ticks' value to all these functions.

I am also dubious that we need a full 128 bits to specify
the 'precision': there would be absolutely no loss of functionality
if we decided to specify the precision in powers of 2, so a precision
'k' (signed) means 2^k seconds. This way 8 bits are enough to
represent any precision we want.
  
  ...
   I'm not so sure about the 2^k precision.  You speak of seconds, but I
   would be worrying about sub-second precision in my work.  It would
   typical to want a 500uS timeout but be willing to late by up to 250uS if
  
  i said k is signed so negative values represent fractions of a
  second. 2^-128 is pretty short :)
  
  cheers
  luigi
 
 Ahh, I missed that.  Good enough then!  Hmmm, if that ideas survives
 further review, then could precision be encoded in 8 bits of the flags,
 eliminating another parm?

that was also what i wrote later in the message :)

now we should figure out some use for the remaining 22 bits of the flags

cheers
luigi

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


API explosion (Re: [RFC/RFT] calloutng)

2012-12-17 Thread Luigi Rizzo
[again, response to another issue i raised]

On Fri, Dec 14, 2012 at 01:57:36PM +0100, Davide Italiano wrote:
 On Fri, Dec 14, 2012 at 7:41 AM, Luigi Rizzo ri...@iet.unipi.it wrote:
...
  Finally, a more substantial comment:
  - a lot of functions which formerly had only a timo argument
now have timo, bt, precision, flags. Take seltdwait() as an example.
 
 
 seltdwait() is not part of the public KPI. It has been modified to
 avoid code duplication. Having seltdwait() and seltdwait_bt(), i.e.
 two separate functions, even though we could share most of the code is
 not a clever approach, IMHO.
 As I told before, seltdwait() is not exposed so we can modify its
 argument without breaking anything.
 
It seems that you have been undecided between two approaches:
for some of these functions you have preserved the original function
that deals with ticks and introduced a new one that deals with the
  bintime,
whereas in other cases you have modified the original function to add
bt, precision, flags.
 
 
 I'm not. All the functions which are part of the public KPI (e.g.
 condvar(9), sleepq(9), *) are still available.  *_flags variants have
 been introduced so that consumers can take advantage of the new
 'precision tolerance mechanism' implemented. Also, *_bt variants have
 been introduced. I don't see any undecision between the two
 approaches.
 Please note that now the callout backend deals with bintime, so every
 time callout_reset_on() is called, the 'tick' argument passed is
 silently converted to bintime.

I will try to give more specific example, using the latest patch
from mav

http://people.freebsd.org/~mav/calloutng_12_16.patch

In the manpage, for instance, the existing functions
now are extended with two more variants (sometimes;
msleep_spin() for instance is missing msleep_spin_bt()
but perhaps that is just an oversight).

 .Nm sleepq_set_timeout ,
+.Nm sleepq_set_timeout_flags ,
+.Nm sleepq_set_timeout_bt ,

 .Nm msleep ,
+.Nm msleep_flags ,
+.Nm msleep_bt ,
 .Nm msleep_spin ,
+.Nm msleep_spin_flags ,
 .Nm pause ,
+.Nm pause_flags ,
+.Nm pause_bt ,
 .Nm tsleep ,
+.Nm tsleep_flags ,
+.Nm tsleep_bt ,

 .Nm cv_timedwait ,
+.Nm cv_timedwait_bt ,
+.Nm cv_timedwait_flags ,
 .Nm cv_timedwait_sig ,
+.Nm cv_timedwait_sig_bt ,
+.Nm cv_timedwait_sig_flags ,

 .Nm callout_reset ,
+.Nm callout_reset_flags ,
 .Nm callout_reset_on ,
+.Nm callout_reset_flags_on ,
+.Nm callout_reset_bt_on ,

If you look at the backends, they take both a timo and a bintime.

-int_cv_timedwait(struct cv *cvp, struct lock_object *lock, int 
timo);
-int_cv_timedwait_sig(struct cv *cvp, struct lock_object *lock, int 
timo);
+int_cv_timedwait(struct cv *cvp, struct lock_object *lock,
+   struct bintime *bt, struct bintime *precision, int timo,
+   int flags);
+int_cv_timedwait_sig(struct cv *cvp, struct lock_object *lock,
+   struct bintime *bt, struct bintime *precision, int timo,
+   int flags);

and then internally they call the 'timo' or the 'bt' version
depending on the case

+   if (bt == NULL)
+   sleepq_set_timeout_flags(cvp, timo, flags);
+   else
+   sleepq_set_timeout_bt(cvp, bt, precision);

So basically you are doing the following:
   + create two new variant for each existing function

foo(, ... timo, ... )   old method
foo_flags(, ... timo, ... ) new method
foo_bt(... , bt, precision, ...)new method

 (the 'API explosion' i am mentioning in the Subject)

   + the variants are mapped to the same internal function
_foo_internal(..., timo, bt, precision, flags, ...)

   + and then the internal function has a (runtime) conditional

if (bt == NULL) {
// convert timo to bt
} else {
// have a bt + precision
}
...

I would instead do the following:

+ create a new API function that takes only bintime+precision+flags, no ticks.
  I am not sure if there is even a need to have an internal name
_cv_timedwait_sig(  )
  or you can just have
cv_timedwait_sig_bt(...)

+ use a macro or an inline function to remap the old API to
  the (single) new one, making the argument conversion immediately.
#define cv_timedwait_sig(...)   cv_timedwait_sig_bt( ...)

  This has the advantage that conversions might be done at compile
  time perhaps with some advantage in terms of code and performance.

+ do not bother creating yet another cv_timedwait_sig_flags() function.
  Since it did not exist before, you have to do the conversion manually
  anyways, at which point you just change the argument to use bintime
  instead of ticks.


Re: API explosion (Re: [RFC/RFT] calloutng)

2012-12-17 Thread Alexander Motin

Hi.

 I would instead do the following:

I also don't very like the wide API and want to hear fresh ideas, but 
approaches to time measurement there are too different to do what you 
are proposing.  Main problem is that while ticks value is relative, 
bintime is absolute. It is not easy to make conversion between them fast 
and precise. I've managed to do it, but the only function that does it 
now is _callout_reset_on(). All other functions are just passing values 
down. I am not sure I want to duplicate that code in each place, though 
doing it at least for for callout may be a good idea.


Creating sets of three functions I had three different goals:
 - callout_reset() -- it is legacy variant required to keep API 
compatibility;
 - callout_reset_flags() -- it is for cases where custom precision 
specification needs to be added to the existing code, or where direct 
callout execution is needed. Conversion to bintime would additionally 
complicate consumer code, that I would try to avoid.
 - callout_reset_bt() -- API for the new code, which needs high 
precision and doesn't mind to operate bintime. Now there is only three 
such places in kernel now, and I don't think there will be much more.


Respectively, these three options are replicated to other APIs where 
time intervals are used.


PS: Please keep me in CC.

--
Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org