Re: Introduction of long term scheduling

2007-01-06 Thread Tony Finch
On Sat, 6 Jan 2007, M. Warner Losh wrote:

 OSes usually deal with timestamps all the time for various things.  To
 find out how much CPU to bill a process, to more mundane things.
 Having to do all these gymnastics is going to hurt performance.

That's why leap second handling should be done in userland as part of the
conversion from clock (scalar) time to civil (broken-down) time.

Tony.
--
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
SOUTHEAST ICELAND: SOUTHWEST BECOMING CYCLONIC 5 TO 7, PERHAPS GALE 8 LATER.
ROUGH TO HIGH. SQUALLY SHOWERS. MAINLY GOOD.
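
A minimal sketch of the userland conversion Tony describes, in Python. It assumes the clock hands out a uniform count of elapsed seconds (leap seconds included) and that userland has a leap-second table. The names LEAP_MIDNIGHTS and uniform_to_civil, and the choice to let the uniform count coincide with POSIX time before the first listed leap second, are illustrative assumptions rather than any system's actual interface.

```python
import calendar
import time

# POSIX timestamps of the UTC midnights that immediately followed an
# inserted leap second (illustrative subset: end of 1998 and end of 2005).
LEAP_MIDNIGHTS = [
    calendar.timegm((1999, 1, 1, 0, 0, 0)),
    calendar.timegm((2006, 1, 1, 0, 0, 0)),
]


def uniform_to_civil(u):
    """Convert a uniform second count to (Y, M, D, h, m, s) in UTC."""
    offset = 0
    for i, midnight in enumerate(LEAP_MIDNIGHTS):
        leap_u = midnight + i  # uniform count of the inserted second itself
        if u == leap_u:
            # The inserted second is shown as 23:59:60 of the previous day.
            t = time.gmtime(midnight - 1)
            return (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min, 60)
        if u > leap_u:
            offset = i + 1  # this leap second is already behind us
    t = time.gmtime(u - offset)
    return (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)


# Walk across the leap second at the end of 2005.
start = LEAP_MIDNIGHTS[1]
for u in (start, start + 1, start + 2):
    print(uniform_to_civil(u))
```

The three printed tuples step through 23:59:59, 23:59:60 and 00:00:00, while the scalar count itself simply increases by one each time.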


Re: Introduction of long term scheduling

2007-01-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Tony Finch [EMAIL PROTECTED] writes:
: On Sat, 6 Jan 2007, M. Warner Losh wrote:
: 
:  OSes usually deal with timestamps all the time for various things.  To
:  find out how much CPU to bill a process, to more mundane things.
:  Having to do all these gymnastics is going to hurt performance.
:
: That's why leap second handling should be done in userland as part of the
: conversion from clock (scalar) time to civil (broken-down) time.

Right.  And that's what makes things hard, because the kernel time
clock needs to be monotonic, and leap seconds break that rule if one
does things in UTC such that the naive math just works (aka POSIX
time_t).  Some systems punt on keeping POSIX time internally, but then
have complications getting leap seconds right for the times they
return to userland.

Warner
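
To make the time_t problem concrete, here is a small Python check. It relies on calendar.timegm applying the POSIX-style arithmetic (fixed 86400-second days) without range-checking the seconds field; the variable names are only for illustration.

```python
import calendar

# The inserted second 2005-12-31 23:59:60 UTC and the midnight after it.
before = calendar.timegm((2005, 12, 31, 23, 59, 59))
leap_second = calendar.timegm((2005, 12, 31, 23, 59, 60))
next_midnight = calendar.timegm((2006, 1, 1, 0, 0, 0))

# Two distinct UTC seconds collapse onto one scalar value ...
print(leap_second == next_midnight)

# ... so the naive difference across the leap second is one second,
# although two seconds actually elapsed.
print(next_midnight - before)
```

A clock that reports these values either repeats a value or steps back during the leap second, which is the monotonicity problem described above.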


Re: Introduction of long term scheduling

2007-01-06 Thread Rob Seaman

Warner Losh wrote:


leap seconds break that rule if one does things in UTC such that
the naive math just works


All civil timekeeping, and most precision timekeeping, requires only
pretty naive math.  Whatever the problem is - or is not - with leap
seconds, it isn't the arithmetic involved.  Take a look at [EMAIL PROTECTED]
and other BOINC projects.  Modern computers have firepower to burn in
fluff like live 3-D screensavers.  POSIX time handling just sucks for
no good reason.  Other system interfaces successfully implement
significantly more stringent facilities.

Expecting to be able to naively subtract timestamps to compute an
accurate interval reminds me of expecting to be able to naively stuff
pointers into integer datatypes and have nothing ever go wrong.  A
strongly typed language might even overload the subtraction of UTC
typed variables with the correct time-of-day to interval
calculations.  But then, what should one expect the subtraction of
Earth orientation values to return but some sort of angle, not an
interval?

Rob
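
A rough Python sketch of the overloaded subtraction Rob has in mind. The UtcTime type, the leaps_before helper and the two-entry table are hypothetical; the sketch also sidesteps the inserted second itself, which a POSIX-style count cannot name.

```python
import calendar
from dataclasses import dataclass

# POSIX timestamps of the UTC midnights that immediately followed an
# inserted leap second (illustrative subset: end of 1998 and end of 2005).
LEAP_MIDNIGHTS = (
    calendar.timegm((1999, 1, 1, 0, 0, 0)),
    calendar.timegm((2006, 1, 1, 0, 0, 0)),
)


def leaps_before(posix):
    """Number of listed leap seconds inserted at or before this instant."""
    return sum(1 for midnight in LEAP_MIDNIGHTS if midnight <= posix)


@dataclass(frozen=True)
class UtcTime:
    posix: int  # POSIX-style count, which skips leap seconds

    def __sub__(self, other):
        """Elapsed seconds between two UTC instants, leap seconds included."""
        if not isinstance(other, UtcTime):
            return NotImplemented
        naive = self.posix - other.posix
        return naive + leaps_before(self.posix) - leaps_before(other.posix)


a = UtcTime(calendar.timegm((2005, 12, 31, 23, 59, 59)))
b = UtcTime(calendar.timegm((2006, 1, 1, 0, 0, 0)))
print(b.posix - a.posix)  # naive scalar difference
print(b - a)              # typed difference, counting the leap second
```

The typed difference is one larger than the naive one for this pair, because the leap second at the end of 2005 lies between the two instants.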


Re: Introduction of long term scheduling

2007-01-06 Thread Poul-Henning Kamp
In message [EMAIL PROTECTED], Tony Finch writes:
On Sat, 6 Jan 2007, M. Warner Losh wrote:

 OSes usually deal with timestamps all the time for various things.  To
 find out how much CPU to bill a process, to more mundane things.
 Having to do all these gymnastics is going to hurt performance.

That's why leap second handling should be done in userland as part of the
conversion from clock (scalar) time to civil (broken-down) time.

I would agree with you in theory, but badly designed filesystems
like FAT store timestamps in encoded YMDHMS format, so the kernel
needs to know the trick as well. (There are other examples, but not
as well known).

--
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
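
For reference, the FAT layout PHK mentions packs a broken-down time into two 16-bit words: the date word holds years since 1980 (7 bits), month (4 bits) and day (5 bits); the time word holds hours (5 bits), minutes (6 bits) and seconds divided by two (5 bits). A sketch in Python follows. The function names are made up, and converting through UTC with time.gmtime is a simplification, since FAT timestamps carry no time zone.

```python
import calendar
import time


def fat_pack(year, month, day, hour, minute, second):
    """Pack a broken-down time into FAT (date_word, time_word)."""
    if not 1980 <= year <= 2107:
        raise ValueError("year outside the range FAT can represent")
    date_word = ((year - 1980) << 9) | (month << 5) | day
    # Seconds are stored in two-second units; clamp so :60 cannot overflow.
    time_word = (hour << 11) | (minute << 5) | (min(second, 59) // 2)
    return date_word, time_word


def fat_unpack(date_word, time_word):
    """Unpack FAT words back into (Y, M, D, h, m, s)."""
    return (
        1980 + (date_word >> 9),
        (date_word >> 5) & 0x0F,
        date_word & 0x1F,
        time_word >> 11,
        (time_word >> 5) & 0x3F,
        (time_word & 0x1F) * 2,
    )


def fat_from_posix(posix):
    """What a kernel with a scalar clock must do before writing to FAT."""
    t = time.gmtime(posix)
    return fat_pack(t.tm_year, t.tm_mon, t.tm_mday,
                    t.tm_hour, t.tm_min, t.tm_sec)


stamp = fat_from_posix(calendar.timegm((2007, 1, 6, 19, 36, 19)))
print([hex(word) for word in stamp])
print(fat_unpack(*stamp))
```

The round trip loses the odd second (19 comes back as 18), and every write of a timestamp needs the scalar to broken-down conversion inside the kernel.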


Re: Introduction of long term scheduling

2007-01-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Rob Seaman [EMAIL PROTECTED] writes:
: Warner Losh wrote:
:
:  leap seconds break that rule if one does things in UTC such that
:  the naive math just works
:
: All civil timekeeping, and most precision timekeeping, requires only
: pretty naive math.  Whatever the problem is - or is not - with leap
: seconds, it isn't the arithmetic involved.  Take a look at [EMAIL PROTECTED]
: and other BOINC projects.  Modern computers have firepower to burn in
: fluff like live 3-D screensavers.  POSIX time handling just sucks for
: no good reason.  Other system interfaces successfully implement
: significantly more stringent facilities.

But modern servers and routers don't.  Anything that makes the math
harder (more computationally expensive) can have huge effects on
performance in these areas.  That's because the math is done so often
that any little change causes big headaches.

: Expecting to be able to naively subtract timestamps to compute an
: accurate interval reminds me of expecting to be able to naively stuff
: pointers into integer datatypes and have nothing ever go wrong.

Well, the kernel doesn't expect to be able to do that.  Internally,
all the FreeBSD kernel does is time based on a monotonically
increasing second count since boot.  When time is returned, it is
adjusted to the right wall time.  The kernel only worries about leap
seconds when time is incremented, since the ntpd portion in the kernel
needs to return special things during the leap second.  If there were
no leapseconds, then even that computation could be eliminated.  One
might think that one could 'defer' this work to gettimeofday and
friends, but that turns out to not be possible (or at least it is much
more inefficient to do it there).

Since the interface to the kernel is time_t, there's really no chance
for the kernel to do anything smarter with leapseconds.  gettimeofday,
time and clock_gettime all return a time_t in different flavors.

In short, you are taking things out of context and drawing the wrong
conclusion about what is done.  It is these complications, which I've
had to deal with over the past 7 years, that have led me to the
understanding of the complications.  Especially the 'non-uniform radix
crap' that's in UTC.  It really does complicate things in a number of
places that you wouldn't think.  To dismissively suggest it is only a
problem when subtracting two numbers to get an interval time is to
completely misunderstand the complications that leapseconds introduce
into systems and the unexpected places where they pop up.  Really, it
is a lot more complicated than just the 'simple' case you've latched
onto.

: A
: strongly typed language might even overload the subtraction of UTC
: typed variables with the correct time-of-day to interval
: calculations.

Kernels aren't written in these languages.  To base one's arguments
about what the right type for time is that is predicated on these
languages is a non-starter.

: But then, what should one expect the subtraction of
: Earth orientation values to return but some sort of angle, not an
: interval?

These are a specialized thing that kernels don't care about.

Warner
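
The general shape Warner describes (a monotonic count since boot, with the wall-clock adjustment applied only when time is handed out) can be sketched in a few lines of Python. This is only the outline of the idea, not FreeBSD's implementation; the WallClock class and its method names are hypothetical.

```python
class WallClock:
    """Wall time derived from a monotonic uptime plus an adjustable offset."""

    def __init__(self, uptime_source, wall_at_boot):
        self._uptime = uptime_source  # callable returning seconds since boot
        self._offset = wall_at_boot   # wall-clock time at uptime zero

    def uptime(self):
        """Monotonic count; safe to subtract for intervals."""
        return self._uptime()

    def wall_time(self):
        """Adjusted to wall time only at the moment it is returned."""
        return self._uptime() + self._offset

    def step(self, delta):
        """Apply a correction (for example -1 for an inserted leap second)."""
        self._offset += delta


# A fake uptime source keeps the demonstration deterministic.
fake_uptime = [0.0]
clock = WallClock(lambda: fake_uptime[0], wall_at_boot=1168041600.0)

fake_uptime[0] = 100.0
print(clock.uptime(), clock.wall_time())

clock.step(-1.0)  # wall time steps back, uptime is untouched
print(clock.uptime(), clock.wall_time())
```

In real use the uptime source could be time.monotonic. The point of the split is that interval arithmetic inside the kernel never sees the step.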


Re: Introduction of long term scheduling

2007-01-06 Thread Poul-Henning Kamp
In message [EMAIL PROTECTED], Rob Seaman writes:
Warner Losh wrote:

 leap seconds break that rule if one does things in UTC such that
 the naive math just works

POSIX time handling just sucks for no good reason.

I've said it before, and I'll say it again:

There are two problems:

1. We get too short notice about leap-seconds.

2. POSIX and other standards cannot invent their UTC timescales.

These two problems can be solved according to two plans:

A. Abolish leap seconds.

B. i) Issue leapseconds with at least twenty times longer notice.
   ii) Amend POSIX and/or ISO-C
   iii) Amend NTP
   iv) Amend NTP
   v) Convince all operating systems to adopt the new API
   vi) Fix all the bugs in their implementations
   vii) Fix up all the relevant application code
   viii) Fix all the tacit assumptions about time_t.

I will fully agree, that while taking the much easier approach of
plan A, will vindicate the potheads who wrote the time_t definition,
and thus deprive us of a very satisfactory intellectual reward of
striking their handiwork from the standards, it would cost only a
fraction of plan B.


Poul-Henning

--
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Introduction of long term scheduling

2007-01-06 Thread Steve Allen
On Sat 2007-01-06T19:36:19 +, Poul-Henning Kamp hath writ:
 There are two problems:

 1. We get too short notice about leap-seconds.

 2. POSIX and other standards cannot invent their UTC timescales.

This is not fair, for there is a more fundamental problem:

No two clocks can ever stay in agreement.

And the question that POSIX time_t does not answer is:

What do you want to do about that?

In some applications, especially the one for which it was designed,
there is nothing wrong with POSIX time_t.  POSIX is just fine to
describe a clock which is manually reset as necessary to stay within
tolerance.

There are now other applications.
For some of those POSIX cannot do the job -- with or without leap seconds.

Yes, there is a cost of doing time right, and leap seconds are not to
blame for that cost.  They are a wake up call from the state of denial.

--
Steve Allen               [EMAIL PROTECTED]                WGS-84 (GPS)
UCO/Lick Observatory      Natural Sciences II, Room 165    Lat  +36.99858
University of California  Voice: +1 831 459 3046           Lng -122.06014
Santa Cruz, CA 95064      http://www.ucolick.org/~sla/     Hgt +250 m


Re: Introduction of long term scheduling

2007-01-06 Thread Poul-Henning Kamp
In message [EMAIL PROTECTED], Steve Allen writes:
On Sat 2007-01-06T19:36:19 +, Poul-Henning Kamp hath writ:
 There are two problems:

 1. We get too short notice about leap-seconds.

 2. POSIX and other standards cannot invent their UTC timescales.

This is not fair, for there is a more fundamental problem:

Yes, this is perfectly fair, this is all the problems there are.

And furthermore, the two plans I outlined represent the only
two kinds of plans there are for solving this.

They can be varied for various sundry and unsundry purposes, such
as the leap-hour fig-leaf and similar, but there are only
two classes of solutions.

No two clocks can ever stay in agreement.

This is not relevant.  It's not a matter of clock precision or
clock stability.  It's only a matter of how they count.

Yes, there is a cost of doing time right, and leap seconds are not to
blame for that cost.  They are a wake up call from the state of denial.

Now, it can be equally argued, that leap seconds implement a state
of denial with respect to a particular lump of rock's ability as a
timekeeper, so I suggest we keep that part of the discussion closed
for now.

--
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Introduction of long term scheduling

2007-01-06 Thread Ashley Yakeley

On Jan 6, 2007, at 11:36, Poul-Henning Kamp wrote:


B. i) Issue leapseconds with at least twenty times longer
notice.


This plan might not be so good from a software engineering point of
view. Inevitably software authors would hard-code the known table,
and then the software would fail ten years later with the first
unexpected leap second.

At least with the present system, programmers are (more) forced to
face the reality of the unpredictability of the time-scale.

--
Ashley Yakeley


Re: Introduction of long term scheduling

2007-01-06 Thread Tony Finch
On Sat, 6 Jan 2007, Steve Allen wrote:

 No two clocks can ever stay in agreement.

I don't think that statement is useful. Most people have a concept of
accuracy within certain tolerances, dependent on the quality of the clock
and its discipline mechanisms. For most purposes a computer's clock can be
kept correct with more than enough accuracy, and certainly enough accuracy
that leap seconds are noticeable.

Tony.
--
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
HEBRIDES BAILEY FAIR ISLE FAEROES: SOUTHWEST 6 TO GALE 8. VERY ROUGH OR HIGH.
RAIN OR SHOWERS. MODERATE OR GOOD.


Re: Introduction of long term scheduling

2007-01-06 Thread Ashley Yakeley

On Jan 6, 2007, at 13:47, Poul-Henning Kamp wrote:


In message [EMAIL PROTECTED],
Ashley Yakeley
writes:

On Jan 6, 2007, at 11:36, Poul-Henning Kamp wrote:


B. i) Issue leapseconds with at least twenty times longer
notice.


This plan might not be so good from a software engineering point of
view. Inevitably software authors would hard-code the known table,
and then the software would fail ten years later with the first
unexpected leap second.


Ten years later is a heck of a lot more acceptable than 7 months
later.


Not necessarily. After seven months, or even after two years, there's
a better chance that the product is still in active maintenance.
Better to find that particular bug early, if someone's been so
foolish as to hard-code a leap-second table. The bug here, by the
way, is not that one particular leap second table is wrong. It's the
assumption that any fixed table can ever be correct.

If you were to make that assumption in your code, then your product
would be defective if it's ever used ten years from now (under your
plan B). Programs in general tend to be used for a while. Is any of
your software from 1996 or before still in use? I should hope so.

Under the present system, however, it's a lot more obvious that a
hard-coded leap second table is a bad idea.

--
Ashley Yakeley
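
One way to avoid the hard-coded-table bug Ashley describes is to treat the table as data with an explicit expiry date and to refuse to answer beyond it. A Python sketch follows. The text format, the LeapTable class and the expiry date in SAMPLE are all invented for illustration; only the two leap-second dates (end of 1998 and end of 2005) are real.

```python
import datetime

SAMPLE = """\
# hypothetical format: an expiry line plus one line per leap-second day
expires 2007-12-31
leap 1998-12-31
leap 2005-12-31
"""


class TableExpired(Exception):
    """Raised when the table is asked about a date it cannot vouch for."""


class LeapTable:
    def __init__(self, text):
        self.leap_days = []
        self.expires = None
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()
            if not line:
                continue
            kind, value = line.split()
            day = datetime.date.fromisoformat(value)
            if kind == "expires":
                self.expires = day
            elif kind == "leap":
                self.leap_days.append(day)
            else:
                raise ValueError("unknown line type: " + kind)
        if self.expires is None:
            raise ValueError("table has no expiry line")

    def leaps_before(self, day):
        """Leap seconds inserted before the start of the given UTC day."""
        if day > self.expires:
            raise TableExpired("table is only valid through %s" % self.expires)
        return sum(1 for leap_day in self.leap_days if leap_day < day)


table = LeapTable(SAMPLE)
print(table.leaps_before(datetime.date(2006, 1, 1)))
try:
    table.leaps_before(datetime.date(2008, 1, 1))
except TableExpired as exc:
    print("refused:", exc)
```

With this shape, stale data fails loudly at the expiry date instead of silently producing a time that is one second off.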


Re: Introduction of long term scheduling

2007-01-06 Thread Poul-Henning Kamp
In message [EMAIL PROTECTED], Ashley Yakeley
writes:

Not necessarily. After seven months, or even after two years, there's
a better chance that the product is still in active maintenance.
Better to find that particular bug early, if someone's been so
foolish as to hard-code a leap-second table. The bug here, by the
way, is not that one particular leap second table is wrong. It's the
assumption that any fixed table can ever be correct.

So you think it is appropriate to demand that every computer with a
clock should suffer biannual software upgrades if it is not connected
to a network where it can get NTP or similar service?

I know people who will disagree with you:

Air traffic control
Train control
Hospitals

and the list goes on.

6 months is simply not an acceptable warning to get, end of story.

--
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Introduction of long term scheduling

2007-01-06 Thread Ashley Yakeley

On Jan 6, 2007, at 14:43, Poul-Henning Kamp wrote:


So you think it is appropriate to demand that every computer with a
clock should suffer biannual software upgrades if it is not connected
to a network where it can get NTP or similar service?


Since that's the consequence of hard-coding a leap-second table,
that's exactly what I'm not proposing. Instead, they should suffer
biannual updates to their leap-second table. Doing this is an
engineering problem, but a known one.

Under your plan B, however, we'd have plenty of software that just
wouldn't get upgraded at all, but would simply fail after ten years.
That strikes me as worse.


I know people who will disagree with you:


I don't think you're serious.


Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe


Don't forget  | one second off since 2018. :-)

--
Ashley Yakeley


Re: Introduction of long term scheduling

2007-01-06 Thread Ashley Yakeley

On Jan 6, 2007, at 16:18, M. Warner Losh wrote:


Unfortunately, the kernel has to have a notion of time stepping around
a leap-second if it implements ntp.  There's no way around that that
isn't horribly expensive or difficult to code.  The reasons for the
kernel's need to know have been enumerated elsewhere...


Presumably it only needs to know the next leap-second to do this, not
the whole known table?

--
Ashley Yakeley


Re: Introduction of long term scheduling

2007-01-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Ashley Yakeley [EMAIL PROTECTED] writes:
: On Jan 6, 2007, at 16:18, M. Warner Losh wrote:
:
:  Unfortunately, the kernel has to have a notion of time stepping around
:  a leap-second if it implements ntp.  There's no way around that that
:  isn't horribly expensive or difficult to code.  The reasons for the
:  kernel's need to know have been enumerated elsewhere...
:
: Presumably it only needs to know the next leap-second to do this, not
: the whole known table?

Yes.  ntpd or another agent tells it when leap seconds are coming.  It
doesn't need a table.  Then again, none of the broadcast time services
provide a table...

Warner
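
A toy Python model of a clock that knows only about the next leap second, as discussed above. It is loosely modelled on the common approach of reporting 23:59:59 twice when a second is inserted; the SteppingClock class and its methods are hypothetical and do not reproduce any kernel's code.

```python
import calendar


class SteppingClock:
    """Toy second counter that can insert one pending leap second."""

    def __init__(self, posix):
        self.posix = posix
        self.pending_midnight = None  # POSIX value of the midnight after the leap

    def arm_insert(self, midnight):
        """Called by the time daemon: insert a second just before midnight."""
        self.pending_midnight = midnight

    def tick(self):
        """Advance by one elapsed second; return (posix, in_leap_second)."""
        # This comparison runs on every increment, pending or not.
        if self.pending_midnight == self.posix + 1:
            # Hold the count for one extra second: 23:59:59 is reported twice.
            self.pending_midnight = None
            return self.posix, True
        self.posix += 1
        return self.posix, False


midnight = calendar.timegm((2006, 1, 1, 0, 0, 0))
clock = SteppingClock(midnight - 2)
clock.arm_insert(midnight)
ticks = [clock.tick() for _ in range(3)]
print(ticks)
```

The three ticks report the last second of the day, the same value again flagged as the leap second, and then midnight. No table is needed, but the test for a pending leap sits on the increment path.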


Re: Introduction of long term scheduling

2007-01-06 Thread Rob Seaman

Warner Losh wrote:


Anything that makes the math
harder (more computationally expensive) can have huge effects on
performance in these areas.  That's because the math is done so often
that any little change causes big headaches.


Every IP packet has a 1's complement checksum.  (That not all
switches handle these properly is a different issue.)  Calculating a
checksum is about as expensive as (or more so than) subtracting
timestamps the right way.  I have a hard time believing that
epoch-interval conversions have to be performed more often than IP
packets are assembled.  One imagines (would love to be pointed to
actual literature regarding such issues) that most computer time
handling devolves to requirements for relative intervals and epochs,
not to stepping outside to any external clock at all.  Certainly the
hardware clocking of signals is an issue entirely separate from what
we've been discussing as timekeeping and traceability.  (And note
that astronomers face much more rigorous requirements in a number of
ways when clocking out their CCDs.)


Well, the kernel doesn't expect to be able to do that.  Internally,
all the FreeBSD kernel does is time based on a monotonically
increasing second count since boot.  When time is returned, it is
adjusted to the right wall time.


Well, no - the point is that only some limp attempt is made to adjust
to the right time.


The kernel only worries about leap
seconds when time is incremented, since the ntpd portion in the kernel
needs to return special things during the leap second.  If there were
no leapseconds, then even that computation could be eliminated.  One
might think that one could 'defer' this work to gettimeofday and
friends, but that turns out to not be possible (or at least it is much
more inefficient to do it there).


One might imagine that an interface could be devised that would only
carry the burden for a leap second when a leap second is actually
pending.  Then it could be handled like any other rare phenomenon
that has to be dealt with correctly - like context switching or
swapping.


Really, it is a lot more complicated than just the 'simple' case
you've latched onto.


Ditto for Earth orientation and its relation to civil timekeeping.
I'm happy to admit that getting it right at the CPU level is
complex.  Shouldn't we be focusing on that, rather than on
eviscerating mean solar time?  In general, either side here would
have a better chance of convincing the other if actual proposals,
planning, research, requirements, and so forth, were discussed.  The
only proposal on the table - and the only one I spend every single
message trying to shoot down - is the absolutely ridiculous leap hour
proposal.  We're not defending leap seconds per se - we're defending
mean solar time.

A proposal to actually address the intrinsic complications of
timekeeping is more likely to be received warmly than is a kludge or
partial workaround.  I suspect it would be a lot more fun, too.


Kernels aren't written in these languages.  To base one's arguments
about what the right type for time is that is predicated on these
languages is a non-starter.


No, but the kernels can implement support for these types and the
applications can code to them in whatever language.  Again - there is
a hell of a lot more complicated stuff going on under the hood than
what would be required to implement a proper model of timekeeping.

Rob


Re: Introduction of long term scheduling

2007-01-06 Thread Tony Finch
On Sat, 6 Jan 2007, Ashley Yakeley wrote:

 Presumably it only needs to know the next leap-second to do this, not
 the whole known table?

Kernels sometimes need to deal with historical timestamps (principally
from the filesystem) so it'll need a full table to be able to convert
between POSIX time and atomic time for compatibility purposes.

Tony.
--
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
SHANNON ROCKALL MALIN: MAINLY WEST OR SOUTHWEST 6 TO GALE 8, OCCASIONALLY
SEVERE GALE 9. VERY ROUGH OR HIGH. RAIN OR SHOWERS. MODERATE OR GOOD.


Re: Introduction of long term scheduling

2007-01-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Rob Seaman [EMAIL PROTECTED] writes:
: Warner Losh wrote:
:  Anything that makes the math
:  harder (more computationally expensive) can have huge effects on
:  performance in these areas.  That's because the math is done so often
:  that any little change causes big headaches.
:
: Every IP packet has a 1's complement checksum.  (That not all
: switches handle these properly is a different issue.)

Actually, an IP packet as a whole does not have a 1's complement
checksum.  Sure, there is a trivial one that covers the 20 bytes of
header, but that's it.  Most systems these days offload checksumming to
the hardware anyway to increase the throughput.  Maybe you are thinking
of TCP or UDP :-).  Often, the packets are copied and therefore in the
cache, so
the addition operations are very cheap.
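
For readers who have not seen it, the header checksum under discussion is a 16-bit one's complement sum over the header words, which needs only additions and a final fold. A Python sketch follows; the function name and the sample header bytes are illustrative values, not taken from this thread.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be even")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# A 20-byte sample header with its checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
checksum = ipv4_header_checksum(header)
print(hex(checksum))  # 0xb861

# With the checksum filled in, the same computation yields zero.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(ipv4_header_checksum(filled))  # 0
```

There is no multiplication or division anywhere in the loop, which is the contrast Warner draws with broken-down time arithmetic.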

: Calculating a
: checksum is about as expensive as (or more so than) subtracting
: timestamps the right way.  I have a hard time believing that
: epoch-interval conversions have to be performed more often than IP
: packets are assembled.

Benchmarks do not lie.  Also, you are misunderstanding the purpose of
timestamps in the kernel.  Adding or subtracting two of them is
relatively easy.  Converting to a broken down format or doing math
with the complicated forms is much more code intensive.  Dealing with
broken down forms, and all the special cases usually involves
multiplication and division, which tend to be more computationally
expensive than the checksum.
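
To show where the multiplications and divisions come from, here is a Python sketch of a scalar to broken-down conversion, written out by hand rather than through a library call. The day-to-date arithmetic is a widely used proleptic Gregorian algorithm, not the code of any particular kernel, and the function names are made up.

```python
import time


def civil_from_days(days):
    """Days since 1970-01-01 -> (year, month, day), proleptic Gregorian."""
    z = days + 719468                 # shift the epoch to 0000-03-01
    era = z // 146097                 # 400-year cycles
    doe = z - era * 146097            # day of era, 0..146096
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153         # month index, counting from March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def broken_down(posix):
    """POSIX seconds -> (Y, M, D, h, m, s); every step is a division."""
    days, rem = divmod(posix, 86400)
    hour, rem = divmod(rem, 3600)
    minute, second = divmod(rem, 60)
    return civil_from_days(days) + (hour, minute, second)


print(broken_down(0))  # (1970, 1, 1, 0, 0, 0)

# Subtracting two scalar timestamps, by contrast, is a single operation.
sample = 1168111000
print(broken_down(sample) == tuple(time.gmtime(sample))[:6])
```

Roughly a dozen divisions and multiplications are needed per conversion, against one subtraction for a scalar interval.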

: One imagines (would love to be pointed to
: actual literature regarding such issues) that most computer time
: handling devolves to requirements for relative intervals and epochs,
: not to stepping outside to any external clock at all.  Certainly the
: hardware clocking of signals is an issue entirely separate from what
: we've been discussing as timekeeping and traceability.  (And note
: that astronomers face much more rigorous requirements in a number of
: ways when clocking out their CCDs.)

Having actually participated in the benchmarks that showed the effects
of inefficient timekeeping, I can say that they have a measurable
effect.  I'll try to find references that the benchmarks generated.

:  Well, the kernel doesn't expect to be able to do that.  Internally,
:  all the FreeBSD kernel does is time based on a monotonically
:  increasing second count since boot.  When time is returned, it is
:  adjusted to the right wall time.
:
: Well, no - the point is that only some limp attempt is made to adjust
: to the right time.

If by "some limp attempt" you mean "returns the correct time", then you
are correct.

:  The kernel only worries about leap
:  seconds when time is incremented, since the ntpd portion in the kernel
:  needs to return special things during the leap second.  If there were
:  no leapseconds, then even that computation could be eliminated.  One
:  might think that one could 'defer' this work to gettimeofday and
:  friends, but that turns out to not be possible (or at least it is much
:  more inefficient to do it there).
:
: One might imagine that an interface could be devised that would only
: carry the burden for a leap second when a leap second is actually
: pending.  Then it could be handled like any other rare phenomenon
: that has to be dealt with correctly - like context switching or
: swapping.

You'd think that, but you have to test to see if something was
pending.  And the code actually does that.

:  Really, it is a lot more complicated than just the 'simple' case
:  you've latched onto.
:
: Ditto for Earth orientation and its relation to civil timekeeping.
: I'm happy to admit that getting it right at the CPU level is
: complex.  Shouldn't we be focusing on that, rather than on
: eviscerating mean solar time?

Did I say anything about eviscerating mean solar time?

: A proposal to actually address the intrinsic complications of
: timekeeping is more likely to be received warmly than is a kludge or
: partial workaround.  I suspect it would be a lot more fun, too.

I'm just suggesting that some of the suggested ideas have real
performance issues that mean they wouldn't even be considered as
viable options.

:  Kernels aren't written in these languages.  To base one's arguments
:  about what the right type for time is that is predicated on these
:  languages is a non-starter.
:
: No, but the kernels can implement support for these types and the
: applications can code to them in whatever language.  Again - there is
: a hell of a lot more complicated stuff going on under the hood than
: what would be required to implement a proper model of timekeeping.

True, but timekeeping is one of those areas of the kernel that is
called so many times that extra overhead from making it more complex
hurts a lot more than you'd naively think.

Warner