Ah, the hazards of being in different timezones; so much goes on while I am away.

To answer a few things brought up during this thread:

I already considered placing the daemon in the DEGRADED state. Alas, DEGRADED is not yet implemented in SMF. A pity, since it would have been exactly what was needed.

As for letting daemons query the "quality" of the time, this is a bit tricky. Applications that care fall into a few categories.

The first is applications that insist on monotonically increasing time without steps. Before the first correction we know that the clock is unsynchronized and we expect there to be a step coming. There may or may not be a second step coming 5 minutes down the line. Then, depending on circumstances, there may be some small steps over the next few hours. After that, if the clock is within 500 PPM then there shouldn't be any more steps. If a drift file is in use and valid, then you shouldn't see steps after the first five minutes of startup. It is possible to eliminate the second step as well if the NTP network topology is well thought out. Of course, it gets tricky eliminating steps during leap seconds. Not impossible, just tricky.

The second category of application cares about the accuracy of the clock. How can it find this out? Both ntpd and xntpd always adjust the clock to match their best idea of the current time. There is no way to determine how close that is to the real time without an independent time source. Time is always relative to something.

It is possible to get a measure of the quality of the servers in use. If the times received from the servers are jittery and tend to differ from one another, then you know that the accuracy is lower, and you can assign a numerical value to it. So what? NTP uses state-of-the-art algorithms to condition the clock; if the servers are bad, then they are not likely to get better. You could squawk for attention, but in the general case, once the first or second step is past, the applications should get started.
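
To make the "numerical value" idea a little more concrete, here is a rough sketch, not anything ntpd, xntpd, or SMF provides. It assumes you have already collected a per-server offset and jitter in milliseconds (for example, by hand from the offset and jitter columns that NTPv4's `ntpq -p` prints). The function names and the thresholds are hypothetical, picked only for illustration.

```python
from statistics import mean


def time_quality(samples):
    """Return (spread, mean_jitter) in the same units as the input.

    samples: list of (offset, jitter) pairs, one pair per server.
    This is an illustrative metric, not the one ntpd computes internally.
    """
    if not samples:
        raise ValueError("need at least one server sample")
    offsets = [offset for offset, _ in samples]
    jitters = [jitter for _, jitter in samples]
    # How far apart the servers' ideas of the time are from each other.
    spread = max(offsets) - min(offsets)
    return spread, mean(jitters)


def is_suspect(samples, max_spread=50.0, max_jitter=10.0):
    """True if the servers disagree or wobble more than the caller tolerates.

    The default thresholds are arbitrary placeholders; each application
    would have to choose its own tolerance.
    """
    spread, jitter = time_quality(samples)
    return spread > max_spread or jitter > max_jitter
```

Note that this only says how well the servers agree with each other and with themselves. It still says nothing about how close any of them is to the real time.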

By the way, there is already a script included in the NTPv4 distribution for querying the server to see if we are already past the first step.

I notice that the current ntp.xml file has these lines:

    <dependent name='ntp_multi-user' grouping='optional_all' restart_on='none'>
        <service_fmri value='svc:/milestone/multi-user' />
    </dependent>

These suggest that there is already some kind of milestone having to do with NTP startup, but I may be wrong.

As for upstream changes to the NTP project code base: no worries, I am a committer on the project. As long as the changes aren't around the core algorithms (Dr. Mills vets all changes there), anything reasonable should be okay. The changes to put the ntp service into maintenance mode for offsets greater than 20 minutes are already in there, for instance. I had to get Dr. Mills' approval for that one though, because they happened to land in a section of code he vets.

As far as the licensing goes, I presume that any code we request an upstream project integrate is donated by the owners (Sun) and will be under the same license as the rest of the project. Isn't that one of the reasons for the dual ownership in the contributor agreement? I think it is going to get pretty hard to get integration of other projects if we don't do that.

Okay, that's all I have to say now. I have to go to the restroom. Can I get a second for that course of action? (please hurry!!) 8-)

Nicolas Williams wrote:
> On Mon, Aug 27, 2007 at 05:10:09PM -0700, Darren.Reed at Sun.COM wrote:
>> Nicolas Williams wrote:
>>> Heck, we could do something fairly different that does not involve
>>> modifying any NTP code. For example:
>>>
>> [...]
>>
>> Indeed, the above does need to be considered...and even
>> the architecture of the proposed solution (a daemon to
>> manage another daemon?) but...
>
> BTW, one reason to pursue the above might be that it avoids the code
> contribution issue.
> Not because we don't want to contribute it (hey, it
> will be published -- it is _Open_Solaris), but because contributing code
> changes can sometimes be very difficult. Or because writing clean code
> is easier than modifying old code, or whatever.
>
> First let's get the semantics straight. What David proposes comes
> closest to what I think is right: let the apps set their own tolerance
> for time skew and don't block booting. This means that the NTP service
> comes online immediately, boot proceeds, and things that can't stand
> unsynchronized time refuse to work, mark their services degraded, scream
> on /dev/console, or whatever is appropriate for them. And we even get
> to represent tolerances in SMF (which is what I wanted when I said that
> I wished SMF could represent analog dependencies).
>
> I think we'll probably converge on this. Then we can figure out other
> stuff.
>
> And no, having daemons watching daemons is not necessarily weird.
> That's what svc.startd does, in its own way. There's something to be
> said for treating a complex piece of software as a black box.
>
> Nico

--
blu

Screening ideas are indeed thought up by the Office for Annoying Air
Travelers and vetted through the Directorate for Confusion and
Complexity - Kip Hawley, Head of the TSA
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom