Ah, the hazards of being in different timezones; so much goes on while
I am away.

To answer a few things brought up during this thread:

I already considered placing the daemon in DEGRADED state. Alas,
DEGRADED is not yet implemented in SMF. A pity, since it would have
been exactly what was needed.

As for letting daemons query the "quality" of the time, this is
a bit tricky. Applications that care fall into a few categories.
The first is applications that insist on monotonically increasing time
without steps.

Before the first correction we know that the clock is
unsynchronized and we expect there to be a step coming. There may or
may not be a second step coming 5 minutes down the line. Then depending
on circumstances, there may be some small steps over the next few hours.
After that, if the clock's frequency error is within 500 PPM, there
shouldn't be any more steps. If a drift file is in use and valid, then
you shouldn't see steps after the first five minutes of startup. It is
possible to eliminate the second step as well if the NTP network
topology is well thought out. Of course, it gets tricky eliminating
steps during leap seconds. Not impossible, just tricky.
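
For a rough feel of why the 500 PPM figure matters, here is a
back-of-the-envelope sketch in Python. It assumes that 500 PPM is the
fastest rate at which the clock can be slewed and that the step
threshold is the 128 ms default. The threshold value and all of the
names are assumptions made for illustration, not taken from the ntpd
source.

```python
# Assumption: maximum slew rate, matching the 500 PPM figure above.
MAX_SLEW_PPM = 500
# Assumption: default step threshold of 128 ms (not stated in this thread).
STEP_THRESHOLD_S = 0.128


def seconds_to_slew(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Wall-clock seconds needed to remove offset_s by slewing at slew_ppm."""
    return abs(offset_s) / (slew_ppm * 1e-6)


def needs_step(offset_s, threshold_s=STEP_THRESHOLD_S):
    """True if the offset is too large to slew and would be stepped instead."""
    return abs(offset_s) > threshold_s
```

Under those assumptions, an offset right at the threshold takes a little
over four minutes to slew away, so anything much larger is stepped.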

The second category of application cares about the accuracy of the
clock. How can it find this out? Both ntpd and xntpd always adjust the
clock to match their best idea of the current time. There is no way to
determine how close that is to the real time without an independent
time source. Time is always relative to something.

It is possible to get a measure of the quality of the servers in use. If
the times received from the servers are jittery and tend to disagree
with each other, then you know that the accuracy is lower, and you can
assign a numerical value to it. So what? NTP uses state-of-the-art
algorithms to condition the clock; if the servers are bad, they are not
likely to get better. You could squawk for attention, but in the
general case, once the first or second step is past, the applications
should get started.
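
To make "a numerical value" concrete, here is a minimal Python sketch of
one way to score the servers. It is not the algorithm NTP itself uses.
The function name and the input layout (a dict of per-server offset
samples in seconds) are hypothetical.

```python
from statistics import mean, pstdev


def server_quality(samples):
    """Summarize server quality from offset samples.

    samples maps a server name to a list of measured offsets in seconds.
    Returns (jitter, disagreement): jitter is the per-server standard
    deviation of its offsets, and disagreement is the spread between the
    mean offsets of the servers.
    """
    jitter = {name: pstdev(offsets) for name, offsets in samples.items()}
    means = [mean(offsets) for offsets in samples.values()]
    disagreement = max(means) - min(means)
    return jitter, disagreement
```

Large jitter or large disagreement means lower confidence in the time,
but as noted above there is not much an application can do with that
number beyond complaining.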

By the way, there is already a script included in the NTPv4 distribution
for querying the server to see if we are already past the first step.
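
A rough sketch of that kind of check, written in Python, and not the
bundled script itself. It assumes the text comes from `ntpq -c rv` and
that the status words `leap_alarm` and `sync_unspec` mark a clock that
has not yet been set. The function name is hypothetical.

```python
import re


def past_first_step(rv_text):
    """Guess from ntpq system status words whether the clock has been set.

    Assumption: rv_text is the output of `ntpq -c rv`, whose status line
    carries words such as leap_none/leap_alarm and sync_ntp/sync_unspec.
    """
    words = set(re.findall(r"\b(?:leap|sync)_\w+", rv_text))
    if "leap_alarm" in words:
        # Leap indicator in alarm state: clock not yet synchronized.
        return False
    # Any sync source other than "unspecified" counts as synchronized.
    return any(w.startswith("sync_") and w != "sync_unspec" for w in words)
```

A startup script could poll this in a loop and let dependent services
start once it returns True.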

I notice that the current ntp.xml file has these lines:

         <dependent
                 name='ntp_multi-user'
                 grouping='optional_all'
                 restart_on='none'>
                 <service_fmri value='svc:/milestone/multi-user' />
         </dependent>

This looks as if there is already some kind of milestone having to
do with NTP startup, but I may be wrong.

As for upstream changes to the NTP project code base: no worries,
I am a committer on the project. As long as the changes aren't around
the core algorithms (Dr. Mills vets all changes there), anything
reasonable should be okay. The changes to put the ntp service into
maintenance mode for offsets greater than 20 minutes are already in
there, for instance. I had to get Dr. Mills's approval for that one,
though, because it happened to land in a section of code he vets.

As far as the licensing goes, I presume that any code we ask an
upstream project to integrate is donated by the owners (Sun) and will
be under the same license as the rest of the project. Isn't that one
of the reasons for the dual ownership in the contributor agreement?
I think it is going to get pretty hard to get integration of other
projects if we don't do that.

Okay, that's all I have to say now. I have to go to the restroom. Can I
get a second for that course of action? (please hurry!!) 8-)

Nicolas Williams wrote:
> On Mon, Aug 27, 2007 at 05:10:09PM -0700, Darren.Reed at Sun.COM wrote:
>> Nicolas Williams wrote:
>>> Heck, we could do something fairly different that does not involve
>>> modifying any NTP code.  For example:
>>>
>> [...]
>>
>> Indeed, the above does need to be considered...and even
>> the architecture of the proposed solution (a daemon to
>> manage another daemon?) but...
> 
> BTW, one reason to pursue the above might be that it avoids the code
> contribution issue.  Not because we don't want to contribute it (hey, it
> will be published -- it is _Open_Solaris), but because contributing code
> changes can sometimes be very difficult.  Or because writing clean code
> is easier than modifying old code, or whatever.
> 
> First let's get the semantics straight.  What David proposes comes
> closest to what I think is right: let the apps set their own tolerance
> for time skew and don't block booting.  This means that the NTP service
> comes online immediately, boot proceeds, and things that can't stand
> unsynchronized time refuse to work, mark their services degraded, scream
> on /dev/console, or whatever is appropriate for them.  And we even get
> to represent tolerances in SMF (which is what I wanted when I said that
> I wished SMF could represent analog dependencies).
> 
> I think we'll probably converge on this.  Then we can figure out other
> stuff.
> 
> And no, having daemons watching daemons is not necessarily weird.
> That's what svc.startd does, in its own way.  There's something to be
> said for treating a complex piece of software as a black box.
> 
> Nico

-- 
blu

Screening ideas are indeed thought up by the Office for Annoying
Air Travelers and vetted through the Directorate for Confusion
and Complexity - Kip Hawley, Head of the TSA
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
