Re: [ntp:questions] New 60 KHz WWVB Time Format

2013-01-17 Thread Magnus Danielson

Hi Tom,

On 01/16/2013 03:36 PM, Thomas Laus wrote:

I have not seen this information posted to this newsgroup.  The US
NIST radio station WWVB will be changing its transmission format.  The
information can be found at:

http://www.nist.gov/pml/div688/grp40/wwvb.cfm

The old format is still being sent twice a day until the end of
January 2013, but the station will only transmit the new phase
modulated time code after this month.  It is supposed to be compatible
with the existing 'Atomic' clocks, but I have some of the original
ones that were made in China that are no longer syncing.


The new WWVB format has been covered in several lengthy threads on the 
time-nuts email list during the last half-year or so. Look in the archives.


Some of the high-precision time and frequency receivers will require 
modifications to handle the new format. Cheap receivers will keep working.


Cheers,
Magnus


Re: [ntp:questions] NIST vs. pool.ntp.org ?

2013-03-28 Thread Magnus Danielson

On 03/27/2013 10:45 PM, David Woolley wrote:

Robert Scott wrote:

I am confused about the proper usage of pool.ntp.org and NIST.
pool.ntp.org seems to be a collection of private sector time servers
offered for all to use, but with registration expected for regular


The pool system has no provision for enforcing registration. It wouldn't
make sense to hand out a random server address if most of them then
refused to serve you because you hadn't registered.


users. And NIST has a government-run set of time servers. Neither
group (NIST or pool.ntp.org) seems to include or reference the other.


I would hope all the pool servers ultimately reference their national
equivalent of NIST and therefore what becomes, after the fact, UTC.

I think you will find that Navstar (GPS) and WWV times are traceable to
NIST.


Yes and no.

GPS is traceable to USNO. USNO and NIST have traceability between each 
other within the BIPM framework.



MSF times are traceable to NPL.


NPL is traceable to both USNO and NIST within the BIPM framework.


Are they in competition? Who normally uses the NIST servers and who
uses pool.ntp.org?


The open NIST servers are heavily overloaded, so probably don't serve
the highest quality time, but they are likely to be around for a long time.


I would set up a local server under your control. It will help from both a 
debugging and a noise perspective.


Cheers,
Magnus


Re: [ntp:questions] NIST vs. pool.ntp.org ?

2013-03-28 Thread Magnus Danielson

Hi again Robert,

On 03/28/2013 04:22 AM, Robert Scott wrote:

On Thu, 28 Mar 2013 02:50:17 GMT, unruhun...@invalid.ca  wrote:
You really should read my posts before responding.  No, I do not
intend to hard-code NIST or any other server.  I never said I wanted
to.  No, the app is not intended for all musicians.  It is intended
for professional piano tuners only.  I sell about one per day.  And I
never said the pool would not be good enough for my needs.  I only
asked about the relative benefits of the pool vs. NIST, which E-mail
sent...Blacklists answered very nicely.


There is no real benefit in using either exclusively; rather, you should use a 
mix of servers which gives you good confidence in removing false-tickers as 
well as good precision due to short network distances.


Look at the NTP code and book, as many of the filtering steps aim at 
removing noise which pollutes the time and frequency errors. Then do the 
two-way time-transfer.
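
To illustrate the kind of filtering step meant here, a minimal sketch (my
own illustration, not ntpd's actual code; the names are hypothetical): among
recent (offset, delay) samples, prefer the one with the smallest round-trip
delay, since its offset is the least polluted by queuing noise.

def best_sample(samples):
    """samples: list of (offset_seconds, delay_seconds) tuples."""
    # The low-delay exchange is the one least disturbed by queuing noise.
    return min(samples, key=lambda s: s[1])

print(best_sample([(0.012, 0.080), (0.003, 0.021), (0.009, 0.055)]))
# -> (0.003, 0.021)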


Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-11 Thread Magnus Danielson
Hi again,

On 08/11/2013 03:36 PM, Magnus Danielson wrote:
 Hi David,

 On 08/11/2013 08:44 AM, David Taylor wrote:
 Today is the start of a new GPS 1024 week epoch - see:

   http://adn.agi.com/GNSSWeb/

 Folks with really old GPS units are reporting problems, those of us
 with current millennium GPS receivers should be OK, though.
 I would word it differently, it's the epoch of a particular line of GPS
 receivers, but not of GPS itself.

 Remember that any Sunday, it is likely that a GPS receiver has slipped
 a multiple of 1024 weeks. NTP drivers should be able to recognize it and
 compensate for it, as it is a recurring bug in many receivers.

 This issue has been discussed over and over again at time-nuts.
I forgot to mention that we have already seen HP/Agilent receivers of
the Z3805A and Z3815A generation affected, and that Furuno (maker of the
GPS module in them) has issued a statement relating to this
wrap-around. See the time-nuts list for details.

Any week is a potential one for older receivers, but this one seems like a
real one.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-14 Thread Magnus Danielson
Hi,

On 08/14/2013 03:54 PM, unruh wrote:
 On 2013-08-14, Mark C. Stephens ma...@non-stop.com.au wrote:
 Um Let's see, Datum was bought by Austron, who was bought by ... etc.
 For collectors such as myself, having this 'mature' equipment still working 
 is great.

 Looking at Mr Malone's code, he added 2 lines which enabled NTPD 
 compatibility with GPS receivers that would have long ago have been sent to 
 the TIP as waste.
 It is however fragile code. Ie, all kinds of situations could arise in
 which it would give the wrong time. Now, you may say that there are
 situations in which it will give the right time when, without the kludge,
 it would give the wrong time.
This addresses a known feature of the GPS system, common over a large
range of receivers. The differences between them lie in which GPS week
they flip over (GPS week 500, 512 and 729 off the top of my head). The
failure they have is not in their operation, but in their production of
a human-readable date.

This is what I have proposed elsewhere (on time-nuts) and it is a sound
solution considering the situation we have, where ICD-GPS-200 through
its many revisions has not provided additional bits for the L1 C/A
code signal. For the L2C (and I assume also L1C, but I haven't checked
yet) signal additional bits exist, but very few receivers have that
support.

I recommend reading the time-nuts backlog on this issue.

Among the alternatives you have are ditching an otherwise perfectly
operating GPS receiver, or using the fact that the 1024-week wrap-around is
bound to happen, is predictable as a systematic effect from how the GPS
C/A data is structured, and recurs over the fleet of GPS receivers.

Do note that the GPS receivers do compute leap-second info correctly
regardless of this 1024-week offset hiccup, as that information is structured
modulo 1024 weeks.
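
As a minimal sketch of such a correction (my own illustration, not the
actual NTPD patch; the function and argument names are hypothetical), a
truncated 10-bit week number can be resolved against any approximate date
known to be good to within +/-512 weeks, for example a software build date:

from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)

def resolve_week(truncated_week, approx_now):
    """Return the full GPS week number closest to approx_now."""
    approx_week = (approx_now - GPS_EPOCH) // timedelta(weeks=1)
    # Pick the 1024-week cycle that lands nearest the approximate date.
    cycle = round((approx_week - truncated_week) / 1024)
    return truncated_week + 1024 * cycle

# Example: truncated week 729 around August 2013 resolves to full week 1753.
print(resolve_week(729, datetime(2013, 8, 11, tzinfo=timezone.utc)))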

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-15 Thread Magnus Danielson
On 08/15/2013 07:55 AM, David Taylor wrote:
 On 14/08/2013 22:07, Harlan Stenn wrote:
 David Malone writes:
 Indeed - you need to have a timestamp within about ten years of
 correct before you start up, otherwise the problem will be worse.  Ntp
 has the same problem in figuring out the ntp epoch, though we've yet
 to see an ntp timestamp wrap around.

 ntp-dev has a fix for this problem - while the original solution was
 make sure the clock is correct to within ~65 years' time the new code
 uses a date of compile value, and needs the system time to be either
 10 years' before that date or up to 128 years' after that date.

 See http://bugs.ntp.org/show_bug.cgi?id=1995 for more information
 (thanks, Juergen!).

 H

 If you make that 9.5 years rather than 10 it might then cover the
 500-week period mentioned by Magnus.
I do not mention a 500-week period. I mention a 1024-week period with
various phases, 500, 512 and obviously 729 (wrapped this Sunday as we
went into week 1753, and 1753 = 729 + 1024).

 Judging by some reports here, people may be using NTP more than 10
 years old.  Does this fix cause a problem in that case?
Not really. This problem is common mode between recent and 10-year-old NTPs.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-15 Thread Magnus Danielson
On 08/15/2013 10:22 AM, David Taylor wrote:
 On 15/08/2013 08:34, Rob wrote:
 David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
 On 14/08/2013 17:44, Rob wrote:
 []
 How does a good receiver know the correct time?  Does it rely on
 local (backed-up) storage, or is there some way of receiving it via
 the almanac?  Or are good receivers hardwired as well, only with
 a different valid span?

 I would not be surprised when good receivers turn out to have just
 a different moment or mode of failure.
 []

 Some receivers have battery backup, in fact all but one of the receiver
 types I use have this.

 Ok but what happens when the battery is replaced?
 []

 Hope and pray?  Wish for a large capacitor or flash-rom?

 I had thought that either ephemeris or almanac data might contain the
 real UTC time, but apparently it does not.  Obviously a system
 designed too far in advance of the Year2000 fuss and bother!
They completely avoid it by not numbering it that way. They have their
own numbering scheme that fits the system, and the conversion over to
UTC is an added feature. It's all in ICD-GPS-200 for the current set of
details, and in the ION red book series for the early stages.

GPS and GPS problems are best understood if you realize that everything
is counted in the GPS clock machinery with its own set of gears.
Conversion isn't that hard and it is done every second in the GPS receiver.
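
As a minimal sketch of those gears (my own illustration under the usual
definitions, not text from the ICD; gps_to_utc and its arguments are
hypothetical names), the conversion amounts to counting weeks and seconds
of week from the GPS epoch and applying the broadcast GPS-UTC offset:

from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)

def gps_to_utc(week, tow_seconds, gps_utc_offset):
    """Convert a full GPS week number and time-of-week to UTC.

    gps_utc_offset is the leap-second count carried in the navigation
    message (16 s in 2013).
    """
    gps_time = GPS_EPOCH + timedelta(weeks=week, seconds=tow_seconds)
    return gps_time - timedelta(seconds=gps_utc_offset)

print(gps_to_utc(1753, 0, 16))  # 2013-08-10 23:59:44+00:00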

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-16 Thread Magnus Danielson
On 08/16/2013 05:44 AM, David Taylor wrote:
 On 15/08/2013 21:33, Magnus Danielson wrote:
 []
 They completely avoid it by not numbering it that way. They have their
 own numbering scheme that fit's the system, and the conversion over to
 UTC is an added feature. It's all in ICD-GPS-200 for the current set of
 details, and in the ION red book series for the early stages.

 GPS and GPS problems is best understood if you realize that everything
 is counted in the GPS clock machinery with it's own set of gears.
 Conversion isn't that hard and it is done every second in the GPS
 receiver.

 Cheers,
 Magnus

 Thanks, Magnus.  I've not heard of ICD-GPS-2000 or ION red book
 before.  Perhaps one day I will look them up.
If you go here:
http://www.gps.gov/technical/icwg/

you will find IS-GPS-200G (which has been the new name since 2006; I have
failed to adapt) at this link:
http://www.gps.gov/technical/icwg/IS-GPS-200G.pdf

Using ICD-GPS-200D gives a fair idea of what the older GPS receivers were
designed to meet.

In these documents, the gears of GPS are explained such that you should
be able to implement a correctly working receiver (in principle). There
are a handful of technical details outside of this spec you need to
figure out too, but there are good books for that.
 GPS continues to impress me - I counted and on holiday recently we
 took (at least) 7 GPS receivers - his and hers smart-phones, 2 iPads,
 Garmin GPS 60 CSx, Ventus 750, and one built into my Sony HX200V
 camera!  The Garmin spent much of its time with a puck antenna stuck
 on the cabin porthole plotting our course.
They have gone small now, but you still have L1 C/A-only receivers. Many
of them probably do not use carrier phase in any way.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-16 Thread Magnus Danielson
On 08/15/2013 11:02 PM, unruh wrote:
 On 2013-08-15, Magnus Danielson mag...@rubidium.dyndns.org wrote:
 On 08/15/2013 10:22 AM, David Taylor wrote:
 On 15/08/2013 08:34, Rob wrote:
 David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
 On 14/08/2013 17:44, Rob wrote:
 []
 How does a good receiver know the correct time?  Does it rely on
 local (backed-up) storage, or is there some way of receiving it via
 the almanac?  Or are good receivers hardwired as well, only with
 a different valid span?

 I would not be surprised when good receivers turn out to have just
 a different moment or mode of failure.
 []

 Some receivers have battery backup, in fact all but one of the receiver
 types I use have this.
 Ok but what happens when the battery is replaced?
 []

 Hope and pray?  Wish for a large capacitor or flash-rom?

 I had thought that either ephemeris or almanac data might contain the
 real UTC time, but apparently it does not.  Obviously a system
 designed too far in advance of the Year2000 fuss and bother!
 They completely avoid it by not numbering it that way. They have their
 own numbering scheme that fit's the system, and the conversion over to
 UTC is an added feature. It's all in ICD-GPS-200 for the current set of
 details, and in the ION red book series for the early stages.

 GPS and GPS problems is best understood if you realize that everything
 is counted in the GPS clock machinery with it's own set of gears.
 Conversion isn't that hard and it is done every second in the GPS receiver.
 That is fine, but I think that the question is what are those internal
 geers and do those internal geers have a rollover time? Ie, for how
 long a time period is there a unique mapping from the internals of GPS
 and the time (UTC or whatever). Obviously the oscillations of the H
 atoms in the H laser clocks have a rollover of picoseconds. Somewhere in
 those sattelites is some counter with a lot longer period before it
 rolls over. 

As I just answered to David Taylor, it's all described in this document:
http://www.gps.gov/technical/icwg/IS-GPS-200G.pdf

You might enjoy reading the earlier revisions as things have been
modified over time, and to understand older receivers you need to look
at the older spec, available here:
http://www.gps.gov/technical/icwg/

The 1024-week period I have been speaking of comes from interpreting
this document.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-16 Thread Magnus Danielson
On 08/16/2013 10:36 AM, David Taylor wrote:

 Yes, all my receivers are very simple, consumer-level ones.  Sometimes
 I see as low as 2m location accuracy on the GPS 60 CSx, more likely
 3m when walking.

 Thanks for the pointers to the documents.  A pity that they haven't
 been able to find two or three spare bits to reduce the 1024 week
 ambiguity to nearer a half-century or even 100 years.  Oh, well!

If you look at the new signals (L1C, L2C, L5), they have a 13-bit Week
Number (WN) compared to the old 10-bit number. Adding the bits to the
traditional signal structure would be possible, but would not help if
you have not upgraded the receiver to include them. Also, the 8192-week
cycle would loop eventually, and a legacy receiver would still be a multiple
of 1024 weeks off in that case. However, that is almost 157 years, with the
shift up in 2136/2137. We know that is still too soon for software folks to
fix their code.
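
A rough check of that span (a one-liner of my own, not from the post):

weeks = 2 ** 13             # 13-bit week number
print(weeks * 7 / 365.25)   # ~157 years, i.e. a wrap around 2136/2137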

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-16 Thread Magnus Danielson
On 08/16/2013 03:34 PM, David Taylor wrote:
 On 16/08/2013 13:02, John Hasler wrote:
 David Taylor writes:
 A pity that they haven't been able to find two or three spare bits to
 reduce the 1024 week ambiguity to nearer a half-century or even 100
 years.

  From the Wikipedia article:

 To determine the current Gregorian date, a GPS receiver must be
 provided with the approximate date (to within 3,584 days) to correctly
 translate the GPS date signal. To address this concern the modernized
 GPS navigation message uses a 13-bit field that only repeats every
 8,192 weeks (157 years), thus lasting until the year 2137 (157 years
 after GPS week zero).

 Oh, that /is/ good news, John!  Many thanks.  I couldn't see that from
 a quick scan of the referenced documents, so that's most helpful to know.

 I wonder whether there is any way to determine which satellites are
 sending this modernised message, perhaps they all do, or whether a
 particular receiver is using the full 13-bit field?  It's something
 I've not seen listed in various specifications I've read, but perhaps
 it's taken for granted after a certain date?
None will do that on the L1 C/A signal. It occurs on the new signals
such as L2C and L1C, which are code-wise separate from the L1 C/A signal.
None of the traditional receivers will benefit from this shift.

So far I have only seen advanced receivers receive those signals.
Hopefully things will change.

As I said, even if you add the bits to the signal, just because they are
there, a GPS receiver whose firmware has not been upgraded to interpret
them will not be able to use them, and the problem remains.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-17 Thread Magnus Danielson
On 08/17/2013 06:02 PM, David Taylor wrote:
 On 17/08/2013 09:30, Terje Mathisen wrote:
 David Taylor wrote:
 []
 Thanks for the pointers to the documents.  A pity that they haven't
 been
 able to find two or three spare bits to reduce the 1024 week ambiguity
 to nearer a half-century or even 100 years.  Oh, well!

 That would be even worse:

 Exceptional events like the 10-bit week rollover needs to happen often
 enough that every programmer is forced to write code to handle them
 correctly, or it should not happen at all!

 I.e. a fault-tolerant server setup is only fault-tolerant if you are
 comfortable doing monthly fire drills where you pull the power cord (or
 network cable(s)) from either half.

 For GPS 19+ years was probably intended to be long enough that every
 given receiver would only live to see a single epoch, meaning that a
 simple test against the firmware generation time would suffice, right?

 Well, now we've seen a lot of timing receivers that just keep on
 working, and that 19+ year range turned out to be not quite long enough.

 If this has happened every year or so (i.e. a 64-week rollover), then
 every GS would have had some method to enter the current epoch, and a
 way to remember it across reboots.

 Personally I think the (Trimble?) hack to use the TAI-UTC offset field
 as an epoch guess table index is pretty nice:

 As long as the offset keeps increasing this will suffice to handle at
 least one or two epoch rollovers.

 OTOH, the firmware timestamp method I outlined above will work perfectly
 as long as (a) somebody is still willing to generate new firmware
 versions and (b) you still have some machine with compatible
 hardware/software to allow you to load it onto the GPS.

 Combined remote antenna/GPS receivers with an RS422 or similar
 connection to an NTP server requires that firmware update capability to
 be included in the NTP box. :-(

 Terje

 Thanks for your thoughts, Terje.

 Using a 12-bit (or even 16-bit) field to send the current year would
 be a preferable solution - at least until they start messing with
 leap-seconds and change the whole time scale.  But I take your point -
 once every 19 years it will be remembered a lot more easily than once
 every 76 years.

 Having a multiplicity of different GPS sources from different
 manufacturers may at least improve the chance of the problem being
 spotted.
True. You can make better decisions by looking at more sources, so when
a particular model flips, the other sources and the known likelihood of
flipping make it reasonable to correct for the systematic effect and
then continue. Most GPS receivers will continue to operate correctly
after a flip, so as long as we correct for the 1024-week flip period, we
can continue to operate.

What might be useful is to store the corrected 1024-week offsets, since
if NTPD is restarted, those corrections can be applied up-front, and the
corrected values can then be used to provide a good basis for
majority decisions about the correct time. When a particular receiver
flips, it is the only one (possibly a few of them changing at the
same time) which shifts by 1024 weeks, and then it is easy to use the
1024-week assumption as a priori knowledge to correct it. Without stored
corrections, the flipped receivers may form a majority at wake-up, which
would be unfortunate, as we already knew they had flipped but forgot it in
the re-start process. Doing this, the system integrity can be maintained
throughout.
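
A minimal sketch of that bookkeeping (my own illustration, not NTPD code;
the names and the JSON state file are hypothetical): apply the remembered
per-receiver correction, detect a fresh 1024-week flip against the majority
of sources, and persist the corrections so a restart does not forget them.

import json
from statistics import median

WEEK = 7 * 86400          # seconds in a week
WRAP = 1024 * WEEK        # the L1 C/A week-number rollover period

def update_corrections(raw_times, corrections, state_file="gps_corrections.json"):
    """raw_times: {source: seconds}, corrections: {source: seconds}."""
    times = {src: t + corrections.get(src, 0) for src, t in raw_times.items()}
    consensus = median(times.values())
    for src, t in times.items():
        # A receiver that just flipped sits a whole number of 1024-week
        # periods away from the majority.
        jump = round((consensus - t) / WRAP)
        if jump != 0:
            corrections[src] = corrections.get(src, 0) + jump * WRAP
    with open(state_file, "w") as f:   # survive an NTPD/host restart
        json.dump(corrections, f)
    return corrections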

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-18 Thread Magnus Danielson
On 08/18/2013 07:51 AM, David Taylor wrote:
 On 17/08/2013 18:31, Magnus Danielson wrote:
 []
 What might be useful is to store the corrected 1024-week offsets, since
 if NTPD is restarted, those corrections can be applied up-front, and the
 corrected values can then be used to provide a good basis for
 majority decisions about the correct time. When a particular receiver
 flips, it is the only one (possibly a few of them changing at the
 same time) which shifts by 1024 weeks, and then it is easy to use the
 1024-week assumption as a priori knowledge to correct it. Without stored
 corrections, the flipped receivers may form a majority at wake-up, which
 would be unfortunate, as we already knew they had flipped but forgot it in
 the re-start process. Doing this, the system integrity can be maintained
 throughout.

 Cheers,
 Magnus

 It will be interesting to see what folks come up with for the patch. 
 I must admit to feeling that a fault in GPS receivers should /not/
 have to be fixed in NTP, but I accept that's likely to be the best
 solution.

 It should certainly be an option that has to be specifically enabled
 (command-line switch or fudge command), and one which has no impact
 otherwise on the reliability and maintainability of NTP.
But this isn't an actual receiver bug in that context. The receivers
can't always do the right thing.

If a receiver has backup power, then it can remember the latest year
across power-ups, and that's enough to guess right. Trouble
is that most receivers don't have that installed, and most of those that
do have it don't use that knowledge to resolve it anyway.

The problem is that it is a system bug (misfeature really), which
some receivers have more or less smart ways of dealing with, and we need
to handle the case when their ways of fixing it no longer work.
Thus, we are really talking about patching on a patch, which is
ugly, but if you want continuous operation that is what it takes.

If you want this feature to be disabled by default, you end up
causing the disruption that the fix is there to avoid. Few will know
that they need to fiddle with that bit, and it becomes a continuous
support issue, rather than letting the default be that it fixes the
problem and letting the really cautious people turn it off. Default
disabled is a bad idea.

Yes, you change the default behaviour of NTP this way, but it's done
because it has been analyzed and it's more likely to fix a problem than
cause a problem for the majority of the users.

Cheers,
Magnus


Re: [ntp:questions] Start of new GPS 1024 week epoch

2013-08-18 Thread Magnus Danielson
On 08/18/2013 12:16 PM, Rob wrote:
 David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
 On 18/08/2013 09:19, Magnus Danielson wrote:
 []
 If you want this feature to be disabled by default, you end up
 causing the disruption that the fix is there to avoid. Few will know
 that they need to fiddle with that bit, and it becomes a continuous
 support issue, rather than letting the default be that it fixes the
 problem and letting the really cautious people turn it off. Default
 disabled is a bad idea.

 Yes, you change the default behaviour of NTP this way, but it's done
 because it has been analyzed and it's more likely to fix a problem than
 cause a problem for the majority of the users.

 Cheers,
 Magnus
 I'm simply saying that I'm happy with NTP as it is now, and that if any 
 /new/ feature is added, it should be optional and disabled by default. 
 The new feature should /only/ apply to GPS sources.
 Perhaps the code should be restructured so that the network time protocol
 remains part of ntpd, and local reference clocks are moved out into
 processes that are more loosely coupled than drivers are now.

 A fix like this belongs in a driver for GPS, not in the main code that
 supports networking and synchronization of the local clock.

 Only the shared memory interface currently has functionality like this,
 and it has some limitations in the information it can convey.  If this
 interface is improved, all the local clock drivers can be moved out
 into separate processes and everyone can tinker his driver to fix problems
 like this one.  It will also be easier to release a fixed driver once
 a problem like this suddenly appears.
This is relevant for any driver interfacing a GPS. It's the correction
of time as it comes into NTPD.

Cheers,
Magnus


[ntp:questions] NTPD silently not tracking

2013-08-29 Thread Magnus Danielson
Hi,

We had another incident where a node configured with multiple NTP
sources had an NTPD which, when asked for its peers with ntpdc, looked like
things were all OK, with offsets of less than a second, while the node
in fact was 6 days off the mark. Only after a number of ntpdc queries did
some of the peers expose a gigantic offset. Everything looked OK, but
time was off such that normal remote login did not work.

The error was way too non-obvious and felt like a Heisenbug, in that only
when we looked more carefully at it did it start to see for itself that it
was out of touch with reality.

We have now designed a script that warns of an error:

cat /etc/cron.hourly/timechecker
#!/bin/bash
# Build an "ntpdate -q server1 server2 ..." command line from ntp.conf, run
# it, and warn if the reported offset exceeds one second in either direction.
awk 'BEGIN {printf "ntpdate -q "} $1 == "server" {printf "%s ", $2} END {print ""}' \
  /etc/ntp.conf | bash | \
  awk '$5 == "adjust" && ($10 > 1.0 || $10 < -1.0) \
    {print "WARNING: timechecker says that time of host is off by " $10 " seconds"}'

However, this should be addressed in a much more direct manner by NTPD. Have you 
seen this before? Do you have a remedy?

ii  ntp1:4.2.6.p5+d i386 Network Time Protocol daemon and 

Cheers,
Magnus



Re: [ntp:questions] NTPD silently not tracking

2013-08-30 Thread Magnus Danielson
On 08/30/2013 04:17 AM, E-Mail Sent to this address will be added to the
BlackLists wrote:
 Magnus Danielson wrote:
 We had another incident where a node configured with multiple NTP
 sources had an NTPD which, when asked for its peers with ntpdc, looked like
 things were all OK, with offsets of less than a second, while the node
 in fact was 6 days off the mark. Only after a number of ntpdc queries did
 some of the peers expose a gigantic offset. Everything looked OK, but
 time was off such that normal remote login did not work.

 The error was way too non-obvious and felt like a Heisenbug, in that only
 when we looked more carefully at it did it start to see for itself that it
 was out of touch with reality.

 ii  ntp   1:4.2.6.p5+d i386   Network Time Protocol daemon and
 What ntpdc commands did you issue, and what results did you get?

 Did you also try ntpq commands, did you see differing results?
 ntpq -n -c rv 0 leap
 ntpq -n -c rv 0 stratum
 ntpq -n -c rv 0 refid
 ntpq -n -c rv 0 offset
 ntpq -n -c rv 0 rootdisp
Unfortunately no. I got the call after the fact, and the lack of remote login
due to the time error would have prohibited me from doing anything anyway. The
server needed to be operational rather than optimized for NTP debugging.
 Have you tried a newer version of NTP ?
 http://www.ntp.org/downloads.html
 http://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-dev/
 http://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-dev/ntp-dev-4.2.7p385.tar.gz
No, I listed the affected version as packaged by Debian.
 Don't use Undisciplined Local Clock 127.127.1.0
  Try Orphan instead if you need LAN NTP clients to stick together
   while LAN and/or Internet NTP servers become unavailable.
  ...
  keys /etc/ntp.keys # e.g. contains: 123 M LAN_MD5_KEY , 321 M Corp_MD5_KEY 
 , ...
  trustedkey 123 321
  tos cohort 1 orphan 10
  restrict source nomodify
  manycastserver  224.0.1.1
  manycastclient  224.0.1.1 key 123 preempt
  ...


It has 2 stratum 1 and 3 stratum 2 unicast servers configured. NTP wise
this machine is a client with 5 configured servers. The problem was that
it was way off time with no apparent indication, which is wrong.

Cheers,
Magnus


Re: [ntp:questions] NTPD silently not tracking

2013-09-01 Thread Magnus Danielson
On 09/01/2013 10:42 PM, unruh wrote:
 On 2013-09-01, Steve Kostecke koste...@ntp.org wrote:
 On 2013-09-01, Rob nom...@example.com wrote:

 The NTP Reference Implementation is free software. The copyright
 holder (The University of Delaware) makes no representations
 about the suitability this software for any purpose. It is
 provided as is without express or implied warranty. Please visit
 http://www.ntp.org/copyright for the complete copyright notice and
 license statement.
 Yes, usual legal ass protection. Fortunately ntpd developers usually do not
 actually either believe that nor act as though they believe that. 
 They tend not to say Oh-- it does not work, tough shit.
 And you do them, and yourself a disservice by saying that that is what
 they do. It is not what they or you do. 

 In this case ntpd wandered off by hours with no complaint. That is not a
 proper behaviour of a professional piece of software. Now it could be
 that they have the local clock enables, and for some reason ntpd chased
 that rather than all of the other server sources. Pointing out that they
 should never actually use the local clock as a source is certainly
 useful since the clock is never wrong with respect to the local source.
 But if the computer has 5 outside source available and still chases
 after the local source that is a bug that should be fixed. If you know
 some attempt was made to fix a bug like than in a more recent version
 than the one used by the user, then advising upgrade is appropriate (as
 is telling him never to use local)
As we are coming back to topic...

8---
# /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help

driftfile /var/lib/ntp/ntp.drift


# Enable this if you want statistics to be logged.
#statsdir /var/log/ntpstats/

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable


# You do need to talk to an NTP server or two (or three).
#server ntp.your-provider.example

# pool.ntp.org maps to about 1000 low-stratum NTP servers.  Your server will
# pick a different set every time it starts up.  Please consider joining the
# pool: http://www.pool.ntp.org/join.html

server ntp1.kth.se iburst maxpoll 7
server ntp2.kth.se iburst maxpoll 7
server ntp3.kth.se iburst maxpoll 7
server ntp1.sp.se iburst maxpoll 7
server ntp2.sp.se iburst maxpoll 7

# Access control configuration; see
/usr/share/doc/ntp-doc/html/accopt.html for
# details.  The web page
http://support.ntp.org/bin/view/Support/AccessRestrictions
# might also be helpful.
#
# Note that restrict applies to both servers and clients, so a
configuration
# that might be intended to block requests from certain clients could
also end
# up blocking replies from your own upstream servers.

# By default, exchange time with everybody, but don't allow configuration.
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1

# Clients from this (example!) subnet have unlimited access, but only if
# cryptographically authenticated.
#restrict 192.168.123.0 mask 255.255.255.0 notrust


# If you want to provide time to your local subnet, change the next line.
# (Again, the address is an example only.)
#broadcast 192.168.123.255

# If you want to listen to time broadcasts on your local subnet,
de-comment the
# next lines.  Please do this only if you trust everybody on the network!
#disable auth
#broadcastclient
---8

This is the default Debian config file, which has been changed to point
at 5 servers, and which I was referring to in my follow-up message:

8---

It has 2 stratum 1 and 3 stratum 2 unicast servers configured. NTP wise
this machine is a client with 5 configured servers. The problem was that
it was way off time with no apparent indication, which is wrong.

---8

The person debugging this system (another system admin) ran strace, and saw
updates going to the kernel. Nothing anywhere indicated problems other than,
as I mentioned, the zero offset.

I'll try to see if I can re-create this behavior on another machine, as
the machine we did see it on needs to be on time, since it's a server for
other things than time.

Cheers,
Magnus


Re: [ntp:questions] NTPD silently not tracking

2013-09-02 Thread Magnus Danielson
Hi,

On 09/02/2013 12:39 AM, Harlan Stenn wrote:
 unruh writes:
 In this case ntpd wandered off by hours with no complaint. That is not a
 proper behaviour of a professional piece of software.
 And I don't remember the config file.  If the target platform has a
 config file with setting that change the performance so the described
 behavior is possible, that's not a bug in NTP.

 Now it could be that they have the local clock enables, and for some
 reason ntpd chased that rather than all of the other server
 sources. Pointing out that they should never actually use the local
 clock as a source is certainly useful since the clock is never wrong
 with respect to the local source.
 Again, if this is what happened and the config file directives make this
 the stated behavior, that's not a bug in the code.  That's a
 configuration file problem.
Giving reasonable warnings when the config will cause a system to be
isolated is however expected. You can't expect all users to deeply
understand all the twists of the configuration. A particular problem with
software like NTPD is that so many recommendations from so many
different ages are floating around. Comprehending the full documentation can
be daunting. So it's not as easy as saying it is a configuration problem
rather than a software problem; sometimes the art of configuring
the software IS the problem.
 But if the computer has 5 outside source available and still chases
 after the local source that is a bug that should be fixed. If you know
 some attempt was made to fix a bug like than in a more recent version
 than the one used by the user, then advising upgrade is appropriate (as
 is telling him never to use local)
 Sure, and we will need to see the bug duplicated on the latest version
 of whatever branch has the problem, and with 4.2.8 nearly ready for
 release and with almost no chance of another 4.2.6 release, it makes
 sense for folks to focus on the latest -dev code.

 If some volunteer feels like working on this for older code that's
 great, and if somebody wants active support for older code that is
 available too.
The affected machine was a server for other things and was a client for
NTP time. There was very limited time to fool around on that machine
with the latest and greatest code, or whatever comes recommended hours/days
after we encountered the problem, and the core behavior was so major
that not reporting it when we saw it again would have been bad. We had
to focus on getting our system into an operational state again, as that is
our primary task. I have not had the time to set up another machine to
replicate the problem with that or any other version of the code.

Cheers,
Magnus


Re: [ntp:questions] NTPD silently not tracking

2013-09-02 Thread Magnus Danielson
On 09/02/2013 02:33 PM, David Lord wrote:
 Harlan Stenn wrote:
 David Lord writes:
 Magnus Danielson wrote:
 server ntp1.kth.se iburst maxpoll 7
 server ntp2.kth.se iburst maxpoll 7
 server ntp3.kth.se iburst maxpoll 7
 server ntp1.sp.se iburst maxpoll 7
 server ntp2.sp.se iburst maxpoll 7
 that seems too restrictive and possibly abusive if you do not
 yourself have control over those servers.

 iburst is not abusive.

 Perhaps you are thinking of burst?

 I was thinking about maxpoll 7 and the few stats that were
 given indicating the very poor reach for the configured
 servers.
There is good network connectivity to all 5 servers.

If you advise us not to use maxpoll 7, then we will naturally learn from
it. I don't use it personally, but I didn't set this machine up. It would
be nice to hear your explanation, though.

However, when doing the ntpdc peers command (in interactive mode), it
had all 5 servers available, and was tracking one (as indicated by =
and * at the beginning of the lines; I was told this over the phone, so I
don't have a visual memory of it all). So I don't think bad connectivity
was the cause. It looked to a non-NTP expert as if it had peers and was
happy with the offsets (albeit they looked unexpectedly good at 0), but it
was just plain way off in time. It took multiple queries with ntpdc peers
before it reacted to the time offset, started to display big offsets and
eventually cleaned itself up. ntpdate -q did expose the time error of 6 days.

Cheers,
Magnus


Re: [ntp:questions] NTPD silently not tracking

2013-09-02 Thread Magnus Danielson
On 09/02/2013 03:49 AM, unruh wrote:
 On 2013-09-01, Magnus Danielson mag...@rubidium.dyndns.org wrote:
 server ntp1.kth.se iburst maxpoll 7
 server ntp2.kth.se iburst maxpoll 7
 server ntp3.kth.se iburst maxpoll 7
 server ntp1.sp.se iburst maxpoll 7
 server ntp2.sp.se iburst maxpoll 7

 # Access control configuration; see
 /usr/share/doc/ntp-doc/html/accopt.html for
 # details.  The web page
 http://support.ntp.org/bin/view/Support/AccessRestrictions
 I do hope that was really all on the same line, or there was a # at the
 start of that second line.
 Otherwise ntpd will be confused. 
No worries. That mishap came in the copy-and-paste between less
/etc/ntp.conf in one window and my email client.
 This is the default Debian config file which have been changed to point
 out 5 servers, which I was referring to in my follow-up message:

 8---

 It has 2 stratum 1 and 3 stratum 2 unicast servers configured. NTP wise
 this machine is a client with 5 configured servers. The problem was that
 it was way off time with no apparent indication, which is wrong.
 Agreed. Noone is arguing it is right. The question is why. You do not
 seem to be using the local refclock, so that is one explanation gone. 
Seemed strange to see those comments, as I had already said otherwise.
 None of those servers happens to be the machine itself do they? Of
 progeny of that server?
No. This is a server, but not of NTP. NTP-wise it is a client.

The three stratum 2 servers are local, and the stratum 1 servers are
national well known servers.
 And looking at those log files around the time things go bad might be
 suggestive. 

 Exactly which version of ntpd, and you are sure that someone has not
 made improvements to it?
If you read my initial message, you would have seen this:

ii  ntp1:4.2.6.p5+d i386 Network Time Protocol daemon and 

which is the result of running dpkg -l ntp on that Debian system. We don't 
have time to improve things with local patches. We might be accused of 
misconfiguration.

I made a report here, in the hope that you could make more sense of the 
behavior than the normal Debian package maintainer.

Cheers,
Magnus



Re: [ntp:questions] NTP not syncing

2013-11-03 Thread Magnus Danielson
On 11/03/2013 12:43 AM, David Woolley wrote:
 On 02/11/13 21:48, David Lord wrote:


 Ntpd writes to its drift file and also ntp.log. The drift file
 is critical and is used and updated at intervals by ntpd.

 The drift file is an optimisation.  ntpd should work without it, but
 will take longer to acquire lock after a restart.

 What would cause more problems would be a drift file that was present,
 but read-only, as ntpd would skip its frequency calibration and trust
 the frozen value in that file, then suffer wild swings as it begins to
 discover the value was wildly wrong.
If the oscillator has drifted, since the last drift-file write, outside of
+/- 15 ppm if I recall correctly, ntpd fails to lock in again.
It would be good if it could bail out and do normal frequency
acquisition when that occurs.

That particular feature has bitten us hard, and was a side-consequence
of other faults, but nonetheless.

Cheers,
Magnus


Re: [ntp:questions] NTP not syncing

2013-11-03 Thread Magnus Danielson
On 11/03/2013 06:26 PM, Magnus Danielson wrote:
 On 11/03/2013 12:43 AM, David Woolley wrote:
 On 02/11/13 21:48, David Lord wrote:

 Ntpd writes to its drift file and also ntp.log. The drift file
 is critical and is used and updated at intervals by ntpd.

 The drift file is an optimisation.  ntpd should work without it, but
 will take longer to acquire lock after a restart.

 What would cause more problems would be a drift file that was present,
 but read-only, as ntpd would skip its frequency calibration and trust
 the frozen value in that file, then suffer wild swings as it begins to
 discover the value was wildly wrong.
 If the oscillator has drifted, since the last drift-file write, outside of
 +/- 15 ppm if I recall correctly, ntpd fails to lock in again.
 It would be good if it could bail out and do normal frequency
 acquisition when that occurs.

 That particular feature has bitten us hard, and was a side-consequence
 of other faults, but nonetheless.
By request from Harlan, I put this into a bug-report:
http://bugs.ntp.org/show_bug.cgi?id=2500

Hope it was clear enough.

Cheers,
Magnus


Re: [ntp:questions] NTP not syncing

2013-11-04 Thread Magnus Danielson
Antonio,

On 11/04/2013 06:40 PM, Antonio Marcheselli wrote:
 If the oscillator has drifted, since the last drift-file write, outside of
 +/- 15 ppm if I recall correctly, ntpd fails to lock in again.
 It would be good if it could bail out and do normal frequency
 acquisition when that occurs.

 That particular feature has bitten us hard, and was a side-consequence
 of other faults, but nonetheless.

 Thanks Magnus, I saw the bug report you filed.

 Would it be wiser to delete the drift file at boot - by script - and
 let ntpd resync and recreate a new drift file?

 As mentioned, I don't really need my system to be synced down to the
 millisecond, if ntpd takes a few hours to settle and the time is off
 up to a few seconds during that time it's perfectly fine with me.
If you don't want the bootstrap feature of the drift-file, not specifying it
in the configuration is much wiser than deleting the file.

It's good to have this acceleration, if we can make it fool-proof.
This is an attempt to move in that direction.

Cheers,
Magnus


Re: [ntp:questions] NTP not syncing

2013-12-06 Thread Magnus Danielson
On 12/06/2013 10:53 AM, Harlan Stenn wrote:
 mike cook writes:
 If you know the drift file is unreliable, you should delete it.  ntpd
 will then perform a frequency calibration before entering the main
 loop. ...
 This is what has been recommended for ages but it doesn't completely
 fix the issue. It still takes a long time to settle. Here are the
 results of a test I did using the same system and ntp config as in my
 previous reply with the unrepresentative drift file data.
 An unrepresentative drift file is not a deleted drift file.
I filed a bug to address this. If the drift file is obviously nuts,
ignore it for speed-up and just work as if it was not there, that is, do
a normal frequency lock-in.
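
A minimal sketch of that heuristic (entirely my own illustration, not ntpd
code; the names and thresholds are hypothetical): use the stored value only
while it looks sane and roughly agrees with what is being measured, and
otherwise fall back to a normal frequency calibration.

MAX_SANE_PPM = 500.0       # hardware clocks are specified well within this
MAX_DISAGREE_PPM = 15.0    # beyond this, the stored hint contradicts reality

def initial_frequency(driftfile_ppm, measured_ppm):
    """Return a frequency hint in ppm, or None to do a fresh lock-in."""
    if driftfile_ppm is None or abs(driftfile_ppm) > MAX_SANE_PPM:
        return None                 # no usable hint: normal frequency lock-in
    if measured_ppm is not None and abs(measured_ppm - driftfile_ppm) > MAX_DISAGREE_PPM:
        return None                 # hint disagrees badly: back out of it
    return driftfile_ppm            # accelerated start from the stored value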

Cheers,
Magnus


Re: [ntp:questions] NTP not syncing

2013-12-06 Thread Magnus Danielson
On 12/07/2013 01:17 AM, unruh wrote:
 On 2013-12-06, Magnus Danielson mag...@rubidium.dyndns.org wrote:
 On 12/06/2013 10:53 AM, Harlan Stenn wrote:
 mike cook writes:
 If you know the drift file is unreliable, you should delete it.  ntpd
 will then perform a frequency calibration before entering the main
 loop. ...
 This is what has been recommended for ages but it doesn't completely
 fix the issue. It still takes a long time to settle. Here are the
 results of a test I did using the same system and ntp config as in my
 previous reply with the unrepresentative drift file data.
 An unrepresentative drift file is not a deleted drift file.
 I filed a bug to address this. If the drift file is obviously nuts,
 ignore it for speed-up and just work as it was not there, that is, do
 normal frequency lock-in.
 How does it know that the drift file is obviously nuts? 

 If it knew that it could fix it. It does not know that. ntpd ONLY knows
 the current offset. Now on bootup if there is not drift file, then it
 tries to remember the past few offsets and use those to estimate a
 drift, but if there is a drift file, it trusts the value in that drift
 file. If you are always going to do a drift estimate for the first few
 polls anyway, why have a drift file at all?
Well, we can discuss which is the best way to detect it, but when you
fail to lock and are forced to re-set the time, then you surely know you
didn't end up where you expected to be.

The drift-file-accelerated lock-in isn't robust. Current behavior of
response isn't very useful for most people experiencing it.

Cheers,
Magnus


Re: [ntp:questions] NTP not syncing

2013-12-07 Thread Magnus Danielson
On 12/07/2013 11:39 PM, Harlan Stenn wrote:
 Magnus Danielson writes:
 The drift-file-accelerated lock-in isn't robust. Current behavior of
 response isn't very useful for most people experiencing it.
 I'm not sure I'd agree with the word most.  It's certainly worked very
 well on hundreds of machines where I've run it, and the feedback I've
 had from people when I've told them about iburst and drift files has
 been positive except when they've had Linux kernels that calculate a
 different clock frequency on a reboot.
Experiencing the problem that is. When it works, it's a lovely tool.
Sorry if the wording was unclear in that aspect.
 There are at least 2 other issues here.

 One goes to robust, and yes, we can do better with that.  It's not yet
 clear to me that in the wider perspective this effort will be worthwhile.
Well, you can either choose a rather simple back-out method or, if you
think it is worthwhile, a more elaborate method. Getting a cyclic re-set of
time is a little too coarse a method. I think it is better to back out
and, one way or another, recover phase and frequency.
 The other goes to the amount of time it takes to adequately determine
 the offset and drift.

 With a good driftfile and iburst, ntpd will sync to a handful of PPM in
 about 11 seconds' time.

 We've been working on a project to produce sufficiently accurate offset
 and drift measurements at startup time, and the main problem here is
 that it can take minutes to figure this out well, and there is a
 significant need to get the time in the right ballpark at startup in
 less than a minute.  These goals are mutually incompatible.  The intent
 is to find a way to get there as well as possible, as quickly as
 possible.
Getting the time in the right ball-park is by itself not all that hard.
However, frequency takes time to learn, and getting phase errors down
quickly becomes an issue. NTP has, as far as I have seen, reduced the loop
bandwidth and at the same time reduced the capture range, and whenever
you reduce the capture range you need to have heuristics to make sure
you back out if things get upset. Recovery of old state is good, but one
needs to make sure that you don't lose that robustness.
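
A back-of-the-envelope sketch of why frequency takes time to learn (my own
illustrative numbers, not from the thread): the drift estimate is the offset
change divided by the measurement baseline, so offset measurement noise
dominates on short baselines.

def drift_ppm(offset1, offset2, interval_s):
    return (offset2 - offset1) / interval_s * 1e6

noise = 1e-3  # +/- 1 ms of offset measurement noise
print(drift_ppm(0.0, 2 * noise, 60))    # ~33 ppm worst-case error after 1 minute
print(drift_ppm(0.0, 2 * noise, 3600))  # ~0.6 ppm worst-case error after 1 hour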

As for the method of locking in quickly, that can be debated at length.

Cheers,
Magnus


Re: [ntp:questions] Meinberg Configuration Help

2014-03-02 Thread Magnus Danielson

On 02/03/14 19:31, William Unruh wrote:

On 2014-03-02, Brian Inglis brian.ing...@systematicsw.ab.ca wrote:

On 2014-03-01 15:43, boostinbad...@gmail.com wrote:

My NTP server is part of the pool project and appears to be running fine.  
Comcast contacted me about a month ago to let me know that my NTP server was 
infected with a bot.  I checked and everything seems to be ok.  I re-enabled my 
server about a week ago and I received another phone call last week concerning 
security on my network.
I contacted Ask and he said that it was not a bot but an issue with my server 
allowing management requests.  I asked Ask how to properly configure my 
Meinberg client to not allow management requests because I understand that they 
can be problematic.  I know the config for ntpd but I am not sure of the proper 
syntax for Meinberg.  Can someone provide me with that info?


Banner on http://support.ntp.org links to
http://support.ntp.org/bin/view/Main/SecurityNotice#DRDoS_Amplification_Attack_using
and recommends restrict default noquery [and possibly other no... options]
or you could use restrict default ignore; also add disable monitor.


And why those are not the default I will never know. They should never
have been on by default-- the problem was obvous 15 years ago, if
nothing else in giving an attacker knowledge about your system.
Things which go out to the  broad internet should be off by default, and be
switched on by the user who needs them.
Just as ntpd does not have a list of servers it uses by default, but I
guess people running ntp servers got burned by that one 20 years ago.


There is a completely new generation of sys-admins since then.
"Well known among those so skilled in the art" does not mean active 
knowledge amongst users. This might be a lesson to remember.


Cheers,
Magnus



Re: [ntp:questions] Asymmetric Delay and NTP

2014-03-17 Thread Magnus Danielson

Joe,

On 16/03/14 23:16, Joe Gwinn wrote:

I recall seeing something from Dr. Mills saying that a formal proof had
been found showing that no packet-exchange protocol (like NTP) could
tell delay asymmetry from clock offset.  Can anyone provide a reference
to this proof?


It's relatively simple.

You have two nodes (A and B) and a link in each direction (A-B and B-A).

You have three unknowns, the time-difference between the nodes (T_B - 
T_A), the delay from node A to B (d_AB) and the delay from node B to A 
(d_BA).


You make two observations of the pseudo-range from node A to node B 
(t_AB) and from node B to node A (t_BA). These are made by the source 
announcing its time and the receiver time-stamping in its own time 
when it occurs.


t_AB = T_B - T_A + d_AB
t_BA = T_A - T_B + d_BA

We thus have three unknowns and two equations. You can't solve that.
For each link you add, you add one observation and one unknown. For each 
node and two links you add, you add three unknowns and two observations. 
You can't win this game.


There are things you can do. Let's take our observations and add them; 
then we get


RTT = t_AB + t_BA = (T_B - T_A) + d_AB + (T_A - T_B) + d_BA
= d_AB + d_BA

Now, that is useful. If we diff them we get

ΔT = t_AB - t_BA = (T_B - T_A) + d_AB - (T_A - T_B) - d_BA
   = 2(T_B - T_A) + d_AB - d_BA

TE = ΔT / 2 = T_B - T_A + (d_AB - d_BA)/2

So, diffing them gives the time-difference, plus half the delay asymmetry.
If we assume that the delay is symmetric, then we can use these measures 
to compute the time-difference between the clocks, and if there is an 
asymmetry, it *will* show up as a bias in the slave clock.
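
A minimal numeric sketch of those equations (the function name and numbers
are my own illustration, not from the post): with symmetric delays the
estimate is exact, and with an asymmetry half of it shows up as a bias.

def offset_and_rtt(t_ab, t_ba):
    rtt = t_ab + t_ba             # d_AB + d_BA
    offset = (t_ab - t_ba) / 2.0  # T_B - T_A + (d_AB - d_BA)/2
    return offset, rtt

# True offset 5 ms, symmetric delays 10 ms/10 ms -> estimate 5 ms (exact).
print(offset_and_rtt(5e-3 + 10e-3, -5e-3 + 10e-3))
# Same offset, asymmetric delays 15 ms/5 ms -> estimate 10 ms (5 ms bias).
print(offset_and_rtt(5e-3 + 15e-3, -5e-3 + 5e-3))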


The way to get around this is to either avoid asymmetries like the 
plague, find a means to estimate them (PTPv2), or calibrate them away.


Is that formal enough for you?

Cheers,
Magnus


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-17 Thread Magnus Danielson

On 17/03/14 09:48, Martin Burnicki wrote:

William Unruh wrote:

On 2014-03-16, Joe Gwinn joegw...@comcast.net wrote:

I keep seeing claims that Precision Time Protocol (IEEE 1588-2008) can
achieve sub-microsecond to nanosecond-level synchronization over
ethernet (with the right hardware to be sure).

I've been reading IEEE 1588-2008, and they do talk of one nanosecond,
but that's the standard, and aspirational paper is not practical
hardware running in a realistic system.


1ns is silly. However 10s of ns are possible. It is achieved by Radio
Astronomy networks with special hardware (but usually post facto)


Why should 1 ns be silly?

If you have a counter chain clocked by 20 MHz then the timestamps
captured when PTP packets are going out or are coming in have a
resolution of 50 ns.

If your hardware can be clocked a 1 GHz then the resolution could be
increase to 1 ns.

Of course I know high resolution is not the only thing you need for high
accuracy, but it's a precondition.

You'd need hardware (FPGA?) which can be clocked at 1 GHz, and even in
the hardware signal processing you'd need to account for a number of
signal propagation delays which you can eventually ignore at lower clock
rates.

So of course the effort becomes much higher if you want more accuracy,
but this is always the case, even if you compare NTP to the time
protocol, or PTP to NTP.


You don't need to count at 1 GHz; you can achieve the resolution with 
*much* lower frequencies. One pair of counters I have achieves 2.7 ps 
single-shot resolution using a 90 MHz clock. Interpolators do the trick. 
There are many ways to interpolate.
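
Rough arithmetic behind that claim (my own illustration): a 90 MHz coarse
clock has an ~11.1 ns period, so 2.7 ps single-shot resolution means the
interpolator subdivides each coarse period into roughly 4000 steps.

coarse_period = 1 / 90e6           # ~1.11e-08 s per coarse clock cycle
print(coarse_period / 2.7e-12)     # ~4100 interpolation steps per cycle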


Achieving the necessary resolution then turns into the troublesome issue 
of precision, which requires calibrations and systematic studies.



I've seen some papers reporting tens to hundreds of nanoseconds average
sync error, but for datasets that might have 100 points, and even then
there are many outliers.

I'm getting PTP questions on this from hopeful system designers.  These
systems already run NTP, and achieve millisecond level sync errors.


Uh, perhaps show them to achievement of microsecond level sync errors?
That is already a factor of 1000 better than they achieve.

One of the key problems is getting the packets onto the network (delays
withing the ethernet card) special hardware on teh cards which
timestamps the sending and receiveing of packets on both ends could do
better. But it also depends on the routers and switches between the two
systems.


Of course all involved network nodes needed to be able to timestamp at
this high resolution, otherwise the overall accuracy would be degraded.

And, it would probably be easier to achieve this accuracy with an
embedded device with dedicated hardware than with a a standard PC and a
NIC connected via the PCI bus.


There is a whole myriad of issues you end up with when you try to get 
down that low.



If there were a 1 GHz oscillator on the NIC used for timestamping then
you still have to provide a way to relate the timestamps from the NIC to
your local system time. If the only way to do this is via the (PCI?) bus
then the accuracy could suffer from bus latency, arbitration, etc.


No go.


On a dedicated hardware the same oscillator/high resolution counter
chain could be used for system timekeeping, and to timestamp network
packets, which makes things much easier.


You end up with quite dedicated hardware if you want to go there, yes.
Regardless of how you do it.

Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-17 Thread Magnus Danielson

On 17/03/14 13:50, Joe Gwinn wrote:

In article lg61s4$ong$3...@dont-email.me, William Unruh
un...@invalid.ca wrote:


On 2014-03-16, Joe Gwinn joegw...@comcast.net wrote:

I keep seeing claims that Precision Time Protocol (IEEE 1588-2008) can
achieve sub-microsecond to nanosecond-level synchronization over
ethernet (with the right hardware to be sure).

I've been reading IEEE 1588-2008, and they do talk of one nanosecond,
but that's the standard, and aspirational paper is not practical
hardware running in a realistic system.


1ns is silly. However 10s of ns are possible. It is achieved by Radio
Astronomy networks with special hardware (but usually post facto)


IEEE 1588-2008 does say one nanosecond, in section 1.1 Scope.

I interpret it as aspirational - one generally makes a hardware
standard somewhat bigger and better than current practice, so the
standard won't be too soon outgrown.  IEEE standards time out in five
years, unless revised or reaffirmed.



I've seen some papers reporting tens to hundreds of nanoseconds average
sync error, but for datasets that might have 100 points, and even then
there are many outliers.

I'm getting PTP questions on this from hopeful system designers.  These
systems already run NTP, and achieve millisecond level sync errors.


Uh, perhaps show them how to achieve microsecond level sync errors?
That is already a factor of 1000 better than they achieve.


I forgot to mention a key point.  We also have IRIG hardware, which
does provide microsecond level sync errors.  The hope is to eliminate
the IRIG hardware by using the ethernet network that we must have
anyway.


IRIG-B004 DCLS can provide really good performance if you let it.

To get *good* PTP performance, comparable to your IRIG-B, prepare to do 
a lot of testing to find the right Ethernet switches, and then replace 
them all. Redoing the IRIG properly starts to look cheap and 
straight-forward.



One of the key problems is getting the packets onto the network (delays
within the ethernet card) special hardware on the cards which
timestamps the sending and receiving of packets on both ends could do
better.  But it also depends on the routers and switches between the two
systems.


Yes.  My question is basically a query about the current state of the
art.


The state of the art is not yet in the standard and not yet in off-the-shelf 
products, if you want to call it PTP.


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Asymmetric Delay and NTP

2014-03-18 Thread Magnus Danielson

On 18/03/14 01:36, Joe Gwinn wrote:

In article 5327757e.5040...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


Joe,

On 16/03/14 23:16, Joe Gwinn wrote:

I recall seeing something from Dr. Mills saying that a formal proof had
been found showing that no packet-exchange protocol (like NTP) could
tell delay asymmetry from clock offset.  Can anyone provide a reference
to this proof?


It's relatively simple.

You have two nodes (A and B) and a link in each direction (A-B and B-A).

You have three unknowns, the time-difference between the nodes (T_B -
T_A), the delay from node A to B (d_AB) and the delay from node B to A
(d_BA).

You make two observations of the pseudo-range from node A to node B
(t_AB) and from node B to node A (t_BA). These are made by the source
announcing its time and the receiver time-stamping in its own time
when it occurs.

t_AB = T_B - T_A + d_AB
t_BA = T_A - T_B + d_BA

We thus have three unknowns and two equations. You can't solve that.
For each link you add, you add one observation and one unknown. For each
node and two links you add, you add three unknowns and two observations.
You can't win this game.

There are things you can do. Let's take our observations and add them,
then we get

RTT = t_AB + t_BA = (T_B - T_A) + d_AB + (T_A - T_B) + d_BA
  = d_AB + d_BA

Now, that is useful. If we diff them we get

ΔT = t_AB - t_BA = (T_B - T_A) + d_AB - (T_A - T_B) - d_BA
  = 2(T_B - T_A) + d_AB - d_BA

TE = ΔT / 2 = T_B - T_A + (d_AB - d_BA)/2

So, diffing them gives the time-difference, plus half the asymmetric delay.
If we assume that the delay is symmetric, then we can use these measures
to compute the time-difference between the clocks, and if there is an
asymmetry, it *will* show up as a bias in the slave clock.

The way to get around this is to either avoid asymmetries like the
plague, find a means to estimate them (PTPv2), or calibrate them away.

Is that formal enough for you?


It may be.  This I did know, and would seem to suffice, but I recall a
triumphant comment from Dr. Mills in one of his documentation pieces.
Which I cannot recall well enough to find.  It may be the above
analysis that was being referred to, or something else.


I can't recall. The above I came up with myself some 10 years ago or so.

Will see if I can find Dave's reference.


I also took the next step, which is to treat d_AB and d_BA as random
variables with differing means and variances (due to interference from
asymmetrical background traffic), and trace this to the effect on clock
sync.  It isn't pretty on anything like a nanosecond scale.  The
required level of isolation between PTP traffic and background traffic
is quite stringent.


It's even worse when you get into packet networks, as the delays contain 
noise sources of variable mean and variable deviation, besides being 
asymmetrical. NTP combats some of that, but doesn't get deep enough due 
to too low a packet rate. PTP may do it, but it's not in the standard so 
it will be proprietary algorithms. The PTP standard is a protocol 
framework. ITU has spent time to fill in more of the empty spots.


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-18 Thread Magnus Danielson

On 18/03/14 01:24, Joe Gwinn wrote:

In article 532778bf.50...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


On 17/03/14 13:50, Joe Gwinn wrote:

In article lg61s4$ong$3...@dont-email.me, William Unruh
un...@invalid.ca wrote:


On 2014-03-16, Joe Gwinn joegw...@comcast.net wrote:

I keep seeing claims that Precision Time Protocol (IEEE 1588-2008) can
achieve sub-microsecond to nanosecond-level synchronization over
ethernet (with the right hardware to be sure).

I've been reading IEEE 1588-2008, and they do talk of one nanosecond,
but that's the standard, and aspirational paper is not practical
hardware running in a realistic system.


1ns is silly. However 10s of ns are possible. It is achieved by Radio
Astronomy networks with special hardware (but usually post facto)


IEEE 1588-2008 does say one nanosecond, in section 1.1 Scope.

I interpret it as aspirational - one generally makes a hardware
standard somewhat bigger and better than current practice, so the
standard won't be too soon outgrown.  IEEE standards time out in five
years, unless revised or reaffirmed.



I've seen some papers reporting tens to hundreds of nanoseconds average
sync error, but for datasets that might have 100 points, and even then
there are many outliers.

I'm getting PTP questions on this from hopeful system designers.  These
systems already run NTP, and achieve millisecond level sync errors.


Uh, perhaps show them how to achieve microsecond level sync errors?
That is already a factor of 1000 better than they achieve.


I forgot to mention a key point.  We also have IRIG hardware, which
does provide microsecond level sync errors.  The hope is to eliminate
the IRIG hardware by using the ethernet network that we must have
anyway.


IRIG-B004 DCLS can provide really good performance if you let it.

To get *good* PTP performance, comparable to your IRIG-B, prepare to do
a lot of testing to find the right Ethernet switches, and then replace
them all. Redoing the IRIG properly starts to look cheap and
straight-forward.


I've used IRIG-B004 DCLS before, for cables two meters long within a
cabinet.  Worked well.  How well do they handle 100 meter cables, in
areas where the concept of ground can be elusive?


The rising edge of the 100 Hz is your time reference, the falling edges 
are your information. Proper signal conditioning and cabling should not 
be a problem given proper drivers and receivers.
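
For illustration, a toy Python decoder of the pulse widths (nominal 
IRIG-B widths of 2/5/8 ms per 10 ms bit cell for 0/1/position identifier; 
the thresholds here are just examples):

  # Toy IRIG-B DCLS symbol classifier.  Each 10 ms bit cell starts on the
  # rising edge (the on-time mark); the pulse width carries the data:
  # nominally 2 ms = binary 0, 5 ms = binary 1, 8 ms = position identifier.
  # Two consecutive position identifiers mark the start of a frame.
  def classify(pulse_width_s):
      ms = pulse_width_s * 1e3
      if ms < 3.5:
          return '0'
      if ms < 6.5:
          return '1'
      return 'P'

  widths = [0.008, 0.008, 0.002, 0.005, 0.002]     # made-up example
  symbols = [classify(w) for w in widths]          # ['P', 'P', '0', '1', '0']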


IRIG-B004 DCLS also travels nicely over optical connections, and 
grounding issues will be less of a problem. Known to work well in power 
sub-stations, so there can be off the shelf products if you look for them.



This is for proposed new systems, so there are no switches to replace.

In response to questions from hopeful engineers, I had already made the
point about the need for serious testing, with asymmetrical loads a
factor larger than the real system will sustain.  I'm not sure they are
convinced of the need.

Anyway, the hope is that PTP will be simpler and cheaper than having
multiple IRIG systems, assuming that one starts from scratch.


Maybe, depends on your needs. Consider doing a separate network for PTP.
That approach has been used in systems where you want to make sure it 
works.



One of the key problems is getting the packets onto the network (delays
within the ethernet card) special hardware on the cards which
timestamps the sending and receiving of packets on both ends could do
better.  But it also depends on the routers and switches between the two
systems.


Yes.  My question is basically a query about the current state of the
art [in PTP].


The state of the art is not yet in the standard and not yet in off-the-shelf
products, if you want to call it PTP.


This is my fear and instinct.  But people read the adverts and will
continue to ask.  And some customers will demand.  So, I'm digging
deeper.

Are there any good places to start?


You asked here, it's not the worst place to start. :)

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-18 Thread Magnus Danielson

On 18/03/14 02:45, Paul wrote:

On Mon, Mar 17, 2014 at 9:33 PM, Joseph Gwinn joegw...@comcast.net wrote:


Will it do 100 meters or more, in bad neighborhoods?



I'm not the right person to ask but since it is expected to maintain
between 2.5 and 100 nanosecond  sync with CPE nodes (cable modems) I assume
it requires RF techniques not readily available (or cost effective) outside
a cable plant.


The DOCSIS time interface is fun in that it uses two different 
frequencies to provide the transfer, so you get an interpolation 
function from relatively benign frequencies. Will make your crystal 
supplier happy.


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-18 Thread Magnus Danielson

On 18/03/14 09:59, Martin Burnicki wrote:

Magnus Danielson wrote:

On 17/03/14 09:48, Martin Burnicki wrote:

You'd need hardware (FPGA?) which can be clocked at 1 GHz, and even in
the hardware signal processing you'd need to account for a number of
signal propagation delays which you can eventually ignore at lower clock
rates.

So of course the effort becomes much higher if you want more accuracy,
but this is always the case, even if you compare NTP to the time
protocol, or PTP to NTP.


You don't need to count at 1 GHz, you can achieve the resolution with
*much* lower frequencies. One pair of counters I have achieves 2.7 ps
single-shot resolution using a 90 MHz clock. Interpolators do the trick.
There are many ways to interpolate.


Agreed. I just thought the way to use a higher counter clock is more
obvious. All depends on how accurate and precise you can get your
timestamps, and this is probably easier with network packet timestampers
at both sides of a cable than with a wireless time transfer method like
GPS which usually suffers from delays which can't easily be measured,
like ionospheric delays. And yes, I know that this can be improved if
you receive 2 GPS frequencies instead of only the L1. ;-)


Indeed. If you read the right article from 1990 you also know you can do 
it on L1 C/A only by monitoring both code and carrier phase, as their 
ionospheric effects have opposite signs.
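
For the curious, a minimal sketch of the idea (first-order model only, 
ignoring clock terms, noise, multipath and the carrier ambiguity, which 
in reality leaves a constant offset in the estimate):

  # Single-frequency code/carrier combinations, first order:
  #   code pseudorange:        P   = rho + I
  #   carrier phase (metres):  Phi = rho - I
  def iono_free(P, Phi):
      return (P + Phi) / 2.0   # geometry; ionosphere cancels to 1st order

  def iono_estimate(P, Phi):
      return (P - Phi) / 2.0   # ionospheric delay I, up to a constant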


Carrier phase, by the way, is a good illustration that the frequency you 
have (1.57542 GHz) is not the limiting factor, as you can make 
observations to within 1/100 of a cycle, or about 2 mm. The precision 
and accuracy need *tons* of processing to get into that neighborhood, 
especially since the tropospheric delay is hard to estimate and 
compensate for in a single receiver.


Anyway, resolution and counter frequency are and remain two different 
things. Precision measurements can be made using two lower-frequency 
signals.



Achieving the necessary resolution then turns into the troublesome issue
of precision, which require calibrations and systematic studies.


I've seen some papers reporting tens to hundreds of nanoseconds
average
sync error, but for datasets that might have 100 points, and even then
there are many outliers.

I'm getting PTP questions on this from hopeful system designers.
These
systems already run NTP, and achieve millisecond level sync errors.


Uh, perhaps show them how to achieve microsecond level sync errors?
That is already a factor of 1000 better than they achieve.

One of the key problems is getting the packets onto the network (delays
within the ethernet card); special hardware on the cards which
timestamps the sending and receiving of packets on both ends could do
better. But it also depends on the routers and switches between the two
systems.


Of course all involved network nodes need to be able to timestamp at
this high resolution, otherwise the overall accuracy would be degraded.

And, it would probably be easier to achieve this accuracy with an
embedded device with dedicated hardware than with a standard PC and a
NIC connected via the PCI bus.


There is a whole myriad of issues you end up with when you try to get
down that low.


Yep.


If there were a 1 GHz oscillator on the NIC used for timestamping then
you still have to provide a way to relate the timestamps from the NIC to
your local system time. If the only way to do this is via the (PCI?) bus
then the accuracy could suffer from bus latency, arbitration, etc.


No go.


On a dedicated hardware the same oscillator/high resolution counter
chain could be used for system timekeeping, and to timestamp network
packets, which makes things much easier.


You end up with quite dedicated hardware if you want to go there, yes.
Regardless of how you do it.


Standard PC hardware hasn't been designed for timekeeping at highest
accuracy, neither the cheap crystal oscillator usually assembled on the
mainboard, nor the missing hard link between hardware on a PCI card
and the CPU, nor the TSCs often used for timekeeping which may suffer
from changes of the CPU clock frequency (with older CPU types) or
changes of the front bus clock frequency (with newer CPU types).


Indeed. In the old days, you could accurately count your clock cycles in 
your assembler code, and there was a single common clock for the whole 
machine. No such luck anymore.


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-18 Thread Magnus Danielson

On 18/03/14 10:17, Martin Burnicki wrote:

Paul wrote:

On Mon, Mar 17, 2014 at 8:07 PM, Joe Gwinn joegw...@comcast.net wrote:


People are also lusting after sub-microsecond sync.



Sure but not optimally in comp.protocols.ntp/questions@lists.ntp.org.
With some help NTP can be quite good but the intent really isn't
nanosecond
accuracy.


We have made some tests and found that NTP can yield the same accuracy
as PTP if hardware timestamping of NTP packets is also supported on all
nodes, similar to PTP.

In fact this isn't surprising, is it?


No, it's not. NTP is perceived to use software timestamping, but 
nothing prohibits you from doing it in hardware. Similarly, you can 
implement PTP with software time-stamping (with shitty performance).


Doing HNTP makes NTP match up against PTPv1 to some degree, but PTP then 
pulls ahead with the explicit means to make PTP-aware transparent clocks 
that correct for delays, cancelling some of the asymmetry. You could do NTP 
with PTP 2-step processing, but what we would call such a bastard would 
be an interesting thing, NPTP?
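
For illustration, a rough Python sketch of what the transparent-clock 
mechanism buys (a simplified one-step end-to-end transparent clock; rate 
conversion and the measurement of the link delay itself are left out):

  # Every switch on the path adds its residence time (egress - ingress
  # timestamp) to a correction field carried with the event message, so
  # the end node can remove queuing delay it cannot otherwise observe.
  def forward_through_switch(correction, ingress_ts, egress_ts):
      return correction + (egress_ts - ingress_ts)

  def slave_offset(t1, t2, path_delay, correction):
      # t1: master transmit timestamp, t2: slave receive timestamp,
      # path_delay: separately measured propagation delay of the links
      return (t2 - t1) - path_delay - correction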


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-20 Thread Magnus Danielson

On 19/03/14 10:43, Martin Burnicki wrote:

Magnus Danielson wrote:

On 18/03/14 10:17, Martin Burnicki wrote:

We have made some tests and found that NTP can yield the same accuracy
as PTP if hardware timestamping of NTP packets is also supported on all
nodes, similar to PTP.

In fact this isn't surprising, is it?


No, it's not. NTP is being perceived to be software timestamping but
nothing prohibits you from doing it in hardware. Similarly can you
implement PTP with software time-stamping (with shitty performance).


As I mentioned in a different posting, even if you use hardware
timestamping with NTP you are out of luck for highest accuracy since
there are (AFAIK) no switches which have been designed to timestamp NTP
packets.

And even if there were, the next question is how to get the measured
latency compensation parameters to the client without breaking the
existing protocol?

Maybe this would be an interesting approach for NTP v5.


Indeed. When I look at NTP and PTP, I see two protocols that could learn 
a lot from each other. NTP got some things (more or less) right that PTP 
is bad at, and vice versa.


Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-20 Thread Magnus Danielson

On 19/03/14 10:50, Miroslav Lichvar wrote:

On Tue, Mar 18, 2014 at 10:20:08PM +0100, Magnus Danielson wrote:

No, it's not. NTP is being perceived to be software timestamping
but nothing prohibits you from doing it in hardware. Similarly can
you implement PTP with software time-stamping (with shitty
performance).

Doing HNTP makes NTP match up against PTPv1 to some degree, but PTP
then pulls out the explicit means to make PTP-aware transparent
clocks to correct for delays, cancelling some of the asymmetry. You
could do NTP with PTP 2-step processing, but what we would call such
a bastard would be an interesting thing, NPTP?


There is already a two step mode implemented in ntpd that works with
NTP peers or broadcast, it's activated by the xleave option.

An NTP transparent clock could be implemented too. One problem is that
with the current protocol it would have to track the connections. For
a stateless operation a new NTP extension field would probably be needed.
Similarly to PTP, all NTP-aware routers and switches between NTP
server and client would increment a path delay correction.



Interesting!

NTP-aware routers and switches are probably less common than PTP-aware ones.

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-20 Thread Magnus Danielson
On 19/03/14 18:51, E-Mail Sent to this address will be added to the 
BlackLists wrote:

Martin Burnicki wrote:

Magnus Danielson wrote:

Indeed. If you read the right article from 1990 you
  also know you can do it on L1 C/A only by monitoring
  both code and carrier phase, as their ionospheric
  effect have opposite signs.


That's interesting, and I didn't know about this.

Do you have a pointer to this article?


http://www.navipedia.net/index.php/Code-Carrier_Divergence_Effect
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA497270


There ya' go! It was the Robert Giffard article I was referring to, but 
I was too tired to dig it up at the time.



http://www.academia.edu/457163/Ionosphere_Effect_Mitigation_for_Single-frequency_Precise_Point_Positioning
http://www.ngs.noaa.gov/PUBS_LIB/GPSCarrierPhase.pdf


Haven't seen those two, so I will see if they add something new.

Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-20 Thread Magnus Danielson

Hi Joe,

On 20/03/14 01:53, Joe Gwinn wrote:

In article 5328ad2...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


On 18/03/14 01:24, Joe Gwinn wrote:

I've used IRIG-B004 DCLS before, for cables two meters long within a
cabinet.  Worked well.  How well do they handle 100 meter cables, in
areas where the concept of ground can be elusive?


The rising edge of the 100 Hz is your time reference, the falling edges
are your information. Proper signal conditioning and cabling should not
be a problem given proper drivers and receivers.

IRIG-B004 DCLS also travels nicely over optical connections, and
grounding issues will be less of a problem. Known to work well in power
sub-stations, so there can be off the shelf products if you look for them.


That's a pretty severe environment.


I thought it would get your attention.


I should give more context:  On ships at full steam, there can be a
steady seven volts rms or so at power frequency (and harmonics) between
bow and stern, which will cause large currents to flow in the shield.
This is well below the frequency at which inside and outside shield
currents become decoupled due to skin effect, so the full voltage drop
in the shield may be seen on the center conductor.

We use optical links a lot, and triax some.

One can also make RF boxes largely immune with a DC-block capacitor in
series with the center conductor.


Thus, another fairly severe environment.


Maybe, depends on your needs. Consider doing a separate network for PTP.
That approach has been used in systems where you want to make sure it
works.


That fails economically - might as well stick to IRIG.


Indeed. Doing 1 us level might be possible, going lower than that will 
cause you more and more grey hairs one way or another.



This is my fear and instinct.  But people read the adverts and will
continue to ask.  And some customers will demand.  So, I'm digging
deeper.

Are there any good places to start?


You asked here, it's not the worst place to start. :)


To be sure.

There is a truism in the standards world, that it take three major
releases (versions) of a standard for it to achieve maturity.  PTP is
at version 2, so one more to go.


I'd say it depends on the application. The trouble is when the 
assumed applications increase at a quicker rate than the standard adapts 
to handle them.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Asymmetric Delay and NTP

2014-03-20 Thread Magnus Danielson

Joe,

On 19/03/14 11:55, Joe Gwinn wrote:

In article 5328aaa6.70...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


On 18/03/14 01:36, Joe Gwinn wrote:

In article 5327757e.5040...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


Is that formal enough for you?


It may be.  This I did know, and would seem to suffice, but I recall a
triumphant comment from Dr. Mills in one of his documentation pieces.
Which I cannot recall well enough to find.  It may be the above
analysis that was being referred to, or something else.


I can't recall. The above I came up with myself some 10 years ago or so.



When I awoke the day after writing the above, I saw two problems with
the above analysis.

First is that with added message-exchange volleys, one does not get
added variables and equations, one instead gets repeats of the
equations one already has.  If there is no noise, the added volleys
convey no new information.  If there is noise, multiple volleys allows
one to average random noise out.


True. What does happen over time is:
1) Clocks drift away from each other due to systematics and noise.
2) The path delay shifts, sometimes because of physical distance shifts,
but also with the time of day and season.

These require continuous tracking to handle.


Second is that what is proven is that a specific message-exchange
protocol cannot work, not that there is no possible protocol that can
work.


The above analysis only assumes a way to measure some form of signal.
The same equations are valid for TWTFTT as for NTP, PTP or whatever uses 
two-way time-transfer. What will differ is the way they convey the 
information and the noise-sources they see.



Will see if I can find Dave's reference.


I hit pay dirt yesterday, while searching for data on outliers in 1588
systems.   Dave's reference may well be in the references of the
following article.

Fundamental Limits on Synchronizing Clocks Over Networks, Nikolaos M.
Freris, Scott R. Graham, and P. R. Kumar, IEEE Trans on Automatic
Control, v.56, n.6, June 2011, pages 1352-1364.


Sounds like an interesting article. Always interesting to see different 
people's views of fundamental limits.



I also took the next step, which is to treat d_AB and d_BA as random
variables with differing means and variances (due to interference from
asymmetrical background traffic), and trace this to the effect on clock
sync.  It isn't pretty on anything like a nanosecond scale.  The
required level of isolation between PTP traffic and background traffic
is quite stringent.


It's even worse when you get into packet networks, as the delays contain
noise sources of variable mean and variable deviation, besides being
asymmetrical. NTP combats some of that, but doesn't get deep enough due
to too low a packet rate. PTP may do it, but it's not in the standard so
it will be proprietary algorithms. The PTP standard is a protocol
framework. ITU has spent time to fill in more of the empty spots.


Yes.  In closed networks, the biggest cause of asymmetry I've found is
interference between NTP traffic and heavy background traffic in the
operating system kernels of the hosts running application code.
Another big hitter was background backups via NFS (Network File
System).  The network switches were not the problem.  What greatly
helps is to have a LAN for the heavy applications traffic, and a
different LAN for NTP and the like, forcing different paths in the OS
kernel to be taken.


If you can get your NIC to hardware time-stamp your NTP, you will clean 
things up a lot.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-22 Thread Magnus Danielson

Joe,

On 21/03/14 16:17, Joe Gwinn wrote:

Magnus,


Thus, another fairly severe environment.


I have a personal war story from 1992:  At an Air Traffic Control center
in Canada, one 19-inch cabinet had the green (safety ground) and white
(power neutral) cables transposed.  This caused 2.3 Vrms at 180 Hz to
appear between the VMEbus ground and the cabinet shell, with enough
oomph to cause a small spark when oscilloscope probe grounding clip was
connected to that VMEbus ground, this causing the system (and my heart)
to crash.  If left connected, the ground clip became warm.  And how can
ground generate a spark, even a small one?  Fixing the grounds dropped
the offset to around ten millivolts.  The 180 Hz arose because the
power supplies were single-phase capacitor-input, driven from the legs
of three phase prime power.


Power neutral isn't really neutral when it takes a lot of beating.
Similarly, a grounding wire isn't doing much grounding as frequency goes up.


That fails economically - might as well stick to IRIG.


Indeed. Doing 1 us level might be possible, going lower than that will
cause you more and more grey hairs one way or another.


Well, now, this could be an advantage -- my hair is already gray, and
more could be better.


Well, you may have younger colleagues who fail to have this 
advantage. I knew you would make the comment. :)



There is a truism in the standards world, that it take three major
releases (versions) of a standard for it to achieve maturity.  PTP is
at version 2, so one more to go.


I'd say it depends on for what application. The trouble is when the
assumed applications increase at a quicker rate than the standard adapts
to handle them.


It does, but having the market grow faster than the standards cycle can
be the mark of success.


To some degree. Being perceived to be a solution isn't the same as it 
being a solution.



By the way, development of the third revision of 1588 started in 2013.
I joined what purported to be their reflector, but now that you mention
it I haven't gotten any traffic -- Something must be wrong.  I will
need to enquire.


They formally had their first session at the ISPCS in Lemgo.

Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Asymmetric Delay and NTP

2014-03-23 Thread Magnus Danielson

Joe,

On 23/03/14 23:20, Joe Gwinn wrote:

Magnus,

In article 532e45db.5000...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


Joe,

On 21/03/14 17:04, Joe Gwinn wrote:

[snip]



It is interesting.  I've now read it reasonably closely.

The basic approach is to express each packet flight in a one-line
equation (a row) in a linear-system matrix equation, where the system
matrix (the A in the traditional y=Ax+b formulation, where b is zero in
the absence of noise), where A is 4 columns wide by a variable number
of rows long (one row to a packet flight), and show that one column of
A can always be computed from the two other columns that describe who
is added and subtracted from who.  In other words, these three columns
are linearly dependent on one another.  The fourth column contains
measured data.

This dependency means that A is always rank-deficient no matter how
many packets (including infinity) and no matter the order, so the
linear system cannot be solved.


It is just another formulation of the same equations I provided.
For each added link, one unknown and one measure is added.
For each added node, one unknown is added.


True, but there is more.


Let's come back to that.


As you do more measures, you will add information about variations in the
delays and time-differences between the nodes, but you will not disclose
the basic offsets.


Also true.  The advantage of the matrix formulation is that one can
then appeal to the vast body of knowledge about matrixes and linear
systems.  It's not that one cannot prove it without the matrixes, it's
that the proof is immediate with them - less work.

And the issue was to prove that no such system could work.


As much as I like the matrix formulation, it ain't giving you much more in 
this case than a handy notation. The trouble is that beyond the 
properties of the noise, there is no information leakage about the 
static time-errors and asymmetries. You end up having free variables.


The problem is that the unknowns and the relationships build up at an 
uneven rate, and the observations only relate to two unknowns. The only 
trustworthy fact we get is the sum of the delays, but no real hint about 
its distribution. If you do more observations along the same paths, you 
can do some statistics, but you won't get an unbiased result without 
adding a priori knowledge one way or another. Formulate it as you wish, 
but as you add more observations, their linear properties reduce them to 
the existing equations plus noise. You need to add observations which do 
not fully reduce in order for your equation system to grow to such a size 
that you can solve it. Show me how you achieve that, and I'll listen.
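
To make that reduction concrete, a small numpy sketch (illustrative only), 
with unknowns x = [T_B - T_A, d_AB, d_BA]; every new volley just repeats 
one of the two row patterns, so the rank stays at 2 and never reaches 3:

  import numpy as np

  # Row for a t_AB observation over x = [T_B - T_A, d_AB, d_BA]: [+1, 1, 0]
  # Row for a t_BA observation:                                   [-1, 0, 1]
  rows = []
  for _ in range(1000):                 # add as many volleys as you like
      rows.append([+1.0, 1.0, 0.0])
      rows.append([-1.0, 0.0, 1.0])
  A = np.array(rows)

  print(np.linalg.matrix_rank(A))       # prints 2, not 3: rank-deficient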



The no matter the order part comes from the property of linear
systems that permuting the rows and/or columns has no effect, so long
as one is self-consistent.

So far, I have not come up with a refutation of this approach.  Nor
have the automatic control folk - this proof was first published in
2004 into a community that knows their linear systems, and one would
think that someone would have published a critique by now.

The key mathematical issue is if there are message exchange patterns
that cannot be described by a matrix of the assumed pattern.  If not,
the proof is complete.  If yes, more work is required.  So far, I have
not come up with a counter-example.  It takes only one to refute the
proof.


It is only by cheating that you can overcome the limits of the system.


Is GPS cheating?  That's our usual answer, but GPS isn't always
available or possible.


If you are trying to solve it within a network, it is. You can convert 
your additional GPS observations into a priori knowledge, and once you 
have done enough of those, you can solve it completely. The estimated 
variables had better stay static though, or you have to start over again.



Although, if one goes to the trouble to make a NIC PTP-capable, it
wouldn't be so hard to have it recognize and timestamp passing NTP
packets.  The hard part would be figuring out how to transfer this
timestamp data from collection in the NIC to point of use in the NTP
daemon, and standardizing the answer.


The Linux kernel has such support. NTPD already has some support for
such NICs included.


All true.  But I'm reluctant to recommend a solution that lacks a
common standard and/or has fewer than three credible vendors supporting
that standard.  I have no doubt that these things will come to pass,
but we are not there just yet.


Indeed.

Cheers,
Magnus

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Asymmetric Delay and NTP

2014-03-29 Thread Magnus Danielson

On 24/03/14 14:38, Joe Gwinn wrote:

Magnus,

In article 532fa47b.7060...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


Joe,

On 23/03/14 23:20, Joe Gwinn wrote:

Magnus,

In article 532e45db.5000...@rubidium.dyndns.org, Magnus Danielson
mag...@rubidium.dyndns.org wrote:


Joe,

On 21/03/14 17:04, Joe Gwinn wrote:

[snip]



It is interesting.  I've now read it reasonably closely.

The basic approach is to express each packet flight in a one-line
equation (a row) in a linear-system matrix equation, where the system
matrix (the A in the traditional y=Ax+b formulation, where b is zero in
the absence of noise), where A is 4 columns wide by a variable number
of rows long (one row to a packet flight), and show that one column of
A can always be computed from the two other columns that describe who
is added and subtracted from who.  In other words, these three columns
are linearly dependent on one another.  The fourth column contains
measured data.

This dependency means that A is always rank-deficient no matter how
many packets (including infinity) and no matter the order, so the
linear system cannot be solved.


It is just another formulation of the same equations I provided.
For each added link, one unknown and one measure is added.
For each added node, one unknown is added.


True, but there is more.


Let's come back to that.


As you do more measures, you will add information about variations in the
delays and time-differences between the nodes, but you will not disclose
the basic offsets.


Also true.  The advantage of the matrix formulation is that one can
then appeal to the vast body of knowledge about matrixes and linear
systems.  It's not that one cannot prove it without the matrixes, it's
that the proof is immediate with them - less work.

And the issue was to prove that no such system could work.


As much as I like the matrix formulation, it ain't giving you much more in
this case than a handy notation. The trouble is that beyond the
properties of the noise, there is no information leakage about the
static time-errors and asymmetries. You end up having free variables.


Yes.  You correctly noted the mathematical equivalence of the two
approaches, and I agree.  My point was that the matrix approach is less
work to get to the desired proof because by formulation as a linear
solution with matrixes, one immediately inherits lots of properties and
proofs.



The problem is that the unknowns and the relationships build up at an
uneven rate, and the observations only relate to two unknowns. The only
trustworthy fact we get is the sum of the delays, but no real hint about
its distribution. If you do more observations along the same paths, you
can do some statistics, but you won't get an unbiased result without
adding a priori knowledge one way or another. Formulate it as you wish,
but as you add more observations, their linear properties reduce them to
the existing equations plus noise. You need to add observations which do
not fully reduce in order for your equation system to grow to such a size
that you can solve it.


Yes, this is a good statement of the consequences of the proof.


Thanks.


Show me how you achieve that, and I'll listen.


I don't understand the challenge.  There is no dispute.


It's not a personal challenge, it's a wide-spread challenge. If someone 
has worked something out I'm really keen to learn about it. I've spent quite 
some time figuring these things out, as I need to understand them.


The *one* thing you can figure out with more measurements is how 
non-zero-mean noise such as network traffic contributes to asymmetry.
You can do pretty good approximations of that contribution. However, if 
there is an underlying asymmetry in the static delay sources, it won't 
disclose itself no matter how many more of the same measurements you take.


What you *can* do is bring a precise time to the first slave, measure 
the time-error, compensate for it and then step-by-step calibrate a 
path. The trouble is that you then add the a priori assumption of stable 
asymmetry, which may not be true. I've experienced it not to be true.



It is only by cheating that you can overcome the limits of the system.


Is GPS cheating?  That's our usual answer, but GPS isn't always
available or possible.


If you are trying to solve it within a network, it is. You can convert
your additional GPS observations into a priori knowledge, and once you
have done enough of those, you can solve it completely. The estimated
variables had better stay static though, or you have to start over again.


GPS is the usual answer, but isn't always available or useful.


I know, I know.


Recall that the original question was random asymmetry due to
asymmetric background traffic in a PTP network.  If the network is
controllable, a lab experiment is to simply turn the background traffic
off and see how much the clocks change with respect to one another.

But this tells one how much trouble one

Re: [ntp:questions] Thoughts on KOD

2014-07-05 Thread Magnus Danielson

Harlan,

On 07/05/2014 11:40 PM, Harlan Stenn wrote:

Folks,

I was chatting with PHK about:

  http://support.ntp.org/bin/view/Dev/NtpProtocolResponseMatrix

  http://bugs.ntp.org/show_bug.cgi?id=2367

and how we probably want to extend KOD coverage to more than just the
limited case.

I was assuming folks would want finer-grained control over this
behavior, and thought about being able to choose any of kod-limited,
kod-noserve, and kod-query.

PHK suggested that we consider going the other way - KOD would mean
Send KODs whenever appropriate.

I wonder what the costs/benefits will be when weighing the extra
complexity of multiple choices against when the defaults change and
we get new behavior that we can't tune, that costs us in X and Y.

This gets a bit more complicated when taking into consideration:

- we'll get more traffic from a NAT gateway
- - do we need to be able to configure a threshold for this case?

- we should pay attention to how a client, whom we find to be abusive,
   reacts to:
- - getting no response
- - getting a KOD response
   and adapt accordingly.

Discussion appreciated.



There is also the aspect of when KOD does not bite. We have seen that.
Like other forms of defense, inserting drop rules into the firewall rules 
for the offending node is an alternative to consider. KOD only bites for 
nodes which follow the protocol but are somehow offending in their 
configuration. More offensive configuration or packet generation will 
render KOD relatively useless.


Thus, there might be a limit on how much effort should be going into 
perfecting KOD-generation when maybe raising the bar even further is needed.


Then, we should also consider how KOD and drop-rule triggering can be 
used to trigger denial of service, and how to potentially protect 
against them.


Sorry for muddling your water even more.

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-05 Thread Magnus Danielson

Harlan,

On 07/06/2014 02:18 AM, Harlan Stenn wrote:

Magnus,

Yes, we know that if we decide to track finely-grained behavior we'll
need to watch how {IP,port} responds when getting {no,KOD} responses.


Just want to gently remind you.


We might just want a syslog entry for KOD, because it's clear that there
can come a time when we don't want to rely on the remote side doing
anything.

Unless there is a better solution.  I like the syslog idea because we
can tag it and let other mechanisms decide what to do with that raw
information.


For that purpose it may be good to allow for a separate log of sent KOD 
messages, besides logging properly to syslog. A script or program can then 
monitor it for updates and insert rules, without having to filter the 
syslog.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-06 Thread Magnus Danielson



On 07/06/2014 12:38 PM, Terje Mathisen wrote:

Rob wrote:

Harlan Stenn st...@ntp.org wrote:

Discussion appreciated.


I think it is best to remove KOD from ntpd.
It does not serve a useful purpose, because precisely the kind of
clients that you want to say goodbye to, do not support it.

In real life it has either no effect at all, or it even has a negative
effect because the client does not understand it and re-tries the
request sooner than it would when no reply was sent at all.


I'm afraid this is exactly right:

KOD is a way to keep honest guys honest, i.e. it only helps against
programmers/users who actually try (hard) to do the right thing.

Currently it will cause a badly configured ntpd installation (burst +
minpoll 4 + maxpoll 4) to possibly stop using any server which sends
back KOD, but only if it also uses the pool directive to actively search
out the best servers.


Maybe it's time to figure out how to auto-tune configurations, as a 
better alternative to people following aged advice. In the 
meantime, make sure that good concrete advice, with a "don't 
do this anymore" section, is on ntp.org.



I don't want to think about users actively trying to generate as much
traffic as possible. :-(


Unfortunately we need to. The use of NTP features as an accelerator in 
DDoS attacks happened this spring. We had to turn off nice features, which 
in itself becomes a form of DoS. It would be better if we had ways to 
protect a server (remember that clients also act as servers) so that 
proper use does not cause loss of service, but aggressive use causes 
block-out. Soft state could remember signaling peers for some time and 
then forget them, keeping statistics of packets per time-period; a 
signaling peer that acts reasonably well stays, while overtransmitting 
packets causes black-listing. KOD is the least of it; inserting drop rules 
into the local host should follow, and possibly pushing the block rule out 
into the network to clear the machine and part of the network of the 
offending traffic.


For cases like that, KOD won't help at all.

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-06 Thread Magnus Danielson

Detha,

On 07/06/2014 03:23 PM, detha wrote:

On Sun, 06 Jul 2014 12:28:09 +, Magnus Danielson wrote:

For cases like that, KOD won't help at all.



All the state table/KOD/filter rules mitigation approaches I have seen so
far are limited to one server. Maybe time to take a look at a DNSBL-type
approach for abusive clients; that way, once a client is labeled
'abusive', it will stop working with any of the pool servers that use the
blacklist.

Policies (for how long to auto-blacklist, how to prevent DoS by
blacklisting the competition, how to 'promise to behave and
express-delist' etc.) to be defined.


Maybe. For the moment I think it is sufficient if we provide a mechanism 
by which offenders get reported to *some* system.
We *could* also provide a method by which white/black-lists can be 
dynamically set from an external source, so that enough hooks exist, but I 
do not think that NTPD should be burdened with the rest of that system.


Once NTPD can report that it feels offended by a source, and beyond KODing 
it also reports to some external mechanism that could potentially block the 
source by any external means, NTPD does not have to do much more.


My point with this line of thinking is that KOD in itself assumes that 
the offending source actually respects it, and while the KOD rules probably 
can be improved, KOD does not provide a very effective means of protection 
against sources not respecting it, so we also need to think in broader 
terms.


In my mind, the defenses go along these lines:

0) NTPD tolerates a source, packet approval checks
1) NTPD does not tolerate a source, fires off KOD, source is expected to 
shut up
2) NTPD admin does not tolerate a source, blacklists it, NTPD will drop 
the traffic

3) NTPD admin does not tolerate a source, filters it in the box firewall,
box firewall drops the traffic
4) NTPD admin does not tolerate a source, filters it in the network firewall,
network firewall drops the traffic

Notice how steps 2-4 move the traffic load further away from the NTPD 
process, its interface and eventually its subnetwork. What I proposed would 
allow for automation of these steps.
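
To illustrate, a rough Python sketch of the glue I have in mind for step 3, 
tailing a hypothetical block-log that NTPD would append offending source 
addresses to (the file name and one-address-per-line format are made up for 
illustration), and pushing them into the host firewall:

  # Hypothetical glue between NTPD and the host firewall (step 3 above).
  # A real version should also expire the rules again after a while.
  import subprocess, time

  LOG = "/var/log/ntp-blocked"          # assumed file, one address per line
  blocked = set()

  def block(addr):
      # Drop further NTP traffic from this source at the host firewall.
      subprocess.run(["iptables", "-A", "INPUT", "-s", addr,
                      "-p", "udp", "--dport", "123", "-j", "DROP"],
                     check=True)

  with open(LOG) as f:
      f.seek(0, 2)                      # start at end of file and tail it
      while True:
          line = f.readline()
          if not line:
              time.sleep(1.0)
              continue
          addr = line.strip()
          if addr and addr not in blocked:
              block(addr)
              blocked.add(addr)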


It is reasonable that escalation should be done when a source does not 
respect KOD and keeps transmitting requests.


It is also reasonable that blocking times out, such that a block is 
removed after some reasonable time, as offenders can be on dynamic 
addresses, and intentional abuse usually runs over a limited time.


How to automate steps 2-4 is however not a core concern for NTPD, but 
feeding the data out of NTPD in a way that is handy for such a mechanism 
is. A separate block-log file, as I proposed, is probably better than only 
syslog, as it removes the need to parse syslog for matching blocks; a 
consumer can instead watch a dedicated file for changes.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-07 Thread Magnus Danielson

Danny,

On 07/07/2014 04:00 PM, Danny Mayer wrote:

On 7/6/2014 2:42 AM, Rob wrote:

Harlan Stenn st...@ntp.org wrote:

Discussion appreciated.


I think it is best to remove KOD from ntpd.
It does not serve a useful purpose, because precisely the kind of
clients that you want to say goodbye to, do not support it.

In real life it has either no effect at all, or it even has a negative
effect because the client does not understand it and re-tries the
request sooner than it would when no reply was sent at all.


You haven't read the code. Any client that ignores the KOD flag will
find (if they ever looked) that their clock will be drifting away
further and further from the proper time. When KOD is set the value of
the received and sent timestamps are the same as the initial client sent
timestamp. It doesn't use the system time for the returned packet.
Calculate what this does to the resulting clock.

Please also note that there is more than one type of KOD packet. See RFC
5905 Section 7.4. See also Figure 13. You need to clearly distinguish
the different ones when talking about them. Most of this discussion
seems to be about action a. As discussed above this is an extremely
useful feature because any client ignoring the KOD flag and using the
packet anyway will get pushed way off the actual time that they would
normally expect regardless of the client software used.


Which would make sense if the client has multiple sources and is a 
relatively decent NTP client. The issues we have seen are outside of the NTP 
client realm.
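
For anyone who wants to see the effect Danny describes, a small worked 
example in Python of the standard client computation (made-up timestamps; 
the offset and delay formulas are the usual ones from RFC 5905):

  # Standard NTP client computation:
  #   offset theta = ((T2 - T1) + (T3 - T4)) / 2
  #   delay  delta = (T4 - T1) - (T3 - T2)
  # With the KOD behaviour described above, the server echoes the
  # client's originate timestamp, i.e. T2 = T3 = T1.
  T1 = 1000.000000      # client transmit (client clock)
  T4 = 1000.050000      # client receive  (client clock), 50 ms round trip
  T2 = T3 = T1          # what such a KOD reply carries

  theta = ((T2 - T1) + (T3 - T4)) / 2    # = -0.025 s
  delta = (T4 - T1) - (T3 - T2)          # =  0.050 s

  # A client that ignores the KOD flag keeps computing a negative offset
  # of half its round trip on every poll and walks away from true time.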


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-07 Thread Magnus Danielson



On 07/07/2014 04:10 PM, Danny Mayer wrote:
The experience with blocking has actually been negative and we have

seen traffic actually INCREASE after it is blocked because the client,
not having received a response, tries more often. This has been observed
in the wild.


This might be true for proper NTP clients, but I wonder if it is true 
for faked NTP requests from DDoSers. KOD serves no purpose for DDoSers, 
so massive attacks are best handled by dropping that traffic, and 
possibly pushing the dropping away from the node and subnet running the 
server. For more modest overload scenarios, such as misconfigured or 
otherwise broken NTP clients, I believe that what you describe is correct.


Let's not confuse these different scenarios, as they most probably have 
different solutions. My point was that DDoS amplification/relaying 
should be considered, as we need that solved, while KOD refinement is 
maybe nice but addresses another problem.


I don't think you will be able to handle the DDoS issues without doing 
blocking, and you want that blocking to move away from your server in 
order to reduce the impact on the service.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-07 Thread Magnus Danielson



On 07/07/2014 04:50 PM, Majdi S. Abbas wrote:

On Mon, Jul 07, 2014 at 08:35:25AM +0100, David Taylor wrote:

Seconded.


Why remove KOD when it has to be expressly enabled (via
restrict kod and limited)?

I'd rather see a two tier system, where you can enable
the use of KOD beyond the initial rate limit, and a second limit
beyond which requests are simply ignored.

But I don't understand why anyone would remove functionality
that the server administrator has to expressly configure to enable.


I think KOD is fine for its intended purpose, but it does not solve 
this other problem we are having. Thus, a two-tier solution is what I 
advocate for.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Thoughts on KOD

2014-07-08 Thread Magnus Danielson



On 07/08/2014 12:11 PM, Jason Rabel wrote:

There are two obvious ways to go for an embedded client.

One way would be to use the sntp code as the base.

The other would be to either use the current NTP codebase and use the
configure options to disable all the refclocks and anything else you
didn't want, or wait until we're done with the post-4.2.8 rewrite.  For
post-4.2.8, we're looking at having a client core with any refclock
code being handled a separate process.


I do not know if this is the case with NTP, but quite often it takes
considerable hacking of sources to get code to compile on non-x86
embedded hardware (i.e. ARM  MIPS)... It would probably help boost usage
if someone was assuring NTP sources compile on those platforms without
the need for modification.


You need to gift-wrap it considerably, so that the proliferation of 
badly hacked NTPish code gets replaced. Putting a price-tag on it means 
that it will prohibit the shift of code, which in itself is a cost.
Hobby-hackers already do many of the first breaks, so why not make sure 
that their contributions make it into the code, such that support for a 
large range of embedded platforms exists either directly in NTPD or as an 
easily accessible port.


Another thought is to have people review the NTP/SNTP-ish code that is 
out there to see how compliant it is, what KOD it would react to and how 
much effort it would take to fix the basics.


Then again, the basic problem is that people don't upgrade their FW as 
they should.


A listing of implementations, their target environments, and which 
versions to use and which to avoid might be of assistance.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP Autokey - who is actively using it?

2015-01-15 Thread Magnus Danielson

Hi,

On 01/15/2015 03:06 AM, Harlan Stenn wrote:

I'm trying to figure out if anybody is actively using autokey, in a
production deployment.

If you are, please let me know - I have some questions for you.



We use it to pull leap-second info off the NTP servers.

It took some effort to get it running, and well, it hasn't been painless, 
but we have now got the process debugged anyway.


Did the autokey-less distribution of leap-second info ever get 
implemented? I do know there was an I-D essentially interpreting the 
existing RFCs in such a way.


What kind of questions do you have?

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Writing the drift file

2015-03-07 Thread Magnus Danielson

Harlan,

On 03/06/2015 10:35 AM, Harlan Stenn wrote:

Folks,

A while ago we got a request from the embedded folks asking for a way to
limit the writing of the drift file unless there was a big enough
change in the value to warrant this.

Somebody came up with an interesting way to do this that involves
looking to see how much the drift value has changed and only writing the
value if the change was big enough.

By my read of the code and the comments:

1) it looks like the code is implementing something other  than what the
comments want, and

2) what's described *or* implemented seems way more complicated than
what we need.

I'm wondering if we should just let folks specify a drift/wander
threshold, and if the current value is more than that amount we write
the file, and if the current value is less than that amount we don't
bother updating the file.  If folks are on a filesystem where the number
of writes doesn't matter, no value would be set (or we could use 0.0)
and it's not an issue.

Thoughts?



For embedded systems, reducing the rate of writing makes sense, so in 
this regard the question is valid IMHO.


There can't be a universal limit that will once and for all satisfy all 
needs, so some user configurability would make sense.
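
A minimal Python sketch of the threshold rule being discussed (a 
configurable change threshold in PPM, with 0.0 meaning "always write", 
which is one way to read Harlan's proposal):

  # Write the drift file only when the frequency estimate has moved by
  # more than `threshold` PPM since the last write; threshold = 0.0
  # restores the old always-write behaviour.  (Sketch only.)
  last_written = None

  def maybe_write_drift(path, drift_ppm, threshold):
      global last_written
      if (threshold > 0.0 and last_written is not None
              and abs(drift_ppm - last_written) < threshold):
          return False                  # change too small, skip the write
      with open(path, "w") as f:
          f.write("%.3f\n" % drift_ppm)
      last_written = drift_ppm
      return True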


Another aspect is that if a drift-file exists, NTP believes it, skips 
the frequency training phase, and then cannot recover from the resulting 
constant time resets if the difference between the stored frequency and 
the actual frequency is too large to track in (this addresses Jochen's 
3rd point about the worst case scenario). I discovered this the hard way 
some years back, and I think it never got resolved. Anyway, as long as 
this misfeature remains, lowering the drift-write rate could lead to 
subtle misconfiguration issues worse than the existing drift-file issues. 
It would limit the range of values users could configure, and it would 
also end up exposing the existing misfeature, since we need to make sure 
that there is a very high likelihood that the drift-value written is 
within the capture range of the PLL mode, even when written more seldom.


Trying to modify the way NTP writes (Jochen's point 1) does not help for 
flash; it's the write itself which causes the long-term wear, so reducing 
the rate of writes should be the key. It might be better to do the write 
as part of shut-down.


In general, the drift-file handling should be made more fool-proof 
first, before attempting to improve the write-rate issue. It has already 
caused a number of problems, yet rather than addressing them directly we 
keep addressing their triggers, which seems to be the real problem. 
Until this is done, the drift-file can be a beneficial accelerator for 
systems where it works, but may have to be discouraged for other uses. 
Its main motivation is to shorten the FLL phase of NTP start-up.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Writing the drift file

2015-03-07 Thread Magnus Danielson

Harlan,

On 03/07/2015 12:12 PM, Harlan Stenn wrote:

OK, a fair amount of good stuff is being discussed.

Do we mostly all agree that the purpose of the drift file is to give
ntpd a hint as to the frequency drift at startup?

If so...

The current mechanism is designed to handle the case where ntpd is
restarted fairly quickly, so there's a good chance the same drift value
will work.


Remember that for embedded devices, the operational conditions may be 
such that it is not always a quick restart. You cannot know when the 
device will come up again after being powered off. It can be minutes, 
hours, weeks, or years.


Just like the leap-second file, the age of the drift-file is relevant, 
and if it is too old, that is one (of several) reasons to disqualify it.


Relying on the file-system timestamp can be troublesome, as it may not 
be preserved by file management. Writing the time into the file itself 
is more robust.
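
A sketch of what that could look like (the format is hypothetical, not 
what ntpd writes today): store the write time next to the value and 
disqualify the file on read if it is too old:

    import time

    MAX_AGE = 30 * 24 * 3600  # example: distrust values older than 30 days

    def write_drift(path, ppm):
        with open(path, "w") as f:
            f.write("%.3f %d\n" % (ppm, int(time.time())))

    def read_drift(path, max_age=MAX_AGE):
        with open(path) as f:
            fields = f.read().split()
        ppm, written = float(fields[0]), int(fields[1])
        if time.time() - written > max_age:
            return None  # too old; fall back to normal frequency training
        return ppm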


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP 4.2.8p10 released

2017-03-24 Thread Magnus Danielson

Hi,

I asked my original question to see if the process could be improved.

For instance, it would be good to have direct pointers to the various 
distributions' pages where the packaging state and availability of the 
various releases are shown and updated. This is what can be found for 
several other security-bug scenarios. Especially for security updates, 
this matters for coordinating the effort and showing how it is handed 
off. While each party has its own responsibility, the way things are 
coordinated can be important for how smooth the upgrade becomes for the 
users. I just wanted to ask about it to see whether there are 
improvements to be made.


At the moment, I don't have 4.2.8p10 available in my distro version 
(Debian testing). Packaging is important to many, so this is to 
illustrate possible improvements.


MVH
Magnus

On 03/24/2017 09:30 AM, Harlan Stenn wrote:

Hal,

Thanks - we're going over the lists and processes to improve things.

H
--
Hal Murray writes:


Harlan said:

We're open to doing an even better job of telling folks about things like
this.


I think a message should go out to (almost?) all lists when security fixes
are available to the general public.

The first mail on questions was David Taylor's announcement of the
availability of Windows binaries.  There was no mention of a security
release.

I just checked the archives for announce.  Nothing since April 2015.

The hackers list has only 3 messages in March.  None was an announcement.


If the mailing list traffic has moved to other lists and/or venues, then
please make an official announcement and disable the old lists.  (but please
save the archives)


--
These are my opinions.  I hate spam.





___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP 4.2.8p10 released

2017-03-23 Thread Magnus Danielson

Hi,

On 03/23/2017 05:12 PM, Martin Burnicki wrote:

Magnus Danielson schrieb:

Hi Martin,

On 03/23/2017 03:25 PM, Martin Burnicki wrote:

Hi Magnus,

Magnus Danielson wrote:

Hi,

On 03/22/2017 02:27 PM, David Taylor wrote:

NTP 4.2.8p10 released

Windows binaries working on Windows-XP SP3 & later - download:
http://www.satsignal.eu/ntp/x86/index.html

Source: http://archive.ntp.org/ntp4/ntp-4.2/ntp-4.2.8p10.tar.gz



Are bug reports filed in the distros?
Debian, for instance.


What do you mean? I don't understand your question.


In particular for security updates it would be good to do coordinated
filing of a tracking bug for various Linux distributions, such that
they upgrade their packaging quickly. Exactly who does what may vary,
but with some form of orchestrated handling upgrading may become
quicker.


The folks who are in the security group knew about the upcoming security
release, but I agree an email should have been sent on the normal NTP
announce mailing list as soon as the new version became available.


It took some time to find:

https://packages.qa.debian.org/n/ntp.html

4.2.8p10 now in unstable. Hope it ripples out to the others soon.

My point here is that we would like the progress to be visible and easy 
to evaluate, so we know when an update is possible.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP 4.2.8p10 released

2017-03-23 Thread Magnus Danielson

Hi,

On 03/22/2017 02:27 PM, David Taylor wrote:

NTP 4.2.8p10 released

Windows binaries working on Windows-XP SP3 & later - download:
http://www.satsignal.eu/ntp/x86/index.html

Source: http://archive.ntp.org/ntp4/ntp-4.2/ntp-4.2.8p10.tar.gz



Are bug reports filed in the distros?
Debian, for instance.

Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP 4.2.8p10 released

2017-03-23 Thread Magnus Danielson

Hi Martin,

On 03/23/2017 03:25 PM, Martin Burnicki wrote:

Hi Magnus,

Magnus Danielson wrote:

Hi,

On 03/22/2017 02:27 PM, David Taylor wrote:

NTP 4.2.8p10 released

Windows binaries working on Windows-XP SP3 & later - download:
http://www.satsignal.eu/ntp/x86/index.html

Source: http://archive.ntp.org/ntp4/ntp-4.2/ntp-4.2.8p10.tar.gz



Are bug reports filed in the distros?
Debian, for instance.


What do you mean? I don't understand your question.


In particular for security updates it would be good to do coordinated 
filing of a tracking bug for various Linux distributions, such that they 
upgrade their packaging quickly. Exactly who does what may vary, but 
with some form of orchestrated handling upgrading may become quicker.


Cheers,
Magnus
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions