Re: [ntp:questions] Idea to improve ntpd accuracy
On 2/25/2016 10:29 PM, Weber wrote:
> Thanks for the link. I'm not surprised that someone else already had the
> idea. I was poking around and see that 1588 does something similar with
> a "follow up" packet.

I should have mentioned that this checksum trailer extension field is needed to get the idea to work for NTP. Unfortunately you can't have security or a MAC to do this as it needs to make changes to the timestamps and then add the extension field to fix the UDP checksum. Tal can say more.

Danny

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions
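Danny's remark about fixing the UDP checksum can be illustrated with a little arithmetic. If the transmit timestamp is rewritten after the checksum was computed, a 2-byte trailer can be chosen so that the one's-complement sum of the payload does not change. The Python below is only a sketch of that arithmetic: the function names are hypothetical, the 48-byte body with a 2-byte trailer is a simplifying assumption, and it does not reproduce the wire format defined in the draft. The real UDP checksum also covers a pseudo-header and the UDP header, but since those do not change, keeping the payload sum constant is enough.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, as used by the Internet (UDP) checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum value: the complement of the one's-complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def compensate_trailer(old_packet: bytes, new_body: bytes) -> bytes:
    """Append a 2-byte trailer to new_body so the sum matches old_packet's.

    Hypothetical helper: old_packet is the original body plus its trailer.
    """
    if len(new_body) % 2:
        raise ValueError("body must be 16-bit aligned")
    target = ones_complement_sum(old_packet)
    body_sum = ones_complement_sum(new_body)
    # One's-complement subtraction: target + (~body_sum), then fold.
    trailer = ones_complement_sum(
        target.to_bytes(2, "big") + (body_sum ^ 0xFFFF).to_bytes(2, "big")
    )
    return new_body + trailer.to_bytes(2, "big")


# Made-up 48-byte body (the size of a basic NTP header) plus a zeroed trailer.
old_body = bytes(range(48))
old_packet = old_body + b"\x00\x00"

# Overwrite bytes 40..47, where the transmit timestamp sits in an NTP header.
new_body = old_body[:40] + bytes([0xDE, 0xAD, 0xBE, 0xEF, 0x12, 0x34, 0x56, 0x78])
new_packet = compensate_trailer(old_packet, new_body)

print(hex(internet_checksum(old_packet)), hex(internet_checksum(new_packet)))
```

Both printed checksums are the same, so the checksum already placed in the UDP header stays valid. The sketch also hints at why a MAC is a problem here: a MAC computed over the original timestamps would presumably no longer verify once the timestamp has been rewritten.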
Re: [ntp:questions] Idea to improve ntpd accuracy
Thanks for the link. I'm not surprised that someone else already had the idea. I was poking around and see that 1588 does something similar with a "follow up" packet.

On 2/25/2016 7:09 PM, Danny Mayer wrote:
> Already thought of so it's a good idea! See
> https://datatracker.ietf.org/doc/draft-ietf-ntp-checksum-trailer/ for
> details.
>
> Danny
Re: [ntp:questions] Idea to improve ntpd accuracy
Already thought of so it's a good idea! See https://datatracker.ietf.org/doc/draft-ietf-ntp-checksum-trailer/ for details.

Danny

On 2/25/2016 4:52 PM, Weber wrote:
> This may or may not be worthwhile, but I thought I'd throw it out there
> and see what happens:
>
> Recent work testing some microsecond-accurate NTP servers led me to an
> idea that could improve accuracy of measurements made by ntpd. These NTP
> servers have hardware timestamps on receive but that's not possible on
> transmit w/o a custom NIC. I've seen this issue discussed before.
>
> The next best thing is to generate the transmit timestamp based on a
> guess as to how long it takes the NIC to get on the wire and send the
> packet. That works pretty well as long as there's no other network
> traffic. In this situation, it is possible to make use of microsecond
> accuracy in an NTP server.
>
> Now, add some typical network traffic and the time it takes the NIC to
> get on the wire becomes unpredictable to the tune of 200us or more (for
> 100 base-T Ethernet). The server's microsecond accuracy is largely lost
> in the noise.
>
> The NIC generates an interrupt after the packet is sent which can result
> in a fairly accurate trailing hardstamp. The problem is...the packet is
> already gone and has the wrong transmit timestamp.
>
> Here's my idea:
>
> What if the poll response packet contained a flag or indication of some
> sort which means "this is an approximate transmit timestamp". That
> packet would then be immediately followed by a second response packet
> with a more accurate transmit time. The second packet could be otherwise
> identical to the first, or it could be a new flavor of packet that
> contained only the transmit time (that would save on network bandwidth).
>
> The ntpd process would need to use the receive time of the first packet
> (the one with an approximate tx timestamp) and merge in the following
> accurate tx timestamp before performing the normal processing associated
> with a poll response.
>
> Here are the pros and cons I can think of:
>
> Pros
>
> * Possible accuracy improvement of 1-2 orders of magnitude. I know ntpd
> already does some work to try and filter out network delay variation so
> the improvement might not be a full 2 orders of magnitude.
> * Could potentially be made backwards compatible with ntp 3/4
> protocols
>
> Cons
>
> * Increased network traffic
> * Improvement to that level of accuracy might not be of interest to anyone
> * Could be a fair bit of work for at least a couple of folks
> * I may have (probably have) missed some stuff regarding network behavior
> that would reduce the level of improvement that could be realized.
> * Perhaps this is less of an issue on G-bit Ethernet?
>
> Wondering if anyone thinks this idea is worth pursuing further...?
[ntp:questions] Idea to improve ntpd accuracy
This may or may not be worthwhile, but I thought I'd throw it out there and see what happens:

Recent work testing some microsecond-accurate NTP servers led me to an idea that could improve the accuracy of measurements made by ntpd. These NTP servers have hardware timestamps on receive, but that's not possible on transmit w/o a custom NIC. I've seen this issue discussed before.

The next best thing is to generate the transmit timestamp based on a guess as to how long it takes the NIC to get on the wire and send the packet. That works pretty well as long as there's no other network traffic. In this situation, it is possible to make use of microsecond accuracy in an NTP server.

Now, add some typical network traffic and the time it takes the NIC to get on the wire becomes unpredictable to the tune of 200us or more (for 100 base-T Ethernet). The server's microsecond accuracy is largely lost in the noise.

The NIC generates an interrupt after the packet is sent, which can result in a fairly accurate trailing hardstamp. The problem is...the packet is already gone and has the wrong transmit timestamp.

Here's my idea:

What if the poll response packet contained a flag or indication of some sort which means "this is an approximate transmit timestamp"? That packet would then be immediately followed by a second response packet with a more accurate transmit time. The second packet could be otherwise identical to the first, or it could be a new flavor of packet that contained only the transmit time (that would save on network bandwidth).

The ntpd process would need to use the receive time of the first packet (the one with an approximate tx timestamp) and merge in the following accurate tx timestamp before performing the normal processing associated with a poll response.

Here are the pros and cons I can think of:

Pros

* Possible accuracy improvement of 1-2 orders of magnitude. I know ntpd already does some work to try and filter out network delay variation, so the improvement might not be a full 2 orders of magnitude.
* Could potentially be made backwards compatible with the NTP 3/4 protocols

Cons

* Increased network traffic
* Improvement to that level of accuracy might not be of interest to anyone
* Could be a fair bit of work for at least a couple of folks
* I may have (probably have) missed some stuff regarding network behavior that would reduce the level of improvement that could be realized.
* Perhaps this is less of an issue on G-bit Ethernet?

Wondering if anyone thinks this idea is worth pursuing further...?
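To make the effect of a late transmit timestamp concrete, here is a small Python sketch using the standard NTP on-wire calculations, offset = ((T2 - T1) + (T3 - T4)) / 2 and delay = (T4 - T1) - (T3 - T2). The numbers are made up for illustration: a symmetric 50us path, client and server clocks in perfect agreement, and a NIC that holds the packet for 200us after the transmit timestamp was guessed. `merge_follow_up` is a hypothetical helper standing in for the "merge in the accurate tx timestamp" step; it is not an existing ntpd function.

```python
from typing import NamedTuple, Tuple


class Exchange(NamedTuple):
    """The four timestamps of one client/server exchange, in microseconds."""
    t1: int  # client transmit
    t2: int  # server receive (hardware timestamp)
    t3: int  # server transmit
    t4: int  # client receive


def offset_and_delay(x: Exchange) -> Tuple[float, int]:
    """Standard NTP on-wire offset and round-trip delay."""
    offset = ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2
    delay = (x.t4 - x.t1) - (x.t3 - x.t2)
    return offset, delay


def merge_follow_up(first: Exchange, accurate_t3: int) -> Exchange:
    """Keep the first packet's receive time, swap in the accurate transmit time."""
    return first._replace(t3=accurate_t3)


# The server guessed it would transmit at 70us; the packet actually left at 270us
# and reached the client 50us later, at 320us.
first_response = Exchange(t1=0, t2=50, t3=70, t4=320)
corrected = merge_follow_up(first_response, accurate_t3=270)

print(offset_and_delay(first_response))  # (-100.0, 300)
print(offset_and_delay(corrected))       # (0.0, 100)
```

In this toy case the 200us wait shows up as a 100us offset error and 200us of apparent extra delay, and both disappear once the follow-up timestamp is merged in.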
[ntp:questions] Fwd: Re: ntpd sensitivity to ordering of servers in ntp.conf?
Miroslav,

You are a wise man, sir! That is exactly what was happening.

I tried adding a new first server entry in ntp.conf -- for a non-existent IP on the same subnet. Whatever was going to sleep or timing out during the delay between polls got refreshed when ntpd tried to contact the first (non-existent) server. Now the delay and offset to the two NTP servers agree almost perfectly. I'll need to run this for several hours to be sure but it is probably good to a couple of microseconds or better.

Based on your suggestion, I have to conclude that the cause is probably more closely tied to Linux and/or PC hardware than it is ntpd.

Thanks for the idea!
Re: [ntp:questions] ntpd sensitivity to ordering of servers in ntp.conf?
On Thu, Feb 25, 2016 at 02:49:48AM -0800, Weber wrote:
> ntp.conf specifies both servers with minpoll 4/maxpoll 4. Peer and loop
> statistics are enabled.

> By just changing the order of servers in ntp.conf the delay and offset
> values in peerstats are swapped. Now it is A with 60us delay and B has 85us.
> Similarly, A's offset is now -5us and B is showing +5us.
>
> It appears there is something in ntpd where measurements on server A in
> ntp.conf come out slightly different depending on its ordering in ntp.conf.

When A is specified as first in the config, the interval between polling of A and B will be 1 second and the interval between B and A will be 15 seconds. When you swap the servers, the intervals will be swapped too. I think there could be a lot of things that would happen in 15 seconds, but not in 1 second. Maybe some power saving feature is activated or maybe some cache entry expires.

You could try adding B manually via ntpq -c :config, timing the command so that the polling is exactly between two polls of B, and see what happens with the delays. Or you could run ping against the servers to keep the link "up".

The direction in which the offset changed suggests it's the processing of the server packet that has the extra delay.

--
Miroslav Lichvar
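The 1-second and 15-second gaps described above can be checked with a few lines of Python. This is only a toy model under stated assumptions: both servers are polled every 2^4 = 16 seconds (minpoll 4/maxpoll 4), and each server is polled one second after the one listed before it, as described in the message. The function name is hypothetical and the model ignores any jitter ntpd may add to its poll times.

```python
def idle_before_poll(order, poll_interval=16, stagger=1, rounds=3):
    """Return, per server, how long the link sat idle before that server's poll.

    order: server names in ntp.conf order; the i-th server is assumed to be
    polled i * stagger seconds after the first one in every round.
    """
    events = sorted(
        (r * poll_interval + i * stagger, name)
        for r in range(rounds)
        for i, name in enumerate(order)
    )
    idle = {}
    # Walk consecutive polls; later rounds overwrite earlier ones, so the
    # result reflects the steady-state gap rather than the startup gap.
    for (prev_time, _), (time, name) in zip(events, events[1:]):
        idle[name] = time - prev_time
    return idle


print(idle_before_poll(["A", "B"]))  # {'B': 1, 'A': 15}
print(idle_before_poll(["B", "A"]))  # {'A': 1, 'B': 15}
```

Whichever server is listed first is always polled after a 15-second quiet period, which fits the observation that the larger delay follows the position in ntp.conf rather than the server itself.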
[ntp:questions] ntpd sensitivity to ordering of servers in ntp.conf?
Bored? Need something to do? You could try helping me with this puzzle regarding ntpd. This is a rather long message, so if you're otherwise busy I won't be offended if you skip it.

-- Executive Summary --

The delay and offset values measured by ntpd appear to change slightly when the order of two servers is swapped in ntp.conf.

-- Introduction --

I'm working to verify the performance of two NTP servers I've built, and have run into some curious behavior with ntpd on a current release of Fedora. This is regarding a few tens of microseconds at most and is certainly not a serious problem.

-- The NTP Servers --

The two NTP servers being tested are embedded systems (Arduino-based) and each has an independent GPS reference clock. One uses an HP55300A while the other uses a Trimble timing-specific GPS module (ICM SMT 360). There are hardstamps on received packets and (oscilloscope-calibrated) drivestamps for transmit.

I suspect these servers are accurate to something on the order of a microsecond or so and am trying to get some verification of this at the network level. This estimate of accuracy only applies when the NIC can get onto the wire w/o delay. When there is other network traffic I can routinely measure variable delays on the order of a hundred microseconds and more.

I realize this sort of accuracy is perhaps not all that useful given the variable delays that occur in a typical loaded network. On the other hand, by placing one of these servers on an unused Ethernet port it could be possible to get reference-clock accuracy without a hardware-connected reference clock. And well, like climbing a mountain...just because it's there.

-- Test Setup --

The NTP client is a 32-bit Pentium-class PC (an old one with IDE disks) running Fedora 23. I believe it has kernel discipline and probably the nano kernel but am not positive about that. The kernel build is 4.2.3-300.fc23.i686.

The PC and two NTP servers are connected to a 100 base-T network hub. The hub is not connected to any other networks.

ntp.conf specifies both servers with minpoll 4/maxpoll 4. Peer and loop statistics are enabled.

-- loopstats --

Both offset and frequency offset are very stable in loopstats (as long as the CPU temperature stays constant). Over a 6-hour test run, the mean value of offset was -0.63us and the standard deviation was 5.5us. Frequency jitter hovered around 5ppb with occasional jumps up to about 10ppb. As I understand it, this is about as good as it gets, even with a good local ref clock.

-- The Puzzle --

The curious part is in peerstats. Server A shows a delay averaging around 85us and server B is running around 60us. Offsets are also different, but by about half as much. Offset on A is about +5us and -5us for B.

I was hoping to see closer to identical delays and offsets on the two servers. I'm fairly certain they should be closer than 10us in offset.

I tried the following experiments to track this down:

1) Swap network hub ports between servers A and B. No change.
2) Cycle power on network hub after (1). No change.
3) Re-check transmit timestamp timing with oscilloscope. Looks good.
4) Change the order of the two servers in ntp.conf. Yikes, that's it!

By just changing the order of servers in ntp.conf the delay and offset values in peerstats are swapped. Now it is A with 60us delay and B has 85us. Similarly, A's offset is now -5us and B is showing +5us.

It appears there is something in ntpd where measurements on server A in ntp.conf come out slightly different depending on its ordering in ntp.conf.

So yes, this is really mouse-nuts. It is however setting a bound on how accurate I can conclude the NTP servers are.

Any ideas what might be causing this?

(I have some data plots and could e-mail them if desired)
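Per-server averages like the ones quoted above can be pulled out of a peerstats file with a short script. The Python sketch below assumes the usual peerstats layout (MJD day, seconds past midnight, peer address, status word, then offset, delay, dispersion and jitter in seconds). The sample lines and addresses are invented to resemble the figures in the message; they are not data from the actual test run, and `summarize` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean

# Invented sample in peerstats format; 192.0.2.10 plays server A, 192.0.2.11 server B.
SAMPLE = """\
57443 100.123 192.0.2.10 961a 0.000005100 0.000085200 0.000012000 0.000003000
57443 101.123 192.0.2.11 941a -0.000004900 0.000060100 0.000012000 0.000003000
57443 116.123 192.0.2.10 961a 0.000004900 0.000084800 0.000012000 0.000003000
57443 117.123 192.0.2.11 941a -0.000005100 0.000059900 0.000012000 0.000003000
"""


def summarize(lines):
    """Return {address: (mean offset in us, mean delay in us)}."""
    samples = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        address = fields[2]
        offset_us = float(fields[4]) * 1e6
        delay_us = float(fields[5]) * 1e6
        samples[address].append((offset_us, delay_us))
    return {
        address: (mean(o for o, _ in rows), mean(d for _, d in rows))
        for address, rows in samples.items()
    }


for address, (offset_us, delay_us) in summarize(SAMPLE.splitlines()).items():
    print(f"{address}: mean offset {offset_us:+.1f} us, mean delay {delay_us:.1f} us")
```

Running the same summary before and after swapping the server lines in ntp.conf would show whether the 85us/60us split follows the server or its position in the file.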