Danny Mayer writes:
> On 4/5/2016 3:21 PM, Sharon Goldberg wrote:
> > On Tue, Apr 5, 2016 at 3:00 PM, Danny Mayer <[email protected]> wrote:
> > > On 4/5/2016 2:28 PM, Sharon Goldberg wrote:
> > > > Dear WG,
> > > >
> > > > To follow up on my comments on draft-stenn-ntp-suggest-refid-00
> > > > at the IETF'95 WG meeting just now. The current draft requires
> > > > the use of an extension field. I believe the goals of the draft
> > > > can be accomplished without using an extension field, in a
> > > > backwards compatible fashion.
> > > >
> > > > The goal of the draft is to limit the information exposed by
> > > > the REFID while still preserving robustness to "length-1"
> > > > timing loops where system A takes time from system B, but
> > > > system B takes time from system A.
> > > > This proposal allow system A to limit the info it leaks in its
> > > > refID, without harming any of its legacy clients.
> > > >
> > > > Suppose system A is taking time from system B. Then there are
> > > > two cases:
> > > > 1) If A gets a time query from system B, A puts the IP of B in
> > > > the refID of its response. This way, even a legacy B can tell
> > > > it cannot take time from A because this would cause a timing
> > > > loop.
> > > >
> > > > 2) If A gets a time query from system C, A puts a "nonsense"
> > > > value in its refID. Even a legacy C can see that its IP is not
> > > > in the refID, and so it is allowed to take time from A.
> > > >
> > > > One question is what this "nonsense" value should be. I think
> > > > it should be a fixed value. For example 0.0.0.0. We would not
> > > > want a randomly-chosen value since this might collide with
> > > > actual IP addresses on the network.
> > > >
> > > > Thanks,
> > > > Sharon
> > >
> > > The problem that I was alluding to in the jabber room is this.
> > > I'll keep the problem simple and to one example.
> > >
> > > A takes its time from B over an IPv4 address. A then gets a time
> > > query from C over IPv6. How does it know if B and C are the same
> > > system or different system? A has no information about that.
> > >
> > > Is this clearer?
> >
> > Yes, it's clear. How does the current ntpd implementation deal with
> > this problem?
>
> Currently it doesn't. One of my proposals is for each system to
> generate a RefID and use that for all interfaces. I'm not sure if
> that works for all cases or causes other issues.
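
To make the two cases in Sharon's proposal concrete, here is a minimal
Python sketch of how a server might pick the refID it puts in a
response, and how a host with several addresses might test a refID
against all of them. The function names (refid_for_response,
would_loop), the example addresses, and the plain address comparison
are assumptions made for illustration only; this is not how ntpd
handles refids.

```python
import ipaddress

# Fixed "nonsense" value suggested in the thread.
NONSENSE_REFID = ipaddress.IPv4Address("0.0.0.0")


def refid_for_response(upstream, client):
    """Pick the refID A sends, given A's upstream and the querying client.

    Case 1: the client is the upstream itself -> reveal the upstream address.
    Case 2: any other client -> send the fixed nonsense value.
    (Hypothetical helper; addresses are compared literally.)
    """
    upstream = ipaddress.ip_address(upstream)
    client = ipaddress.ip_address(client)
    if client == upstream:
        return upstream
    return NONSENSE_REFID


def would_loop(peer_refid, local_addresses):
    """Return True if the peer's refID matches any of our own addresses."""
    refid = ipaddress.ip_address(peer_refid)
    return any(refid == ipaddress.ip_address(a) for a in local_addresses)
```

Because the comparison is on literal addresses, a query that arrives
over IPv6 never matches an IPv4 upstream: calling
refid_for_response("192.0.2.1", "2001:db8::1") yields 0.0.0.0 even if
both addresses belong to the same system, which is the case Danny
describes.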

It *hasn't* been an issue. It would be easy enough to address it if
somebody felt there was a need to, however.

The point is that with your proposal, Sharon, I'm not seeing that it is
possible to address this case. That is separate from whether or not
it's a problem that needs to be addressed, now or in the "near" future.

It's clear that if A has multiple IPs on it (very common now, not
common 20 years ago) that A could check all of its IPs when it saw B's
refid and decide if any of them meant B was sync'd to A.

In reality, we have only seen that if A sends a packet to B, B responds
to that same address on that same IP. We have had cases where that
response has come over a different interface.

There are other possibilities, though. Here are more scenarios.

A VM host maintains the primary clock, and "guests" use that clock. In
this case, if A and B are on the same VM host, their underlying system
clock is actually the same one. And that assumes one (common) scenario
- the VMs are likely set up so that all the VMs have
tightly-synchronized time. There's another use case I've been looking
for - a VM setup where I can configure each VM to have a clock that
behaves independently, so I can test wider behaviors more easily.

Is it sufficient to check for loops of degree 1?

Exactly what problems are we solving here?

What about the situation where a single refclock is directly feeding
multiple servers? If A and B are getting their time from the same
refclock, should that mean A and B should not fall back on each other
if one of them loses signal to their shared refclock?

If A and B are talking to the same type of refclock is that something
that should be addressed? There have been cases where broken firmware
has caused a family of refclocks to all take simultaneous leaps to the
same spot in the weeds.

If A and B are both using GPS (for example) and there is a system
glitch where GPS time is affected, is that also a timing loop that we'd
want to avoid?
This happened in Jan 2016.

What about a group of systems that share a PPS signal? The PPS signal
might become untrustworthy, or the "source of numbered seconds" might
break.

Many of these issues are currently(!) insignificant in a LAN setting
where we're looking at timing/frequency sync to around the millisecond
level (roughly). I don't *know*, but I'd bet they are significant when
trying to get to the microsecond level, let alone the nanosecond level.

Hello PTP. Hello RADclock and the work PHK is doing with the
algorithms he's using on Ntimed. Once we learn more about these, we
need to make sure these different choices will "play nice" with each
other.

Similarly, chrony uses algorithms that are slightly different from the
ones NTP uses. We haven't seen studies of any potential loop
instabilities between these two approaches, either.

I mention this because 20 years ago we thought we were doing well
getting clocks to sync to within a tenth of a second. Now, getting
sub-millisecond sync on a LAN is common. I've spoken with folks who
told me that with some care and support, they are getting sync levels
with NTP that are very close to what they get with PTP - down to the
range of nanoseconds.

Where will we be 10 years from now?

I recall PHK saying that with PTP (or PTP-like stuff), he was able to
get offset and frequency sync on cold start in less than a second. He
was doing this with a poll interval of about -9, or about 500
polls/second.

I know DLM was solving a real and evidenced problem with the refid and
loop detection. It's looking like we need to take a stronger look at
this to decide if those problems are still significant, and if
different problems are becoming significant.
-- 
Harlan Stenn <[email protected]>
http://networktimefoundation.org - be a member!

_______________________________________________
TICTOC mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/tictoc
