Hal Murray <halmur...@sonic.net>:
> > Well, first, the historical target for accuracy of WAN time service
> > is more than an order of magnitude higher than 1ms.
>
> Time marches on.  We need to do better today, much better.
>
> NTP is used on LANs.
Then we'll need to go to watching for GC pauses and skipping samples
that might have been distorted by them.

> > turning GC off
>
> Is that lightweight or heavyweight?
>
> How does that interact with threads?

It's a fast operation, if that's what you mean.  The way Go GC works
requires that there is only one GC-enable flag, not one per thread.
The flag tells the Go runtime whether or not to GC when the normal
memory-usage threshold is reached.

> What happens if there are lots of threads and they are all turning
> it off/on very frequently and probably overlapping?

That flag has to be protected by a mutex, and you have whatever value
happened to be set last regardless of how many threads are running.
If we think contention for that lock is going to be an issue, there's
a pretty standard and simple way of dealing with it using an auxiliary
semaphore.

> I'm assuming the mainline server path won't require any allocations
> or frees.  Total CPU time to process a simple request is under 10
> microseconds.

The main source of memory churn is going to be allocations for
incoming packets, and deallocations when they're no longer referenced
and get GCed.  Allocations are fast.  GC is slow, but isn't performed
very often.

> Is there a subset of Go that doesn't use GC?  Or something like that.

Not really.  If you want to not use GC, you turn GC off.  Then
everything works as it normally does, but your memory usage grows
without bound until you re-enable GC, which could trigger an immediate
GC sweep.

I analyzed this years ago and discovered two kinds of code span where
unexpected latency spikes could mess things up.

One is right around where the adjtimex call or equivalent is done.
That's a very narrow code section that's going to run in near constant
time and not do any allocations; we can guard it just by turning GC
off at the start of the span and on at the end, so that any other
thread that *is* doing allocations cannot induce a latency spike
during the critical section.
The other is during sample collection from local refclocks.  That's a
little trickier, because the read from the device is a blocking
operation that can and will do memory allocation.  I think what we
have to do in that case is take a timestamp before the read, then
afterwards check to see if there was a GC between that timestamp and
now, and if so discard the sample.

Outside those places the code is not really stall-sensitive, because
all the data flying around has enough timestamping.

With these mitigation measures I think performance can be expected to
be C-like, except that once in a great while a GC stop will be
detected to have occurred during refclock sampling and cause that
sample to get tossed out.  I say "once in a great while" because a
program with ntpd's memory-usage pattern is not going to trigger GCs
very often.  Most of the passes through critical regions won't collide
with a GC latency spike.  We can log these exceptions to check, of
course.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
_______________________________________________
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel