Quoting "Eric S. Raymond" <[email protected]>:
I've spent the last week reading code and preparing for a serious
effort to write logfile visualization tools for NTPsec.

There are at least two good reasons to do this, one retrospective and
one prospective.  The retrospective one is that the stats and
data-reduction tools now in the distribution are a huge mess. They're
archaic, often embodying assumptions that have long since passed their
sell-by date (one pair of tools relies, for example, on mode 7, which
we've eliminated).  They're poorly documented or not documented at
all. They're written in Perl, which is a serious maintainability
problem. The whole area cries to be cleaned up - or better yet, nuked
and replaced with better code.

The prospective reason is that I need a way to make sense out of my test
farm data.  I want to be able to answer a bunch of questions, beginning with
"How important are check servers to a machine with a GPS?"

One of my NTP modules has an annoying habit of drifting by multiple milliseconds while still producing a PPS signal (which is odd, since it was claiming unlocked status for multiple hours). I think it was physically damaged in shipment. I don't use that module anymore, but it was useful to have other servers configured to verify what was going on.

Under the right conditions, LAN stratum 1 sources can measure offset wander in the tens of microseconds. For example: https://dan.drown.org/rpi/pi2.html

The path forward that I'm considering is a Python translation of the
NTP branch of David Drown's chrony-graph software. It makes beautiful and
interesting visualizations, embodying a lot of domain knowledge about
which statistics and relationships are interesting.  And of course, that last
part is where my own knowledge is weakest. Co-opting his work will let me
concentrate on the software-engineering aspect of the problem.

My first name is Daniel. David is my dad's name by random chance. So unless he wrote NTP visualization software... :)

I'm thinking Python translation for two reasons.  One is our general
Python-and-sh policy for scripting, to reduce maintenance complexity
down the road.

Another is that, as Gary Miller has pointed out, ddrown's collection of
shell scripts and Perl has terrible locality.  Gary says he can see in
his graphs artifacts from chrony-graph's disk overhead, and I have no
reason to disbelieve that. Gary suggests that a symbiont daemon, keeping
intermediate data in memory until the final graphs need to be produced,
would produce less noise.

Actually, I wouldn't be surprised if those artifacts came from processor activity rather than disk activity.

On the Intel machine where I generate all my graphs, the time spent breaks down like this:

1. bin/run (excluding bin/plot), log filtering/processing = ~2 seconds
2. bin/plot = ~9 seconds
2a. bin/plot - just calls to bin/percentile and bin/histogram (perl) = ~2 seconds
2b. bin/plot - just calls to gnuplot = ~7 seconds
3. bin/copy-to-website, copying html/png to remote system = ~1 second

total script time:
real    0m12.166s
user    0m8.503s
sys     0m2.039s

Disk activity during this time:

0 read operations (everything came from cache)
144 write operations totaling 16MB taking 136ms
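
As a rough cross-check of the processor-versus-disk question, the figures above can be compared directly. This is only arithmetic on the numbers quoted in this message; it assumes the 136ms write time counts fully against wall-clock time, which may overstate the disk's share if the writes were asynchronous.

```python
# Figures copied from the timing and disk numbers quoted above.
real_s, user_s, sys_s = 12.166, 8.503, 2.039
disk_write_s = 0.136  # 144 write operations, 16MB

cpu_s = user_s + sys_s            # time the CPU was busy on the scripts
cpu_share = cpu_s / real_s        # fraction of wall clock spent on CPU
disk_share = disk_write_s / real_s

print(f"CPU time:  {cpu_s:.3f}s ({cpu_share:.1%} of wall clock)")
print(f"Disk time: {disk_write_s:.3f}s ({disk_share:.1%} of wall clock)")
```

By this estimate the CPU is busy for most of the run, while disk writes account for roughly one percent of it.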

These numbers are going to be much slower on a Raspberry Pi, but the run shouldn't have a drastic impact on the system when it happens only once an hour.


I experimented a bit to see if I could speed this up. The biggest win was limiting the output of the bin/histogram program. With that change, gnuplot is much faster (and the temporary data file loopstats.history is much smaller):

2. bin/plot = ~3 seconds
2a. bin/plot - just calls to bin/percentile and bin/histogram (perl) = ~1 second
2b. bin/plot - just calls to gnuplot = ~2 seconds

total script time:
real    0m5.984s
user    0m3.797s
sys     0m0.715s
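
The message does not say how the output of bin/histogram was limited. One plausible way to bound it, sketched here in Python under that assumption, is to count samples in fixed-width bins and emit one row per non-empty bin, so the row count depends on the spread of the data rather than the number of samples. The function name, the bin width, and the offsets are hypothetical.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per fixed-width bin; return sorted (bin_start, count) rows."""
    counts = Counter()
    for v in values:
        # Floor division maps each value to the index of its bin
        counts[int(v // bin_width)] += 1
    return [(idx * bin_width, n) for idx, n in sorted(counts.items())]


# Hypothetical offsets in microseconds
offsets_us = [-12.4, -3.1, -2.9, 0.2, 0.4, 1.7, 3.3, 11.8]
rows = histogram(offsets_us, bin_width=5.0)

# Tab-separated rows, a layout gnuplot can read as two columns
for start, count in rows:
    print(f"{start:g}\t{count}")
```

A wider bin gives gnuplot fewer rows to draw at the cost of a coarser histogram.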


So, translate chrony-graph to Python.  But this would leave us with
a coordination problem. It means either ddrown has to be prepared to
let the Python version be his new mainline, or we have to cross-port
all his improvements after the fork.

David (*Daniel), do you have any suggestions for making this less painful?

I don't see a compelling reason to switch to Python. I guess I don't see the pain points.