> # cd /usr/src/usr.sbin/ntpd/
> # for i in $(find . -name "*.[ch]"); do cat $i >> /root/allcode; done
> # egrep -v '[:blank:]*/?\*' /root/allcode | grep -v "^ *$" | wc -l
>     2898
....
> > $ for i in $(find . -name "*.[ch]"); do cat $i >> allcode; done
> > $ egrep -v '[:blank:]*/?\*' allcode | grep -v "^ *$" | wc -l
> >   192870
> >
> > This is ntp-4.2.8.  A rough estimate but close enough if we are comparing to
> > a known solution that is <5000.

That is a factor of 66x.  Shocking -- it is larger than I remembered.

So, what does that extra source code bring to the table?

My perspective is that big code brings holes because fewer people want
to read, audit, and maintain the code.

But surely there must be some benefit.  It has been claimed NTPD is
more accurate.  For that extra size, is it 66x more accurate?  That's
silly.  Even if it is a little bit more accurate, what does 'more
accurate' mean when edge devices that need better than ~100ms
accuracy are exceedingly rare?

The devices which care about accuracy are doing time distribution, not
time reception at the network edge.  And therein lies the problem, I
think.  Much of that code is likely for special purposes in an
ecosystem which includes products, relationships with companies
building products, special modes, special features, laboratory code,
etc.  The result was that this team built an NTP daemon for all
purposes great and small:

    One NTPD to rule them all,
    One NTPD to find them,
    One NTPD to bring them all
    and on the internet hole them

I am going to guess that the serious time-distribution people are in
control of this software stack, but 99% of their user base is on edge
devices where the extra complexity is unwarranted.  (Is it time for a
coup?)

I think the NTP people don't even know their own codebase because it
is too large and full of legacy code.  They don't know what parts of
it do, but they don't want to "upset their base" by removing any of
it.

In another project we recently reviewed, there was code all over the
place for some rotten platforms -- barely limping along -- but the
existence and structure of that code was cumbersome and its effects
were detrimental to all platforms.  Developers trying to change the
code make mistakes -- or worse, they get demoralized and stop trying.
Such codebases purport to be about agility, but as they chase that
goal, which is increasingly less relevant, they hurt the most
important goals.  You cannot properly serve 1995 and 2015.

Back to NTPD, I suspect most of the people who struggled through it in
the last decade have been looking for holes to exploit.  It does not
seem healthy.  Healthy projects struggle with the effort of
refactoring their code.  Unhealthy projects sit on their past results,
and the future catches up.

Certainly the codebase was also being run by strong-minded people;
I know the type.

The NTP codebase is larger than the SSH codebase.

Explain that, and then explain why no one visibly called them on that.


I recall the conversations I had with the NTP team around 10 years
ago.  They did not want help auditing their code.  It was sad, and I
found a more receptive audience in Henning.  Once he wrote openntpd,
we had to face public attacks from NTP team members.  Apparently the
openntpd math was bad and we would break internet time.  The internet
still seems fine, except that an ntp client spun up with 100 peers
sees a fair number of ntp time-distribution points that are down...


I am glad to hear NTPD is being rewritten, because over time this
will bring safety back into focus.  PHK, I hope you don't just
focus on the math.  Privsep it, please, but ensure DNS refresh is
possible in the architecture you choose.  There is a time-tested model
which does not require the OS to have this or that non-standard
security feature; we all know we can't mandate use of those features,
which barely help anyway.  You rely on privsep every time you log in to
another machine, so consider it.
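
For reference, below is a minimal sketch of that time-tested model --
not openntpd's actual code; the "_sketch" user, the pool hostname, and
the one-request-per-message protocol are illustrative assumptions.  A
privileged parent keeps the ability to resolve DNS, while a chrooted,
unprivileged child does the network-facing work and asks the parent
over a socketpair whenever it needs a hostname re-resolved.

/*
 * Privsep sketch: a privileged parent answers DNS lookups for an
 * unprivileged, chrooted child over a socketpair.  Must be started
 * as root; the "_sketch" user is an assumed unprivileged account.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>

#include <err.h>
#include <grp.h>
#include <netdb.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define HOSTLEN 256

/* Parent: answer "resolve this name" requests until the child exits. */
static void
parent_loop(int fd)
{
	char host[HOSTLEN], addr[NI_MAXHOST];
	struct addrinfo hints, *res;
	ssize_t n;

	while ((n = read(fd, host, sizeof(host) - 1)) > 0) {
		host[n] = '\0';
		addr[0] = '\0';
		memset(&hints, 0, sizeof(hints));
		hints.ai_family = AF_UNSPEC;
		hints.ai_socktype = SOCK_DGRAM;	/* NTP is UDP */
		if (getaddrinfo(host, "123", &hints, &res) == 0) {
			getnameinfo(res->ai_addr, res->ai_addrlen,
			    addr, sizeof(addr), NULL, 0, NI_NUMERICHOST);
			freeaddrinfo(res);
		}
		/* Reply with a numeric address (or "" on failure). */
		write(fd, addr, strlen(addr) + 1);
	}
}

/* Child: chroot, drop privileges, then ask the parent for a refresh. */
static void
child_loop(int fd, struct passwd *pw)
{
	const char *host = "pool.ntp.org";	/* assumed example pool */
	char addr[NI_MAXHOST];
	ssize_t n;

	if (chroot("/var/empty") == -1 || chdir("/") == -1)
		err(1, "chroot");
	if (setgroups(1, &pw->pw_gid) == -1 ||
	    setgid(pw->pw_gid) == -1 || setuid(pw->pw_uid) == -1)
		err(1, "drop privileges");

	/* No direct DNS is possible from here on; ask the parent. */
	write(fd, host, strlen(host));
	if ((n = read(fd, addr, sizeof(addr) - 1)) > 0) {
		addr[n] = '\0';
		printf("child (uid %d): %s -> %s\n",
		    (int)getuid(), host, addr);
	}
}

int
main(void)
{
	struct passwd *pw;
	int sp[2];
	pid_t pid;

	if ((pw = getpwnam("_sketch")) == NULL)
		errx(1, "user \"_sketch\" does not exist");
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) == -1)
		err(1, "socketpair");
	if ((pid = fork()) == -1)
		err(1, "fork");
	if (pid == 0) {				/* child */
		close(sp[0]);
		child_loop(sp[1], pw);
		exit(0);
	}
	close(sp[1]);				/* parent */
	parent_loop(sp[0]);
	waitpid(pid, NULL, 0);
	return 0;
}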



Very hard to push pressure upstream.
