To do large code changes correctly, factor them into a series of
smaller steps such that each revision has a well-defined and provable
relationship to the last.
(This is the closest I've ever come to a 1-sentence answer to the question
"How the *fsck* do you manage to code with such ridiculously high speed
and low defect frequency? I was asked this yet again recently, and trying to
translate the general principle into actionable advice has been on my
mind. I have two particular NTPsec contributors in mind...)
So here's a case study, and maybe your chance to catch me in a mistake.
NTP needs a 64-bit scalar type for calendar calculations; what it
actually wants is 32 bits of seconds since a far-past epoch and 32
bits of fractional-second precision, which you can think of as a
counter for units of seconds * 1e-32. (The details are a little
messier than this, but never mind that for now.)
Consequently, one of the archaisms in the NTP code is an internal type
called vint64. It dates from the era of 32-bit machines (roughly
1977 to 2008). In those days you couldn't assume your C compiler had
int64_t or uint64_t (64-bit integer and unsigned-integer types). Even
after the 64-bit hardware transition, it was some years before you could safely
assume that compilers for the remaining 32-bit machines (like today's
Raspberry Pis) would support int64_t/uint64_t.
Thus, a vint64 is an NTP structure wrapping 2 32-bit integers. It
comes with a bunch of small functions that do 64-bit scalar arithmetic
using it. Also, sadly, there was a lot of code using it that didn't go
through the functional interface, instead exposing the guts of the
vint64 structure in unclean ways.
This is, for several reasons, an obvious cleanup target. Today in
2016 we *can* assume that all compilers of interest to us have 64-bit
scalars. In fact the NTP code itself has long assumed this, though
the assumption is so well-hidden in the ISC library off to the side
that many people who have worked in the main codebase probably do
not know it's there.
If all the vint64s in NTP became typedefs to a scalar 64-bit type, we
could use native machine operations in most cases and replace a lot of
function calls and ugly exposed guts with C's arithmetic operators.
The result would be more readable, less bulky, and more efficient. In
this case we'd only pare away about 300LOC, but relentless pursuit
of such small improvements adds up to large ones.
I tripped over this yet again a day and a half ago while working on
TESTFRAME and decided to fix it. (Fixing it will make TESTFRAME just
a tad easier.)
The stupid way to do it would have been to try to go from vint64 to
int64_t/uint64_t in one fell swoop. NSF and LF didn't engage me
to be that stupid.
Quoting myself: "A series of smaller steps such that each revision has
a well-defined and provable relationship to the last."
Generally, in cases like this, the thing to do is separate changing
the interface from changing the implementation. So:
1. First, encapsulate vint64 into an abstract data type (ADT) with an
entirely functional interface - un-expose the guts.
2. Then, change the implementation (struct to scalar), changing the
ADT methods *without disturbing any of the outside calls to them* - if
you have to do the latter, you failed step 1 and have to clean up your
abstract data type.
3. Finally, hand-expand the function calls to native C scalar
operations. Now you no longer have an ADT, but that's OK; it
was scaffolding. You knew you were going to discard it.
The goal is that at each step it should be possible, and relatively
easy to eyeball-check that the transformation you did is correct.
Helps a lot to have unit tests for the code you're modifying - then,
one of your checks is that the unit tests don't go sproing at any
step. If you don't have unit tests, *write them*. They'll save your
fallible ass. The better your unit tests are, the more time and pain
you'll save yourself in the long run.
OK, so here's you chance to catch me in a mistake.
That is the diff where I pull all the vint64 guts exposure into a ADT (done
with macro calls, not true functions, but that's a C implementation
Can you find an error in this diff? If you decide not, how did it
convince you? What properties of the diff are important?
(Don't pass over that last question lightly. It's central.)
If you're feeling brave, try step 2. Start with 'typedef uint64_t vint4;',
replacing the structure definition, and rewrite the ten macros near
the beginning of the diff. (Hint: you'll need two sets of them.)
Word to the newbies: *this is how it's done*. Train your brain so that
you analyze programming this way - mostly smooth sequences of
refactoring steps with only occasional crisis points where you add a
feature or change an assumption.
When you can do this at a microlevel, with code, you are inhabiting the
mindset of a master programmer. When you can do it with larger design
elements - data structures and entire code subsystems - you are
getting inside system-architect skills.
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
What is a magician but a practicing theorist?
-- Obi-Wan Kenobi, 'Return of the Jedi'
devel mailing list