I wrote:
> If I get some more time over the week,
> I may select and dissect pieces of the code.
Here's an example of how J's clarity and suggestivity work together to
improve programs. I was looking at

   category =: 4 <. 1 + 35 30 20 I. (* 2&<:)

and thinking: that seems like a lot of magic numbers. I'd like to abstract
those out and keep them as data (so they stand out better and are easier to
maintain). My first thought was to name the vector 35 30 20 and put it in
the data section. But then you still have that magic 4 in there, and naming
that separately seems messy and error-prone (the 4 and the vector could
easily get separated).

But before I even completed that thought, the juxtaposition of the two nouns
made the origin of the magic 4 immediately clear: it's one more than the
length of the levels vector! It's the default because it's the bottom
category. So the two really are inextricable, and I don't have to worry
about separating them. If I named the levels vector, e.g. LVLS =: 35 30 20,
then I could just express the 4 as 1+#LVLS.
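
As a minimal sketch of that step (the name `bottom` is only an illustration and not part of the original program), the thresholds could live in the data section and the 4 could be derived from them:

```j
NB. Hypothetical data section: category thresholds, highest first.
LVLS =: 35 30 20

NB. The bottom category is one more than the number of thresholds.
bottom =: 1 + # LVLS                                  NB. 4

NB. The original verb with the magic 4 derived from LVLS.
NB. Left argument: total score. Right argument: number of races.
category =: (1 + # LVLS) <. 1 + LVLS I. (* 2&<:)

40 32 25 10 category 2 2 2 2                          NB. 1 2 3 4
```

Because LVLS is a noun, its value is fixed into the verb when `category` is defined, so LVLS has to be assigned first.
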
In fact, we could factor out the 1+ in 'category' and the 1+ in '1+#LVLS'
into a single increment, thus:

   category =: 1 + (#LVLS) <. LVLS I. (* 2&<:)

But, having written this, the notation immediately makes another suggestion:
the whole <. thing is superfluous. For any value less than 20, the lowest
in LVLS, the dyad I. will automatically return 3, which is #LVLS and the
largest index it can ever return (and the 1+ will make it 4). The bottom
category becomes the default perforce! The array-orientation of I. has
smoothed out an edge case, without us even having to think about it (a
not-uncommon experience in J).
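
A quick way to see this, assuming LVLS is defined as suggested earlier, is to apply I. to a few sample scores (the scores are made up for illustration and avoid the thresholds themselves):

```j
LVLS =: 35 30 20                  NB. thresholds, highest first

NB. x I. y locates each y among the intervals bounded by the sorted list x.
NB. With a descending x, anything below the last item gets index #x.
LVLS I. 40 32 25 19 0             NB. 0 1 2 3 3

NB. So the clamp changes nothing: the index is already at most #LVLS.
(#LVLS) <. LVLS I. 40 32 25 19 0  NB. 0 1 2 3 3
```
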
OK, so now we can write:

   category =: 1 + LVLS I. (* 2&<:)

Pretty good! But look, there's still one more magic number. What's that (*
2&<:) all about? Well, if you remember, this verb calculates the rider's
category from his total score (summed across all his races) and the number
of races he's ridden. So total_score (* 2&<:) number_races is total_score *
(2 <: number_races), where the comparison yields 1 or 0. That just forces
any rider's score to zero, unless he's
biked in at least two races. If he biked in just one race -- no matter how
high his score -- then this function would reduce his effective score to
zero, which is less than twenty, which (through I.'s array-orientation) puts
him in category 4.
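
The hook can be tried on its own. In this small sketch the name `eff` (for "effective score") and the sample scores are hypothetical:

```j
NB. x (* 2&<:) y  is  x * (2 <: y): the score survives only with 2+ races.
eff =: (* 2&<:)

50 eff 1              NB. 0    one race: score forced to zero
50 eff 2              NB. 50   two races: score kept
8 25 40 eff 1 3 5     NB. 0 25 40
```
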
But, looking at the scoring tables preceding the logic, it becomes clear
that the maximum score for any one race (coming in first in a race with 50
or more riders) is 10. Which is also less than 20. Which means it's not
necessary to force single-race scores to zero; they're going to end up in
category 4 anyway. So the (* 2&<:), too, is superfluous, and we can reduce
the category function to its absolute essentials:

   category =: 1 + LVLS I. [

Here, we've successfully removed all the undocumented magic numbers, and
allowed more of the solution to be managed as data, rather than code. It
will be easier to maintain and extend.
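
To illustrate the "data rather than code" point, here is a sketch of the final verb in use, followed by a purely hypothetical extension: a fifth category below 10 points, which is an assumption for illustration and not something from the original program.

```j
LVLS =: 35 30 20              NB. the data: thresholds, highest first
category =: 1 + LVLS I. [     NB. the code: a step function of total score

NB. Total score on the left, race count on the right (now ignored).
40 32 25 8 category 5 4 3 1   NB. 1 2 3 4

NB. Assumed extension: adding a category changes only the data.
LVLS5 =: 35 30 20 10
category5 =: 1 + LVLS5 I. [
40 32 25 15 8 category5 5 4 3 2 1   NB. 1 2 3 4 5
```

The verb text is identical in both definitions; only the noun it is built from differs.
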
But something more important happened. Look at the final definition of
category again. Notice: there is no mention of race count. Which tells us
that the category, the heart of our algorithm, is a simple step function as
the total score passes each threshold of LVLS. We have not only improved
the solution, we have clarified the problem.
You'll notice this analysis was entirely intellectual, much like reducing
an algebraic expression. J's amenability to algebraic manipulation, its
abstraction over arrays, its well-designed primitives, and its suggestive
syntax have combined to allow us to greatly simplify our program, without
even starting the interpreter!*
-Dan
* "Beware of bugs in the above code; I have only proved it correct, not
tried it."
-- Donald Knuth, http://www-cs-faculty.stanford.edu/~uno/faq.html#gzip
[Of course, J makes it easy to test theories like this.]
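
One possible test, sketched under the assumption stated above that a single race earns at most 10 points (the names `category_old`, `category_new`, and `check` are hypothetical):

```j
LVLS =: 35 30 20
category_old =: 4 <. 1 + 35 30 20 I. (* 2&<:)   NB. the original definition
category_new =: 1 + LVLS I. [                    NB. the reduced definition

NB. For y races, compare both verbs on every total from 0 to 10*y.
check =: 3 : 0
  scores =. i. 1 + 10 * y
  (scores category_old y) -: scores category_new y
)

check"0 ] 1 2 3 4 5       NB. 1 1 1 1 1

NB. Outside the assumption the two differ: 25 points from a single race.
25 category_old 1         NB. 4
25 category_new 1         NB. 3
```

The last two lines show that the simplification leans on the scoring cap: if one race could ever earn 20 points or more, the race count would matter again.
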