On Thu, Aug 11, 2011 at 04:00:21PM -0300, Jecel Assumpcao Jr. wrote:
> The original Smalltalk-80 from Xerox used CR as its line separation
> character, but the really big external influence on Apple was UCSD
> Pascal which shared that convention. Apple, however, (along with
> Commodore and Tandy/Radio Shack - the big 3 from 1977) had already
> adopted this convention from the start.

Fascinating! Thank you!  I've been wondering the same thing for years.

> DEC had adopted the LF+CR style required by teletype terminals and this
> was copied in CP/M, then MS DOS and finally Windows.

I know that technically + is commutative, but it's worth pointing out that DEC,
Intergalactic Digital Research, and Microsoft all used a CR followed by an LF,
not an LF followed by a CR, although either order works on a teletype, or in
the terminal emulator I'm using now.
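
To make the four conventions concrete, here is a minimal Python sketch that
splits text on any of them.  The name `split_lines` and the rule of trying the
two-character terminators before the one-character ones are my own choices for
illustration, not anything these systems specified.

```python
import re

# The conventions mentioned in this thread, two-character forms first:
#   CR LF  (DEC, CP/M, MS-DOS, Windows)
#   LF CR  (BBC Micro)
#   CR     (Apple, Commodore, Tandy)
#   LF     (Unix)
_NEWLINE = re.compile(r"\r\n|\n\r|\r|\n")

def split_lines(text):
    """Split text on any of the four terminators, dropping the terminators."""
    lines = _NEWLINE.split(text)
    # A terminator at the very end leaves an empty final piece; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

This is only a guess at the author's intent for any given file: a file that
mixes conventions is genuinely ambiguous, and the sketch simply takes the
leftmost match at each position.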

> http://en.wikipedia.org/wiki/Newline

An excellent article, thank you. It points out that the BBC Micro *did* use an
LF followed by a CR.

In addition to the various eccentric systems listed on the WP page, ASCII
itself has a four- or five-level hierarchy in its control characters (FS, GS,
RS, and US, and arguably EOT), and ASCII-1963 (the version in which ^ is
uparrow and _ is leftarrow!) had eight or nine.  I highly recommend Tom
Jennings's excellent history [An annotated history of some character codes or
ASCII: American Standard Code for Information Infiltration] [0].

[0]: http://www.wps.com/projects/codes/

The [Pick][] operating system (devoted to business data processing on
minicomputers, mostly) is the only thing I know of that used such a hierarchy
of separator characters in
its normal file format: a Pick "file" is like a Unix "directory", containing
"records" that are like Unix "files".  Within each "record" (in 1970s versions
of Pick, limited to 32K) "items" are separated by "item marks" (character 255);
each item consists of "attributes" separated by "attribute marks" (character
254), and so on with "values" (253) and "subvalues" (252).

[Pick]: http://jes.com/pb/pb_wp1.html
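
As a rough illustration of how little code such a format needs, here is a
Python sketch that splits a record on the four mark characters described above,
one level at a time.  The names `parse` and `unparse` are hypothetical, and the
layout follows my description above rather than any Pick documentation.

```python
# Mark characters as described above, outermost level first:
# item (255), attribute (254), value (253), subvalue (252).
MARKS = (255, 254, 253, 252)

def parse(record, marks=MARKS):
    """Turn a bytes record into nested lists, one nesting level per mark."""
    if not marks:
        return record
    separator = bytes([marks[0]])
    return [parse(part, marks[1:]) for part in record.split(separator)]

def unparse(tree, marks=MARKS):
    """Inverse of parse: join the nested lists back into one bytes record."""
    if not marks:
        return tree
    separator = bytes([marks[0]])
    return separator.join(unparse(child, marks[1:]) for child in tree)
```

For example, `parse(b"a\xfcb\xfdc\xfed\xffe")` gives two items; the first has
two attributes, and the first of those has two values, the first of which has
the two subvalues `b"a"` and `b"b"`.  The price of the simplicity is the usual
one: data that happens to contain bytes 252 through 255 cannot be represented.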

Getting back to the reinvention of computing:

I've never used Pick, although it has a sort of cult following.  I've noticed,
though, that very simple expedients like this data representation often allow
simple things to be done surprisingly simply.  You can grep and cat and diff
and regexp-extract text files in useful ways surprisingly often.  (This was the
"Desperate Perl Programmer" desideratum for XML, although I think XML pretty
much failed to deliver on that promise.)  I think this is what Larry Wall meant
by "whipupitude".

The tradeoff, of course, is that it's terrifically difficult to write reliable
programs that work with Unix text file formats, because it's easy to do "the
wrong thing" and have it work by accident on your first thousand tests.  Shell
scripts' notorious intolerance for filenames with whitespace is probably the
best example of this kind of thing.  The same looseness cuts both ways, though:
if I grep my access_log, I don't have to specify which field I'm interested in,
at the expense of possible false positives.
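
Here is a small Python sketch of that false-positive problem.  The log lines
are made up and use a simplified whitespace-separated layout (address, method,
path, status), not the real access_log format; `grep` and `grep_field` are
hypothetical helpers, not the Unix tool.

```python
# Made-up log lines: address, method, path, status.
LOG = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /200.html 404",
]

def grep(pattern, lines):
    """Field-blind search: keep every line containing the pattern anywhere."""
    return [line for line in lines if pattern in line]

def grep_field(pattern, lines, field):
    """Field-aware search: keep lines whose numbered field equals the pattern."""
    return [line for line in lines if line.split()[field] == pattern]
```

Searching for status 200 with `grep("200", LOG)` returns both lines, because
the second one has "200" in its path; `grep_field("200", LOG, 3)` returns only
the first.  Note that `grep_field` has the shell-script disease itself: it
splits on whitespace, so a path containing a space would shift the fields and
it would quietly look at the wrong one.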

Is it possible to get the whipupitude without the error-proneness, and without
sacrificing expressivity?  

Pick and Unix show the merit of Alan Perlis's epigram, although of course
Perlis was not particularly fond of them:

> It is better to have 100 functions operate on one data structure than 10
> functions on 10 data structures.

Today we might instead say: it is better to have 100 classes that are clients
of one protocol than 10 classes as clients of each of 10 protocols.  (And that
is the reason I am writing this email in vim: because I can use it via ssh
inside of screen, and there are thousands of programs that speak the VT100
protocol, all of which can be successfully remoted with ssh and multiplexed and
fault-tolerancified with screen.)

(I wish I had a neat conclusion to put here. Thoughts?)

Kragen

_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
