On Tue, Feb 03, 2004 at 01:33:56PM -0500, Henry Spencer wrote:
: On Mon, 2 Feb 2004, srintuar wrote:
: > Certainly, being flexible enough to handle "internal use only" code
: > points will not break compatibility with the spec, since it is
: > entirely possible to use a pipe internally.
: 
: For conformance with the spec, there ought to be at least an option to
: flag such code points as errors, since that *is* called for by the spec
: when using the facility for external I/O.  (Moreover, it is good software
: engineering practice to catch such code points whenever they are not
: legitimate; ignoring errors does nobody any favors.)

Sure, but there has to be some balance.  Designing a particular
computing environment (such as a language) requires that some attention
be paid to efficiency and freedom as well as standards conformance.
It's certainly the case that you want to screen out the terrorists at
the border, and also have internal mechanisms in place to find the ones
that sneak past the border, or sprout at home.  However, it's clearly
bad for the economy to put a checkpoint at every intersection, and
it's clearly a violation of civil liberties to apply airport standards
of security when you want to carve a turkey or blow up a stump.

In the case of Perl (or, at least, Perl 5), I made the decision that,
within the language, strings should be thought of simply as abstract
sequences of arbitrary integers, and that by default the standards
enforcement should be at the borders.  That was the right decision for
Perl 5, in which strings are essentially typeless, and you shouldn't
know or care whether Perl internally stores your Latin-1 as a sequence
of bytes or as UTF-8.  Only the I/O layers (and other system interfaces)
have to care about that.  In practice this has worked out very well.
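
[Illustration: a minimal sketch of the "check at the borders" idea
described above, written in Python rather than Perl so that it is easy to
run.  It is not Perl's actual implementation.  The function names
(is_noncharacter, is_surrogate, encode_for_output) and the strict flag are
hypothetical, introduced only for this example.  It assumes that the
"internal use only" code points under discussion are the Unicode
noncharacters, and it also treats lone surrogates as invalid for
interchange.]

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 Unicode noncharacters:
    U+FDD0..U+FDEF, plus the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_surrogate(cp: int) -> bool:
    """True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def encode_for_output(text: str, strict: bool = True) -> bytes:
    """Hypothetical border function: encode a string as UTF-8 for external I/O.

    With strict=True, code points that are not meant for interchange
    are flagged as errors. With strict=False they pass through untouched,
    which suits something like an internal pipe.
    """
    if strict:
        for index, ch in enumerate(text):
            cp = ord(ch)
            if is_noncharacter(cp) or is_surrogate(cp):
                raise ValueError(
                    f"U+{cp:04X} at index {index} is not valid for interchange"
                )
    # "surrogatepass" lets lone surrogates through in the non-strict case.
    return text.encode("utf-8", errors="surrogatepass")


# Inside the program, the string is just a sequence of integers.
internal = "a" + chr(0xFFFF) + "b"
assert [ord(c) for c in internal] == [0x61, 0xFFFF, 0x62]

# Only the border decides whether that is acceptable.
relaxed = encode_for_output(internal, strict=False)
```

[Calling encode_for_output(internal) with the default strict=True raises
ValueError, while ordinary text encodes the same way in either mode.
Making strictness a parameter of the border function is one way to offer
the option that the quoted message asks for, without adding checks to
every internal string operation.]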

Now in Perl 6, strings will at least be typeable, so one can make a
good case for more checking internally, simply because it will become
more possible to determine when checking is appropriate.  But it's
still not clear that checked types should be the default.  It depends
on whether you think unlimited abstract characters are more like
machine guns or kitchen knives.  I tend to vote for kitchen knives,
but then I've always tried to err on the side of freedom of expression.

The best political systems try to enforce standards only where they
are really needed.  And where possible, the large political bodies
should delegate the fine-grained distinctions to the smaller political
bodies and other local societies.  It's my local government that
cares how I plumb my house, but doesn't care how I plumb my fishtank.
I can't chop down the tree on the street without permission, but I'm
allowed to water my lawn (at least this year).  If I lived where I
grew up in the Pacific Northwest, they probably wouldn't care about either
the tree or the watering (at least this year).  Local policy simply
does a better job than global policy of taking context into account.
And I think that's how it should be, when it can be.

I'm not against good engineering discipline in computer programming.
But designing a computer language is not like designing a bridge.
It's more like designing a society in which it's efficient to design
a bridge of any kind, from the Golden Gate to a fallen log.

Asking whether Perl conforms to the Unicode specs is a bit like asking
whether English conforms to the Elements of Style.

Larry

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/