On Tue, Feb 03, 2004 at 01:33:56PM -0500, Henry Spencer wrote:
: On Mon, 2 Feb 2004, srintuar wrote:
: > Certainly, being flexible enough to handle "internal use only" code
: > points will not break compatibility with the spec, since it is
: > entirely possible to use a pipe internally.
:
: For conformance with the spec, there ought to be at least an option to
: flag such code points as errors, since that *is* called for by the spec
: when using the facility for external I/O.  (Moreover, it is good software
: engineering practice to catch such code points whenever they are not
: legitimate; ignoring errors does nobody any favors.)

Sure, but there has to be some balance.  Designing a particular
computing environment (such as a language) requires that some attention
be paid to efficiency and freedom as well as standards conformance.
It's certainly the case that you want to screen out the terrorists at
the border, and also have internal mechanisms in place to find the ones
that sneak past the border, or sprout at home.  However, it's clearly
bad for the economy to put a checkpoint at every intersection, and it's
clearly a violation of civil liberties to apply airport standards of
security when you want to carve a turkey or blow up a stump.

In the case of Perl (or, at least, Perl 5), I made the decision that,
within the language, strings should be thought of simply as abstract
sequences of arbitrary integers, and that by default the standards
enforcement should be at the borders.  That was the right decision for
Perl 5, in which strings are essentially typeless, and you shouldn't
know or care whether Perl internally stores your Latin-1 as a sequence
of bytes or as UTF-8.  Only the I/O layers (and other system
interfaces) have to care about that.  In practice this has worked out
very well.

Now in Perl 6, strings will at least be typeable, so one can make a
good case for more checking internally, simply because it will become
more possible to determine when checking is appropriate.  But it's
still not clear that checked types should be the default.  It depends
on whether you think unlimited abstract characters are more like
machine guns or kitchen knives.  I tend to vote for kitchen knives, but
then I've always tried to err on the side of freedom of expression.

The best political systems try to enforce standards only where they are
really needed.  And where possible, the large political bodies should
delegate the fine-grained distinctions to the smaller political bodies
and other local societies.  It's my local government that cares how I
plumb my house, but doesn't care how I plumb my fishtank.
I can't chop down the tree on the street without permission, but I'm
allowed to water my lawn (at least this year).  If I lived where I grew
up in the Pacific Northwest, they probably wouldn't care about either
the tree or the watering (at least this year).  Local policy simply
does a better job than global policy of taking context into account.
And I think that's how it should be, when it can be.

I'm not against good engineering discipline in computer programming.
But designing a computer language is not like designing a bridge.  It's
more like designing a society in which it's efficient to design a
bridge of any kind, from the Golden Gate to a fallen log.  Asking
whether Perl conforms to the Unicode specs is a bit like asking whether
English conforms to the Elements of Style.

Larry
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
