At 18:16 +0000 2001-11-03, Markus Kuhn wrote:
>
>On Sat, 3 Nov 2001, Eli Zaretskii wrote:
>>  > ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO-4.html
>>
>>  This is still silent about Grep, Sort, and tr, which are
>>  the utilities where the non-ASCII support should be a non-trivial
>>  change.
>>
>>  Basically, even after reading that page (which told me something I
>>  didn't know in some cases), Unicode support in basic development
>>  tools is still very much rudimentary.
>
>In practice, Perl has long ago replaced grep, sort, tr, awk, for all but
>sentimental reasons. Most of these little silly things were written as
>inefficient separate C processes before 1975 for the sole reason that the
>PDP-11 that Ritchie and Thompson used had only 64 kB RAM and couldn't
>handle any larger multi-function tools:
>
>http://www.bell-labs.com/history/unix/
>http://www.bell-labs.com/history/unix/firstport.html
>
>Today, these tiny tools mostly lead people to write extremely inefficient
>shell scripts that spend 90% of their time in fork().
>
>UTF-8 support for Perl is in an advanced state, and for some more
>experienced UTF-8 users, "grep", "sort", "tr", etc. are merely convenient
>and nostalgic shell functions or scripts that call perl to do the job.
>
>[I sometimes wish, we could give up the classic Bourne-style shell with
>it's baroque Algol-inspired syntax entirely and that perl had the few
>facilities (e.g., prompts, readline-history, compact
>command-invocation/argv/piping/redirecting notation, etc.) that are still
>missing before we can turn it into the main command-line shell.]

What a cheek, calling "the classic Bourne-style shell" "baroque" when
compared to Perl! Baroque originally meant "bizarre" in French, and
it now means much the same in English: irregular, grotesque, odd,
singular, or pertaining to a style of music and architecture of the
17th and 18th centuries (with rococo as its late offshoot) -- a style
noted for excessive, extravagant ornamentation and embellishment. That
fits Perl to a T, and it has nothing in common with the "classic shells"
and the "tiny tools", which are clean, sparse, well crafted,
streamlined, a designer's classic.

Let's face it. Perl is a powerful 4GL, and that's why people use it:
it is also better suited for CGI Scripts than its few contenders. 
Like many powerful things, it is ugly, inconsistent, quirky, and can 
be dangerous to the unwary. In particular, its main feature seems to 
be that it accomplishes almost everything as side effects to what its 
commands ostensibly do. Since it seems to have no consistency from 
command to command, and documentation on some of the side-effects is 
sketchy to say the least, it is a nightmare to tyros. Certainly it is 
difficult to debug compared to "the classic Bourne-style shell", and, 
like Ada, it is so huge with so many ways to do the same thing that 
you need to be using it every day just to exercise a quarter of its 
warty features. Also, unless there has been a LOT of tuning in the
last two versions, there are some classes of problems on which it
doesn't run all that fast, either.

Like David Starner, I stick to "these little silly things" unless I
really can't do what I need to without terrible contortions, or
without writing my own C programs: in that other 0.05% of cases, I
use Perl.

Making "grep", "sort", "tr", etc. UTF-8-native is not going to be a
simple task, however, unless Unix/Linux/*BSD have full support,
including built-in collation-sequence routines and a more elaborate
locale structure than seems to be supported at present.
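
To make the collation point concrete, here is a rough sketch in Python (chosen only for brevity; nothing in this thread uses it). It contrasts the ordering a byte-oriented "sort" would produce with the ordering obtained from the C library's collation routines. The word list and the locale name "en_US.UTF-8" are illustrative assumptions; that locale may not be installed on a given system, in which case the second function simply returns None.

```python
import locale

# Illustrative sample data (an assumption, not taken from the thread).
WORDS = ["zebra", "Äpfel", "apple", "Zürich"]


def byte_order(words):
    """Sort the way a byte-oriented tool would: compare raw UTF-8 bytes."""
    return sorted(words, key=lambda w: w.encode("utf-8"))


def locale_order(words, name="en_US.UTF-8"):
    """Sort using the C library's collation rules for the named locale.

    The locale name is an assumption; if it is not installed, return None
    instead of guessing at an ordering.
    """
    old = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, old)  # restore the old setting


if __name__ == "__main__":
    print("byte order:  ", byte_order(WORDS))
    print("locale order:", locale_order(WORDS))
```

In the byte-wise ordering, every word beginning with a non-ASCII letter lands after all the ASCII ones, because its first UTF-8 byte is 0xC3 or higher. What the locale-aware ordering looks like depends entirely on the collation tables the system ships, which is exactly the support the classic tools would need to rely on.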

Perhaps Markus SHOULD have said "My interest is in getting Perl 
UTF-8-native because I use it, because there is a lot of interest in 
using it for CGI programming where being UTF-8-native is needed 
yesterday, and because it can do all the older routines can do in a 
pinch. Those who see a higher priority for the classic routines 
should pitch in and do them themselves."

George
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
