Haskell character set

Ryszard Kubiak Tue, 14 Jan 1997 17:56:24 +0100
I would like to complain about an unfortunate approach taken by the
authors of Haskell concerning the character type. The Haskell 1.4
Report says (and the 1.3 version said the same):

  The character type Char is an enumeration and consists of 256 values,
  conforming to the ISO 8859-1 standard.

One problem is that, according to the ISO 8859-1 standard, not all 256
one-byte values are legal characters. Thus, the quoted sentence is
imprecise.
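
To make the imprecision concrete, here is a minimal sketch that counts
the one-byte values to which ISO 8859-1 assigns a graphic character
(positions 0x20-0x7E and 0xA0-0xFF). The names isLatin1Graphic and
assignedCount are hypothetical and not part of the Report's Char
module; the import assumes a present-day compiler where ord lives in
Data.Char.

```haskell
import Data.Char (ord)

-- Hypothetical helper, not taken from the Haskell Report.
-- True when ISO 8859-1 assigns a graphic character to this position:
-- 0x20..0x7E (the ASCII printables) and 0xA0..0xFF (the Latin-1 part).
isLatin1Graphic :: Char -> Bool
isLatin1Graphic c = (n >= 0x20 && n <= 0x7E) || (n >= 0xA0 && n <= 0xFF)
  where
    n = ord c

-- How many of the 256 one-byte values are assigned by the standard.
assignedCount :: Int
assignedCount = length (filter isLatin1Graphic ['\0' .. '\255'])
```

Under these assumptions assignedCount is 191, so 65 of the 256 values
of Char have no meaning given by ISO 8859-1 itself.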

A more important problem is that, once the character set is restricted
to ISO 8859-1, East European programmers cannot always use their
diacritic characters in Haskell programs. Of course, when it comes to
real programming most of us use English names. Still, diacritic
characters are important both in education and in printed materials.
There is another ISO standard, called 8859-2 or Latin-2, accepted in
Eastern Europe and commonly used on the Internet. I wonder if the
language report could accept that standard too.

Consequently, I suggest changes in the standard library module Char. A
more appropriate name for it would be Char-Latin-1. But even if the
name remains unchanged, the comments in the source text should reflect
the fact that the utilities have been written with Latin-1 (or 8859-1)
in mind.

Most of us in Europe suffer from having all those extra accents, tails
and other fancy decorations on our letters. The ISO 8859 standards
help to cope with them on computers. A more radical solution comes, of
course, with Unicode. I wonder if functional languages aren't better
suited to making use of it than other languages, in which the
representation of a string is so heavily memory-related. Once we say
that a string is just a list of characters, (almost :-) nothing remains
to be done in string-manipulating functional applications when the
representation of characters changes.
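
As a small illustration of that last point, here is a sketch of two
list functions that work on strings without knowing anything about how
many values Char has. The names isPalindrome and replaceAll are
hypothetical, chosen only for this example; the character '\322' is
the Unicode code point of the Polish letter l-with-stroke, which
assumes a Char type wider than the 256 values of the 1.4 Report.

```haskell
-- Hypothetical helpers, polymorphic in the element type: they use only
-- equality on elements, so the size of the character set is irrelevant.
isPalindrome :: Eq a => [a] -> Bool
isPalindrome xs = xs == reverse xs

-- Replace every occurrence of one element by another.
replaceAll :: Eq a => a -> a -> [a] -> [a]
replaceAll from to = map (\x -> if x == from then to else x)

-- A String is just [Char], so both functions apply unchanged.
polishExample :: String
polishExample = replaceAll 'l' '\322' "lody"
```

Neither function would need to change if Char grew from 256 values to
the full Unicode range; only code that inspects character codes
directly (the "almost") would.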

Greetings,
Rysiek