On 26 Aug 2008, at 1:31 pm, Deborah Goldsmith wrote:
> You can't determine Unicode character properties by analyzing the
> names of the characters.
However, the OP *does* have a copy of the UnicodeData...txt file,
and you *can* determine the relevant Unicode character properties from
that.
For example, consider the entry for space:
0020;SPACE;Zs;0;WS;;;;;N;;;;;
           ^^
The Zs in the third field (the general category) says it's a
white space character (Zs: separator/space, Zl: separator/line,
Zp: separator/paragraph).
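
To make that concrete, here is a minimal sketch of pulling the category
out of one line of the file. It assumes only what the entry above shows:
fields are separated by semicolons and the general category is the third
one. The names splitFields and generalCategoryOf are made up for this
illustration.

```haskell
-- Split one UnicodeData line into its semicolon-separated fields.
-- Empty fields are kept, so field positions stay stable.
splitFields :: String -> [String]
splitFields s = case break (== ';') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitFields rest

-- The general category is the third field (index 2), e.g. "Zs" or "Lu".
-- Returns Nothing for a line with fewer than three fields.
generalCategoryOf :: String -> Maybe String
generalCategoryOf line = case splitFields line of
  (_ : _ : cat : _) -> Just cat
  _                 -> Nothing
```

In GHCi, generalCategoryOf "0020;SPACE;Zs;0;WS;;;;;N;;;;;" evaluates to
Just "Zs".
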
Or look at capital A:
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
                            ^^
The Lu bit says it's a L(etter) that is u(pper case).
Upper case: Lu, lower case: Ll, title case: Lt,
modifier letter: Lm, other letter: Lo, digit: Nd,
...
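
As a rough sketch of how the two-letter codes could be turned into the
properties the OP is after, something like the following would do. It
covers only the codes mentioned in this message; the CharClass type and
classify function are hypothetical names, and every other category falls
through to Unclassified.

```haskell
-- Coarse character classes, limited to the categories listed above.
data CharClass
  = Upper | Lower | Title | ModifierLetter | OtherLetter
  | Digit | Separator | Unclassified
  deriving (Eq, Show)

-- Map a general category code (third field of a UnicodeData line)
-- to a coarse class. Codes not mentioned above are left Unclassified.
classify :: String -> CharClass
classify "Lu" = Upper
classify "Ll" = Lower
classify "Lt" = Title
classify "Lm" = ModifierLetter
classify "Lo" = OtherLetter
classify "Nd" = Digit
classify "Zs" = Separator
classify "Zl" = Separator
classify "Zp" = Separator
classify _    = Unclassified
```

With this, classify "Lu" is Upper and classify "Zs" is Separator; the
remaining categories in the file would need their own cases.
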
If memory serves me correctly, this is explained in the
UnicodeData.html file, under a heading something like
Normative Categories.