On Wed, 12 Mar 2008 Juerd Waalboer wrote
Chris Hall skribis 2008-03-12 20:49 (+):
a. are you saying that characters in Perl are Unicode?
Yes. They are called Unicode, at least. This has my preference for
explanation and documentation.
b. or are you agreeing that characters in
Chris Hall skribis 2008-03-12 13:20 (+):
OK. In the meantime IMHO chr(n) should be handling utf8 and has no
business worrying about things which UTF-8 or UCS think aren't
characters.
It should do Unicode, not any specific byte encoding, like UTF-8.
IMHO chr(n) should do characters,
On Wed, 12 Mar 2008 Juerd Waalboer wrote
Chris Hall skribis 2008-03-12 13:20 (+):
String literals are represented by UCS code points. Which
reinforces the feeling that characters in Perl are Unicode.
Yes!
OK. For the avoidance of doubt:
a. are you saying that
On Tue, 11 Mar 2008 you wrote
Chris Hall skribis 2008-03-11 18:48 (+):
I'm comfortable with the notion that perl characters are unsigned
integers that overlap UCS, and happen to be held internally as a
superset of UTF-8.
I wonder if perl is completely comfortable.
It isn't. There are
Chris Hall skribis 2008-03-11 21:09 (+):
OK. In the meantime IMHO chr(n) should be handling utf8 and has no
business worrying about things which UTF-8 or UCS think aren't
characters.
It should do Unicode, not any specific byte encoding, like UTF-8.
Internally, a byte encoding is