Michael Ludwig wrote 2010-03-10 10:34 (+0100):
> Okay. Let me try to see if I have understood correctly. Without the utf8
> pragma in scope, "so\xa0ein\xa0Käse" with a-Umlaut stored as a sequence
> of two bytes in my source code will be stored internally as a sequence
> of 12 integers. With the utf8 pragma in scope, only 11 integers.

"so\xa0ein\xa0Käse" must be stored as either: l1: 73 6f a0 65 69 6e a0 4b e4 73 65 (UTF8 flag off) or: u8: 73 67 c2-a0 65 69 6e c2-a0 4b c3-a4 73 65 (UTF8 flag on) Both strings should be semantically equal, and have 11 characters, each of which has an integer ordinal value. What happens is the following: 73 6f a0 65 69 6e a0 4b c3-a4 73 65 (UTF8 flag on) l1 l1 u8 This is wrong. It is a bug. -- Met vriendelijke groet, // Kind regards, // Korajn salutojn, Juerd Waalboer <ju...@tnx.nl> TNX