> I don't understand:
> - I expected the two first tries to work
> - and the last one to fail.
> What happened is the exact opposite!
> I am totally confused.
> I don't even know how to ask my question properly.
>
> I think that I understand what "é" and %E9 are...
> but I do not understand what Ã© is...
> moreover it is two characters "Ã" and "©" instead
> of one...
>
> Thank you for your help :) .
> Best regards,
> --
> Lmhelp

The letter é has the code point U+00E9 (0xE9) in Unicode.

If the file is written in iso-8859-1, it is represented by just one byte: 0xE9 (é)
If the file is written in utf-8, it is represented by two bytes: 0xC3 0xA9 (these show up as Ã© when the bytes are read as iso-8859-1)
If the file is written in utf-16, it is represented by two bytes: 0x00 0xE9 in utf-16 BE and 0xE9 0x00 in utf-16 LE.

The line
<?xml version="1.0" encoding="UTF-8"?>
says "this file will be in utf-8".
If you then write "Etoilé " as
0x45 0x74 0x6f 0x69 0x6c 0xe9 0x20,
that makes invalid XML, since it should have been
0x45 0x74 0x6f 0x69 0x6c 0xc3 0xa9 0x20
(alternatively, you could have specified a different encoding in the prolog).

The use of %E9 is just a trick for urls, since they may not allow a literal "é" there. %E9 is the percent-encoded form of the single byte 0xE9, so in this url the é is encoded in iso-8859-1 (in utf-8 it would be %C3%A9). It only appears in robots.txt because it talks about urls.
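
If you want to check these byte values yourself, here is a minimal Python sketch using only the standard library. The helper names (hex_bytes, parses_as_xml) and the <t> element are made up for the illustration and are not part of MediaWiki or of your files; the sample text is the "Etoilé " string from above.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

E_ACUTE = "\u00e9"  # the letter é, code point U+00E9


def hex_bytes(text, encoding):
    """Return the bytes of `text` in `encoding` as a string like '0xC3 0xA9'."""
    return " ".join(f"0x{b:02X}" for b in text.encode(encoding))


# The same character in the three encodings discussed above.
latin1 = hex_bytes(E_ACUTE, "iso-8859-1")
utf8 = hex_bytes(E_ACUTE, "utf-8")
utf16_be = hex_bytes(E_ACUTE, "utf-16-be")
utf16_le = hex_bytes(E_ACUTE, "utf-16-le")

# Reading the two utf-8 bytes as iso-8859-1 yields two characters.
mojibake = E_ACUTE.encode("utf-8").decode("iso-8859-1")

# Percent-encoding for urls depends on the byte encoding chosen first.
pct_latin1 = quote(E_ACUTE, encoding="iso-8859-1")
pct_utf8 = quote(E_ACUTE)  # quote() defaults to utf-8


def parses_as_xml(data):
    """Return True if `data` (bytes) is well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


PROLOG = b'<?xml version="1.0" encoding="UTF-8"?>'
bad = PROLOG + b"<t>Etoil\xe9 </t>"       # lone 0xE9 is not valid utf-8
good = PROLOG + b"<t>Etoil\xc3\xa9 </t>"  # 0xC3 0xA9 is é in utf-8

if __name__ == "__main__":
    print("iso-8859-1:", latin1)
    print("utf-8:     ", utf8)
    print("utf-16 BE: ", utf16_be)
    print("utf-16 LE: ", utf16_le)
    print("utf-8 bytes read as iso-8859-1:", mojibake)
    print("url forms:", pct_latin1, pct_utf8)
    print("bad parses: ", parses_as_xml(bad))
    print("good parses:", parses_as_xml(good))
```

The last two lines show the XML point: with a UTF-8 prolog, the parser rejects the single byte 0xE9 and accepts the two-byte sequence.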
