Re: [Demexp-dev] Character encoding

2007-10-22 David MENTRE
Hello Lyu, 2007/10/22, Lyu Abe [EMAIL PROTECTED]: There's one thing I do not understand in the character encoding of the server's reply. When I display, for example, tag sets, I can read this: 'a_tag_label': u'citoyennet\xe9' in which u'citoyennet\xe9' corresponds to a Unicode-encoded text,

Re: [Demexp-dev] Character encoding

2007-10-22 Thomas Petazzoni
Hi, On Mon, 22 Oct 2007 14:40:46 +0900, Lyu Abe [EMAIL PROTECTED] wrote: There's one thing I do not understand in the character encoding of the server's reply. When I display, for example, tag sets, I can read this: 'a_tag_label': u'citoyennet\xe9' in which u'citoyennet\xe9' corresponds

Re: [Demexp-dev] Character encoding

2007-10-22 Lyu Abe
Hi Thomas and David, Thanks for the clarification! Lyu. Thomas Petazzoni wrote: Hi, On Mon, 22 Oct 2007 14:40:46 +0900, Lyu Abe [EMAIL PROTECTED] wrote: There's one thing I do not understand in the character encoding of the server's reply. When I display, for example, tag sets, I

Re: [Demexp-dev] Character encoding

2007-10-22 David MENTRE
Hello Thomas, 2007/10/22, Thomas Petazzoni [EMAIL PROTECTED]: "The string you mention is encoded in ISO-8859-1 (or ISO-8859-15): the special character é is encoded on one byte only, so it's not UTF-8." I'm not sure of that. If you look at the Unicode table for Latin1

Re: [Demexp-dev] Character encoding

2007-10-22 Thomas Petazzoni
Hi, On Mon, 22 Oct 2007 09:18:23 +0200, David MENTRE [EMAIL PROTECTED] wrote: "I'm not sure of that. If you look at the Unicode table for Latin1 (http://www.unicode.org/charts/PDF/U0080.pdf), the encoding of é is 00E9 (p. 7)." I'm not sure either :-) On a system with LANG=fr_FR, I run a
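
For readers following the thread, the distinction being debated (code point versus encoded bytes) can be checked with a small sketch. This is not code from the demexp client or server; it assumes Python 3, where the thread's u'citoyennet\xe9' is written as an ordinary str, and the variable names are made up for illustration.

```python
# Sketch only (Python 3). The thread shows Python 2's repr of a unicode
# string, u'citoyennet\xe9'; in Python 3 the same text is a plain str.
label = "citoyennet\xe9"  # hypothetical variable name, same text as in the thread

# In a unicode string, \xe9 denotes the code point U+00E9 (é), not a byte.
code_point = ord(label[-1])

# Bytes only appear once the string is encoded.
latin1_bytes = label.encode("iso-8859-1")  # é becomes the single byte 0xE9
utf8_bytes = label.encode("utf-8")         # é becomes the two bytes 0xC3 0xA9

print(hex(code_point))    # 0xe9
print(latin1_bytes[-1:])  # b'\xe9'
print(utf8_bytes[-2:])    # b'\xc3\xa9'
```

The one-byte ISO-8859-1 encoding of é happens to have the same numeric value as its Unicode code point, which may be why the repr looks like a Latin-1 byte even though the object is a decoded unicode string.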

Re: [Demexp-dev] Character encoding

2007-10-22 David MENTRE
Hi Thomas, 2007/10/22, Thomas Petazzoni [EMAIL PROTECTED]: "But even with that, I'm still not sure I understand completely. These encoding issues are really tough to grasp." Yep, I agree. I only hope we don't have an encoding mess in the official database. I'll need to check that. One more