On 1/11/07, Ian Bicking <[EMAIL PROTECTED]> wrote:
>
> You probably want to do:
>
> c.value = u"something in UTF-8"

I'm not trying to be pedantic... and I think that we all know what you meant by the above... just trying to make sure I understand this stuff. A few years ago I went over and over this with a toolchain including PostgreSQL and webkit and apache and of course the browser... and they ALL had to be lined up, or things got broken. Once we understood one fact, it made it easier to troubleshoot:
My understanding is that what we call "utf-8" is - *is* - *IS* a byte string... the 8-bit representation of unicode, /encoded/ into bytes via the 'utf-8' encoding method (plain ascii characters come out unchanged, everything else becomes two or more bytes).

That's important to repeat: utf-8 IS NOT unicode. It's a way to STORE unicode in 8-bit bytes (strings). "Unicode" strings hold the characters themselves (code points), and unicode is unicode is unicode, right?

So, a little console log (hope it makes it through smtp):

>>> u = u'göøbér'
>>> u
u'g\xf6\xf8b\xe9r'
>>> print u
göøbér
>>> type(u)
<type 'unicode'>
>>> s = u.encode('utf-8')
>>> s
'g\xc3\xb6\xc3\xb8b\xc3\xa9r'
>>> print s
göøbér
>>> type(s)
<type 'str'>

In the above, I'm not sure why the repr() of u looks different from the utf-8 string... the o-umlaut is escaped \xf6 whereas in the utf-8 string, it's \xc3\xb6. Weird.

Anyway, anything that is 'utf-8' is just a plain 8-bit string, and it should make it through templates just fine. If the template (or browser) attempts to decode it improperly, you get output like this: 'gÃ¶'. Usually trying to "fix" it is hopeless... it's been mistranslated somewhere up the toolchain, and one can't easily reverse-patch it to fix it (though it would be theoretically possible... as it's just look-up-tables).

ok. I'm done. </pedantic>
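
For what it's worth, here is a minimal sketch of the same round trip, assuming Python 3 (the console log above is Python 2), where `str` is the unicode type and `bytes` is the 8-bit string. The variable names are only illustrative. It also suggests an answer to the repr() puzzle: \xf6 is the code point of the o-umlaut (U+00F6), while \xc3\xb6 is the pair of bytes utf-8 uses to store that code point. The last step only works because Latin-1 maps every byte to some character, so take it as a best-case illustration, not a general repair.

```python
# Python 3 sketch: str holds code points, bytes holds the encoded form.
text = "g\u00f6\u00f8b\u00e9r"            # the same word as in the log, written with escapes

# The code point of o-umlaut is 0xf6; this is what repr() showed as \xf6.
code_point = hex(ord(text[1]))

# Encode explicitly; each non-ascii character becomes two bytes in utf-8.
utf8_bytes = text.encode("utf-8")

# Decoding with the matching codec gives the original text back.
round_trip = utf8_bytes.decode("utf-8")

# Decoding with the wrong codec (Latin-1 here) gives the garbled output.
garbled = utf8_bytes.decode("latin-1")

# If nothing else altered the data, undoing the wrong step recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(code_point)                  # code point of the second character
print(utf8_bytes)                  # bytes repr, ascii-safe
print(len(text), len(utf8_bytes))  # characters vs. bytes
print(ascii(garbled))              # escaped so it prints on any console
print(repaired == text)
```

The point of keeping `text` and `utf8_bytes` as separate types is the same one made above: every layer of the toolchain has to agree on which of the two it is handing over, and on the codec used to move between them.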
