On 2010-10-19 at 20:00, Paul McNett wrote:
> I don't understand why this wouldn't have decoded from UTF-8, but would
> decode from latin-1. I thought UTF-8 was a superset of latin-1.

You want to read
http://www.joelonsoftware.com/articles/Unicode.html
and
http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror
and
http://docs.python.org/library/functions.html#unicode

Knowing Unicode is essential, no excuses. (Even if you're probably a far better programmer than me in every other regard.)

This can't work:

>     try:
>         _records = self.fetchall()
>     except Exception, e:
>         _records = dabo.db.dDataSet()
>         # Database errors need to be decoded from database encoding.
>         try:
>             errMsg = ustr(e).decode(self.Encoding)
>         except UnicodeError:
>             errMsg = unicode(e)

If self.Encoding is wrong (hence the UnicodeDecodeError), unicode(e) won't help: if you don't pass an encoding to "unicode(string, encoding, errors)", 7-bit ASCII is assumed. That's why you get:

> <type 'exceptions.UnicodeDecodeError'>: 'ascii' codec can't decode...

So, if you don't know the correct encoding, you could try to guess. Decoding with an 8-bit encoding like latin-1 shouldn't raise an error, since every byte is valid (even if the result is nonsense). Setting "errors" to "ignore" or "replace" could help, too. (See the Python docs, link above.) (I didn't look up what Dabo's ustr does.)

In the opposite case, if you need plain ASCII, e.g. for a filename, you could use something like:

    import unicodedata

    def force_ascii(u):
        return unicodedata.normalize('NFKD', u.lower()).encode('ASCII', 'ignore')

This first normalizes to decomposed Unicode (separate base letters and accents) and then throws away anything that isn't 7-bit ASCII, so you get 'a' from 'á'.

Greetlings from Lake Constance!
Hraban
---
http://www.fiee.net
https://www.cacert.org (I'm an assurer)
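
P.S. Here is a rough sketch of the suggestions above (guess with latin-1, or use the "errors" argument, plus the ASCII helper). It is written for Python 3, where bytes.decode() does the job of Python 2's str.decode() and unicode(). The names decode_with_fallback and decode_lossy and the sample bytes are made up for illustration; none of this is Dabo API, and I haven't checked how it would fit with ustr.

```python
import unicodedata


def decode_with_fallback(raw, encoding):
    """Decode bytes to text without letting a wrong encoding raise.

    `raw` and `encoding` are hypothetical stand-ins for the database
    driver's error bytes and for self.Encoding.
    """
    try:
        # First choice: the encoding we believe the database uses.
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # latin-1 maps every byte 0x00-0xFF to a character, so this
        # cannot fail. The characters may be wrong, but nothing is lost.
        return raw.decode("latin-1")


def decode_lossy(raw, encoding):
    """Keep the expected encoding; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")


def force_ascii(u):
    """Decompose accented letters, then drop everything non-ASCII."""
    return unicodedata.normalize("NFKD", u.lower()).encode("ascii", "ignore")


# Sample message as a latin-1 database might send it (assumed data).
# The byte 0xFC ("ü" in latin-1) is not valid UTF-8 on its own.
raw = "ungültig".encode("latin-1")

print(decode_with_fallback(raw, "utf-8"))  # falls back to latin-1
print(decode_lossy(raw, "utf-8"))          # "ü" replaced by U+FFFD
print(force_ascii("Á"))                    # bytes in Python 3
```

The fallback only guarantees that you get some text back. If the real encoding was neither the expected one nor latin-1, the message will contain wrong characters, which is usually still better than losing the database error behind a UnicodeDecodeError.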
