Gilles Ganault wrote:

> I must be dense, but I still don't understand 1) why Python sometimes
> barfs out this type of error when displaying text that might not be
> Unicode-encoded, 2) whether I should use encode() or decode() to solve
> the issue, or even 3) if this is a Python issue or due to APWS SQLite
> wrapper that I'm using:
>
> ======
> sql = 'SELECT id,address FROM companies'
> rows=list(cursor.execute(sql))
>
> for row in rows:
>     id = row[0]
>
>     #could be 'utf-8', 'iso8859-1' or 'cp1252'
>     try:
>         address = row[1]

Assuming row is a tuple with len(row) >= 2, the above line can never fail.
Therefore you can rewrite the loop as

for row in rows:
    id, address = row[:2]
    print id, address

>     except UnicodeDecodeError:
>         try:
>             address = row[1].decode('iso8859-1')
>         except UnicodeDecodeError:
>             address = row[1].decode('cp1252')
>
>     print id,address
> ======
> 152 Traceback (most recent call last):
>   File "C:\zip.py", line 28, in <module>
>     print id,address
>   File "C:\Python25\lib\encodings\cp437.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\xc8' in
> position 2
> 4: character maps to <undefined>

It seems the database gives you the strings as unicode. When a unicode
string is printed, Python tries to encode it using sys.stdout.encoding
before writing it to stdout. As you run your script on the Windows command
line, that encoding seems to be cp437. Unfortunately your database contains
characters that cannot be expressed in that encoding. One workaround is to
replace these characters with "?":

import sys

encoding = sys.stdout.encoding or "ascii"
for row in rows:
    id, address = row[:2]
    print id, address.encode(encoding, "replace")

Example:

>>> u"ähnlich lölich üblich".encode("ascii", "replace")
'?hnlich l?lich ?blich'

Peter
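
For readers who want the "replace" workaround as a reusable helper, here is a minimal sketch. The function name console_safe and the sample rows are hypothetical and not part of the thread; the real rows would come from the database cursor. The sketch encodes with the "replace" error handler and decodes again, so the result is still a text string that should print on both Python 2 and Python 3.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'."""
    # sys.stdout.encoding can be None (for example when output is piped),
    # so fall back to ASCII, as in the workaround above.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Encode with the "replace" error handler, then decode back so the
    # result is text again and contains only representable characters.
    return text.encode(encoding, "replace").decode(encoding)

# Hypothetical sample rows standing in for list(cursor.execute(sql)).
rows = [(152, u"RUE DE L'\xc8RE"), (153, u"\xe4hnlich")]
for row in rows:
    company_id, address = row[:2]
    print(u"%d %s" % (company_id, console_safe(address)))
```

Passing an explicit encoding, such as console_safe(address, "cp437"), shows what a given console would lose without changing the terminal. The replacement is lossy, so it only suits display; the data in the database stays untouched.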