So I've added unicode support to my dbf package, but I also have some rather large programs that aren't ready to make the switch yet. As a workaround I added a (rather lame) option to convert the unicode data that was decoded from the dbf table back into its original encoded form.
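For what it's worth, the option amounts to something along these lines (just a sketch; reencode_record and the record-as-a-dict layout are made up for illustration, not the actual dbf API):

    def reencode_record(record, codepage='cp437'):
        # record is assumed to be a dict of field name -> decoded value;
        # turn any unicode values back into byte strings in the table's
        # original codepage and leave everything else alone
        return dict((name, value.encode(codepage)
                           if isinstance(value, unicode) else value)
                    for name, value in record.items())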

Here's the fun part: in figuring out what the option should be for use with my system, I tried some tests...

Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'\xed'
í
>>> print u'\xed'.encode('cp437')
í
>>> print u'\xed'.encode('cp850')
í
>>> print u'\xed'.encode('cp1252')
φ
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'cp1252')

My confusion lies in my apparent codepage (cp1252) and the discrepancy with the character u'\xed', which is definitely an i with an acute accent; yet when I encode it with cp1252 and print the result, I get φ instead (an o with a line through it).
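For what it's worth, looking at the raw bytes instead of printing them might make the discrepancy easier to see; a quick diagnostic along these lines shows what each codec actually produces, plus whatever encoding the console itself reports:

    import sys

    ch = u'\xed'
    for codec in ('cp437', 'cp850', 'cp1252'):
        # cp437 and cp850 both produce '\xa1'; cp1252 produces '\xed'
        print codec, repr(ch.encode(codec))
    # the encoding sys.stdout says it is using, which need not match
    # what locale.getdefaultlocale() reports
    print 'stdout encoding:', sys.stdout.encoding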

Can anybody clue me in to what's going on here?

~Ethan~
