[EMAIL PROTECTED] wrote:
> Hi,
>
> My interpreter is set via sitecustomize.py to use utf-8 as default encoding.
>
> I'm reading fields from a dbf table to a firebird db with encoding set to
> win1252.
> I guess its original encoding is cp850, but am not sure, and have been
> addressing exceptions one by one with lines of:
>
> r = r.replace(u'offending_code', u'ok_code')

Why don't you just convert from cp850 to cp1252 directly? Python supports both encodings; it's as simple as

    some_string.decode('cp850').encode('cp1252')

> But now it seems I've run into a brick wall!
> I've this exception I can't seem to avoid with my strategy:
> """
> Traceback (most recent call last):
>   File "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 307, in RunScript
>     debugger.run(codeObject, __main__.__dict__, start_stepping=0)
>   File "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\__init__.py", line 60, in run
>     _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
>   File "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\debugger.py", line 631, in run
>     exec cmd in globals, locals
>   File "C:\Documents and Settings\mel.TECNICON\Desktop\Statistics\fromDBF\import_pcfcli.py", line 216, in ?
>     fbCr.execute(sqlTemplate, rec)
>   File "C:\Python24\Lib\site-packages\kinterbasdb\typeconv_text_unicode.py", line 108, in unicode_conv_in
>     return unicodeString.encode(pyEncodingName)
>   File "C:\Python24\Lib\encodings\cp1252.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0x99 in position 33: unexpected code byte
> """
>
> I also print the offending record so I can inspect it and apply my strategy:
> ('', 'A', '', '', 'EDIFICIOS 3B,SOCIEDADE DE CONTRUC\x99ES LDA', 'LISBOA',
> 'Plafond: 2494, Prazo de Pagamento: 30 Metodo de pagamento: ', '', '', 'RUA
> JO\x8eO SILVA,N..4,10A', '', '1900-271', '502216972', '218482733', '1663')
>
> The problem is with '\x99' :-(
> I added this line to the code:
>
> r = r.replace(u'\x99', u'O')
>
> But I get exactly the same Traceback!
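
To make the direct conversion concrete, here is a minimal sketch applied to two fields from the record above. It is written for Python 3, where the raw dbf fields would be `bytes`; in Python 2.4 the plain `str` method chain shown earlier does the same job. The helper name `recode` and its parameters are made up for illustration, and the source code page is left as a parameter because the original encoding is only a guess.

```python
def recode(raw, source="cp850", target="cp1252"):
    """Decode bytes from the source code page, then encode them for the target.

    `recode` is an illustrative helper, not part of any library.
    """
    return raw.decode(source).encode(target)


# Two fields from the quoted record, as raw bytes.
name = b"EDIFICIOS 3B,SOCIEDADE DE CONTRUC\x99ES LDA"
street = b"RUA JO\x8eO SILVA,N..4,10A"

# Assuming cp850, 0x99 decodes to U+00D6 and 0x8E to U+00C4.
print(recode(name))
print(recode(street))

# Assuming cp860 instead, the same bytes decode to U+00D5 and U+00C3.
print(recode(name, source="cp860"))
print(recode(street, source="cp860"))
```

Since you are not sure the data is cp850, it may be worth comparing the two results: under cp860 the street field decodes to JOÃO, which looks more like Portuguese than the cp850 reading. Treat that as something to check against more records, not as a confirmed diagnosis.
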
As for the unchanged traceback, my guess is a coding error on your part, otherwise something would have changed. Can you show some context in import_pcfcli.py?

Kent

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor