Re: [Tutor] Problems with encoding

Kent Johnson Wed, 26 Jul 2006 14:04:59 -0700

[EMAIL PROTECTED] wrote:
> Hi,
>
>
> My interpreter is set via sitecustomize.py to use utf-8 as default encoding.
>
> I'm reading fields from a dbf table to a firebird db with encoding set to 
> win1252.
> I guess its original encoding is cp850, but am not sure, and have been 
> addressing exceptions one by one with lines of:
>
> r = r.replace(u'offending_code', u'ok_code')
>   
Why don't you just convert from cp850 to cp1252 directly? Python 
supports both encodings, so it's as simple as:

    some_string.decode('cp850').encode('cp1252')

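
A minimal sketch of that conversion, using two fields copied from the record printed further down. It is written so it also runs on Python 3, where the input has to be a bytes object; the transcode() helper and the variable names are made up for illustration. Since the poster is not sure about cp850, the sketch also decodes with cp860 (the DOS Portuguese code page) so the two results can be compared. That second candidate is my assumption, not something stated in the thread.

```python
# Bytes copied from two fields of the record printed in the original message.
company = b'EDIFICIOS 3B,SOCIEDADE DE CONTRUC\x99ES LDA'
street = b'RUA JO\x8eO SILVA,N..4,10A'


def transcode(raw, source_encoding, target_encoding='cp1252'):
    """Decode raw bytes from source_encoding, then re-encode them.

    Hypothetical helper: it is just the decode().encode() chain with names.
    """
    return raw.decode(source_encoding).encode(target_encoding)


# Decode the same bytes under each candidate source encoding so the
# results can be inspected before committing to one of them.
# cp860 is an assumed alternative, not taken from the original message.
candidates = ('cp850', 'cp860')
decoded = dict((name, street.decode(name)) for name in candidates)

# Under cp850 byte 0x8e is U+00C4 (A with diaeresis);
# under cp860 it is U+00C3 (A with tilde).
converted = transcode(street, 'cp860')
```

Looking at decoded['cp850'] and decoded['cp860'] side by side is a cheap way to check which code page gives plausible Portuguese before converting the whole table.
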
> But now it seems I've run into a brick wall!
> I've this exception I can't seem to avoid with my strategy:
> """
> Traceback (most recent call last):
>   File 
> "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", 
> line 307, in RunScript
>     debugger.run(codeObject, __main__.__dict__, start_stepping=0)
>   File "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\__init__.py", 
> line 60, in run
>     _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
>   File "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\debugger.py", 
> line 631, in run
>     exec cmd in globals, locals
>   File "C:\Documents and 
> Settings\mel.TECNICON\Desktop\Statistics\fromDBF\import_pcfcli.py", line 216, 
> in ?
>     fbCr.execute(sqlTemplate, rec)
>   File "C:\Python24\Lib\site-packages\kinterbasdb\typeconv_text_unicode.py", 
> line 108, in unicode_conv_in
>     return unicodeString.encode(pyEncodingName)
>   File "C:\Python24\Lib\encodings\cp1252.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0x99 in position 33: 
> unexpected code byte
> """
>
> I also print the offending record so I can inspect it and apply my strategy:
> ('', 'A', '', '', 'EDIFICIOS 3B,SOCIEDADE DE CONTRUC\x99ES LDA', 'LISBOA', 
> 'Plafond: 2494, Prazo de Pagamento: 30 Metodo de pagamento: ', '', '', 'RUA 
> JO\x8eO SILVA,N..4,10A', '', '1900-271', '502216972', '218482733', '1663')
>
> The problem is with '\x99' :-(
> I added this line to the code:
>
> r = r.replace(u'\x99', u'O')
>
> But I get exactly the same Traceback!


My guess is a coding error on your part; otherwise something would have 
changed. Can you show some context in import_pcfcli.py?
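
One possible reading of the traceback, offered as a guess until the surrounding code is visible: the field reaching the cp1252 encoder may still be a byte string. In Python 2, calling encode() on a byte string first decodes it with the default encoding (utf-8 in this setup), which would explain a UnicodeDecodeError raised from inside an encode call. The sketch below is written for Python 3, where that decode has to be spelled out. SOURCE_ENCODING is the poster's own guess and the other variable names are illustrative.

```python
# Field copied from the record printed in the original message.
field = b'EDIFICIOS 3B,SOCIEDADE DE CONTRUC\x99ES LDA'

# Poster's guess at the table's encoding; swap in whichever codec
# actually matches the data.
SOURCE_ENCODING = 'cp850'

# Decoding the raw bytes as utf-8 fails, and the reported offset points
# at the 0x99 byte, the same byte the traceback complains about.
failed_at = None
try:
    field.decode('utf-8')
except UnicodeDecodeError as exc:
    failed_at = exc.start

# Decode with the real source encoding first. After that the text no
# longer contains the code point U+0099, so a replace() that looks for
# u'\x99' has nothing to match and is not needed.
text = field.decode(SOURCE_ENCODING)
after_replace = text.replace(u'\x99', u'O')

# The decoded text can now be encoded for the database connection.
for_database = text.encode('cp1252')
```

If this is what is happening, decoding every string field once, right after it is read from the dbf file, should remove the need for the per-character replace() calls.
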

Kent

