Re: Encode exception for chinese text

Serge Orlov Fri, 19 May 2006 05:00:48 -0700

Vinayakc wrote:
> Hi all,
>
> I am new to python.
>
> I have written one small application which reads data from an xml file
> and tries to encode the data using the appropriate charset.
> I am facing a problem while encoding one Chinese paragraph with charset
> "gb2312".
>
> code is:
>
> encoded_str = str_data.encode("gb2312")
>
> The type of str_data is <type 'unicode'>
>
> The exception is:
>
> "UnicodeEncodeError: 'gb2312' codec can't encode character u'\xa0' in
> position 0: illegal multibyte sequence"


Hmm, this is a 'no-break space' (U+00A0) at the very beginning of the
text. It looks suspiciously like a plain-text utf-8 signature, which is
a 'zero width no-break space' (U+FEFF). If you strip the first
character, do you still have encoding errors?
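
A minimal sketch of that check, in case it helps. The sample string and
the helper name strip_leading_marker are made up for illustration (your
real str_data comes from the xml file), and I am assuming the stray
character only appears at the start of the text:

```python
# Hypothetical helper: drop a leading U+FEFF (utf-8 signature / BOM)
# or U+00A0 (no-break space) before encoding.
def strip_leading_marker(text):
    return text.lstrip(u"\ufeff\u00a0")

# Made-up sample: a no-break space followed by two Chinese characters.
sample = u"\u00a0\u4e2d\u6587"

# Encoding the raw text fails, as in the reported traceback.
try:
    sample.encode("gb2312")
    failed = False
except UnicodeEncodeError:
    failed = True

# After stripping the first character the encode goes through.
cleaned = strip_leading_marker(sample)
encoded = cleaned.encode("gb2312")
```

If errors remain after this, the offending characters are elsewhere in
the text and are simply not in the gb2312 repertoire; then either pass
an error handler such as encode("gb2312", "replace") or pick a wider
charset such as "gb18030".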

-- 
http://mail.python.org/mailman/listinfo/python-list
