Trouble Encoding

2005-06-07 Thread fingermark
I'm using feedparser to parse the following:

Adv: Termite Inspections! Jenny Moyer welcomes
you to her HomeFinderResource.com TM A "MUST See …

I'm receiving the following error when I try to print feedparser's
parsed output for the above text:

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in
position 86: ordinal not in range(256)

Why is this happening and where does the problem lie?

thanks

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Trouble Encoding

2005-06-07 Thread fingermark
Why is it even trying latin-1 at all? I don't see it anywhere in
feedparser.py or in my code.
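
A likely source of the latin-1, offered as an assumption rather than a confirmed diagnosis: feedparser hands back unicode strings, and print encodes them with the terminal's encoding, which Python exposes as sys.stdout.encoding. The sketch below checks whether a string survives a given encoding. can_print is a hypothetical helper made up for illustration; it is not part of feedparser.

```python
import sys

# The encoding print would use for unicode text; may be None when
# output is redirected, in which case 'ascii' is assumed below.
stdout_encoding = getattr(sys.stdout, 'encoding', None)
print(stdout_encoding)


def can_print(text, encoding=None):
    """Return True if `text` can be encoded with `encoding`.

    Hypothetical helper: falls back to the stdout encoding, then 'ascii'.
    """
    encoding = encoding or stdout_encoding or 'ascii'
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


quote = u'\u201c'  # LEFT DOUBLE QUOTATION MARK, the character in the traceback
print(can_print(quote, 'latin-1'))
print(can_print(quote, 'utf-8'))
```

If the first line printed is a latin-1 variant such as ISO-8859-1, the codec in the traceback probably comes from the terminal settings and not from feedparser or the calling script.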

deelan wrote:
> [EMAIL PROTECTED] wrote:
> > I'm using feedparser to parse the following:
> >
> > Adv: Termite Inspections! Jenny Moyer welcomes
> > you to her HomeFinderResource.com TM A "MUST See …
> >
> > I'm receiving the following error when I try to print feedparser's
> > parsed output for the above text:
> >
> > UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in
> > position 86: ordinal not in range(256)
> >
> > Why is this happening and where does the problem lie?
>
> It seems that the Unicode character U+201C, "LEFT DOUBLE QUOTATION
> MARK", isn't part of the latin-1 charset.
>
> Try encoding the feedparser output to UTF-8 instead, or
> use the "replace" option of the encode() method.
>
>  >>> c = u'\u201c'
>  >>> c
> u'\u201c'
>  >>> c.encode('utf-8')
> '\xe2\x80\x9c'
>  >>> print c.encode('utf-8')
>
> OK, let's try "replace":
>
>  >>> c.encode('latin-1', 'replace')
> '?'
>
> Using "replace" will not raise an error, but it will replace
> the offending character with a question mark.
> 
> HTH.
> 
> -- 
> deelan 
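
The two suggestions above can be compared side by side. This is a minimal sketch, not feedparser code: encode_for_output is a hypothetical helper, and the sample string is an illustrative stand-in for the feed text. Besides "replace", it also shows the standard "xmlcharrefreplace" error handler, which keeps the character recoverable as an XML character reference.

```python
def encode_for_output(text, encoding='utf-8', errors='strict'):
    """Encode unicode `text` to bytes for printing or writing.

    Hypothetical helper; it only wraps the built-in encode() method.
    """
    return text.encode(encoding, errors)


# Illustrative sample containing U+201C, not the actual feed content.
sample = u'A \u201cMUST See'

# UTF-8 can represent every Unicode character, so nothing is lost.
as_utf8 = encode_for_output(sample)

# latin-1 with 'replace' substitutes '?' for characters it cannot encode.
as_latin1 = encode_for_output(sample, 'latin-1', 'replace')

# 'xmlcharrefreplace' writes the code point as a decimal XML reference.
as_xmlref = encode_for_output(sample, 'latin-1', 'xmlcharrefreplace')

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(as_xmlref))
```

UTF-8 is the lossless choice when the terminal or output file can handle it. The error handlers are a fallback for output that must stay in latin-1.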
