Thanks for all the replies. I just got in to work, so I haven't tried
any of them yet. I see that I wasn't as clear as I should have been, so
I'll clarify a little. I'm grabbing some data from MSN's RSS feed.
Here's an example:
http://search.msn.com/results.aspx?q=domain+name&format=rss&FORM=ZZRE

The string ' all domain name extensions     » Good' is where I have a
problem. The '    »' shows up as 'Â  Â  Â»' when I write it to a file or
stick it in MySQL. I did a hex dump and this is what I see.

[EMAIL PROTECTED]:~/scripts> cat test.txt
extensions     » Good
[EMAIL PROTECTED]:~/scripts> xxd test.txt
0000000: 6578 7465 6e73 696f 6e73 20c2 a020 c2a0  extensions .. ..
0000010: 20c2 bb20 476f 6f64 0a                    .. Good

One thing that jumps out is that two of the Â's are c2a0, but one of
them is c2bb. Well, those are the details since I wasn't clear before.
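
For what it's worth, c2a0 is the UTF-8 encoding of U+00A0 (no-break
space) and c2bb is the UTF-8 encoding of U+00BB (»). Below is a minimal
sketch of how the same bytes read under two different decodings. It
assumes Python 3, which is not necessarily what my script runs on, and
the variable names are only for illustration. The bytes are copied from
the xxd output above.

```python
# Bytes copied from the xxd dump in this post (25 bytes in total).
raw = bytes.fromhex(
    "657874656e73696f6e73"  # "extensions"
    "20c2a020c2a020c2bb20"  # space, c2a0, space, c2a0, space, c2bb, space
    "476f6f640a"            # "Good" plus newline
)

# Decoded as UTF-8, each c2xx pair becomes a single character:
# c2a0 -> U+00A0 (no-break space), c2bb -> U+00BB (right guillemet).
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1, every byte becomes its own character, so each
# c2 byte turns into U+00C2 (capital A with circumflex).
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

If that is what is happening, the data itself would be fine and only
the decoding step (the file viewer or the database connection) would be
treating UTF-8 bytes as Latin-1. That is a guess on my part until I
test it.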

--
http://mail.python.org/mailman/listinfo/python-list
