I don't have an answer for why Python might be mis-handling the data, but wanted to make a factual correction:
[EMAIL PROTECTED] writes:

> Some web feeds use decimal character entities that seem to confuse
> Python (or me). For example, the string "doesn't" may be coded as
> "doesn&#8217;t" which should produce a right leaning apostrophe.

That character isn't a "right leaning apostrophe"; it has nothing to do with apostrophes. It is the character called "right single quotation mark" in <URL:http://www.w3.org/TR/html4/sgml/entities.html> and in Unicode (code point U+2019).

It's a typographical error to use a quotation mark as an apostrophe. Use the apostrophe character (U+0027) where an apostrophe is intended, and quotation mark characters where those are intended. This is directed, of course, at the person generating that output.

-- 
 \     “If you go to a costume party at your boss's house, wouldn't |
  `\   you think a good costume would be to dress up like the boss's |
_o__)  wife? Trust me, it's not.” —Jack Handey                       |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list
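
For anyone who wants to check which character a decimal entity such as &#8217; actually refers to, here is a minimal sketch. It assumes Python 3.4 or later, where the standard library provides html.unescape; the variable names (encoded, decoded, mark) are only illustrative and do not come from the original question. It decodes the entity and asks unicodedata for the character's official Unicode name.

```python
import html
import unicodedata

# The feed text from the example, with the decimal character entity intact.
encoded = "doesn&#8217;t"

# html.unescape turns numeric (and named) character references into characters.
decoded = html.unescape(encoded)

# The decoded character sits between the "n" and the "t" (index 5).
mark = decoded[5]

print(hex(ord(mark)))            # 0x2019
print(unicodedata.name(mark))    # RIGHT SINGLE QUOTATION MARK
print(unicodedata.name("'"))     # APOSTROPHE
```

This only identifies the character; it does not explain why a particular feed parser might mishandle it.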