> Upon searching for – in google, I came up with this:
> http://www.siber-sonic.com/mac/charsetstuff/Soniccharset.html
The character table definitely helps. Thanks.

Some additional googling suggests that I need to unescape HTML entities. I'm planning to try the approach below from Fredrik Lundh, which relies on the "re" and "htmlentitydefs" modules:

http://effbot.org/zone/re-sub.htm#unescape-html

I'll report back with my results. In the meantime, I welcome any other suggestions.

Thanks!
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
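
For reference, here is a minimal sketch of what an entity-unescaping routine built on "re" plus an entity table might look like. It is not the code from the linked page, only an illustration of the general idea, and it assumes Python 3, where "htmlentitydefs" is available as "html.entities". The names `unescape`, `fixup`, and `_ENTITY_RE` are my own placeholders.

```python
import re
from html.entities import name2codepoint  # Python 3 name for htmlentitydefs

# Matches numeric references (&#8211; or &#x2013;) and named entities (&ndash;).
_ENTITY_RE = re.compile(r"&(#[xX]?[0-9A-Fa-f]+|\w+);")


def unescape(text):
    """Replace HTML character references in text with the characters they name."""

    def fixup(match):
        body = match.group(1)
        try:
            if body[:2] in ("#x", "#X"):
                # Hexadecimal numeric reference, e.g. &#x2013;
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                # Decimal numeric reference, e.g. &#8211;
                return chr(int(body[1:]))
            # Named entity, e.g. &ndash;
            return chr(name2codepoint[body])
        except (KeyError, ValueError, OverflowError):
            # Unknown or malformed reference: leave the original text alone.
            return match.group(0)

    return _ENTITY_RE.sub(fixup, text)


if __name__ == "__main__":
    print(unescape("pages 10&#8211;12 &amp; 14&ndash;16"))
```

On Python 3.4 and later, the standard library's `html.unescape()` covers the same ground, so a hand-rolled version like this is mainly useful for seeing how the substitution works.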