The 'latin1' in my proposed code (line 64 of
/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py) should be
'utf8':
    m["text"] = unescape(data["text"].encode("utf8"))
The rationale is as follows: htmllib.HTMLParser, as used by the unescape
function at line 48 of
/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py, assumes
unicode strings and won't guess the character encoding if they are not.
The Twitter API supports UTF-8 [1], so as long as the text strings aren't
modified along the way, they are still in UTF-8.
[1] http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#7Encodingaffectsstatuscharactercount
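For illustration, here is a minimal sketch of how the two codecs differ on a single accented character. The sample string is hypothetical and not taken from a real tweet, and it is only an assumption that gwibber hits the error this way. It shows one way a byte 0xe1 can end up in front of a UTF-8 decoder.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text: a single "a with acute accent" (U+00E1).
sample = u"\u00e1"

# Latin-1 encodes U+00E1 as the single byte 0xe1.
latin1_bytes = sample.encode("latin1")

# UTF-8 encodes the same character as the two bytes 0xc3 0xa1.
utf8_bytes = sample.encode("utf8")

# To a UTF-8 decoder, 0xe1 is the lead byte of a three-byte sequence,
# so a lone 0xe1 looks like truncated input and decoding fails.
try:
    latin1_bytes.decode("utf8")
    error_reason = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason

# Bytes produced by the 'utf8' codec decode back to the original text.
round_trip = utf8_bytes.decode("utf8")
```

Here error_reason holds the same "unexpected end of data" message as the bug title, while the 'utf8' round trip returns the original text unchanged.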
--
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0:
unexpected end of data
https://bugs.launchpad.net/bugs/605543