[Bug 605543] Re: UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data

Foppe Hemminga Sat, 07 Aug 2010 08:36:09 -0700

The 'latin1' part in my proposed code
(/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py, line 64)
should be 'utf8':


   m["text"] = unescape(data["text"].encode("utf8"))

The rationale is as follows: htmllib.HTMLParser, which the unescape function
in /usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py (line 48)
uses, assumes unicode strings and won't guess the character encoding if they
are not. The Twitter API supports UTF-8 [1], so if the text strings aren't
manipulated along the way, they are still in UTF-8.
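
To illustrate why the codec choice matters, here is a minimal sketch. It is
written for Python 3 so it can be run today, whereas the report concerns
Python 2.6, and it does not touch gwibber itself. The sample text and the
helper try_decode_utf8 are hypothetical, made up for this illustration only.
It shows that 0xe1 is "á" in Latin-1 but only the first byte of a multi-byte
sequence in UTF-8, which is consistent with the error in the bug title.

```python
# Python 3 sketch; the sample text and helper are assumptions, not gwibber code.
text = "\u00e1rbol"  # hypothetical status text starting with "á"

latin1_bytes = text.encode("latin1")  # b'\xe1rbol'
utf8_bytes = text.encode("utf8")      # b'\xc3\xa1rbol'


def try_decode_utf8(raw):
    """Return (decoded_text, None) on success or (None, reason) on failure."""
    try:
        return raw.decode("utf8"), None
    except UnicodeDecodeError as exc:
        return None, exc.reason


# UTF-8 bytes round-trip cleanly through the UTF-8 codec.
result_utf8 = try_decode_utf8(utf8_bytes)

# A lone 0xe1 byte announces a multi-byte UTF-8 sequence that never arrives.
result_single = try_decode_utf8(latin1_bytes[:1])

# The full Latin-1 byte string is not valid UTF-8 either.
result_full = try_decode_utf8(latin1_bytes)

print(result_utf8)
print(result_single)
print(result_full)
```

The point of the sketch is only that bytes produced with one codec and read
with the other fail as soon as a non-ASCII character appears, so both ends
have to agree on UTF-8.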

[1] http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#7Encodingaffectsstatuscharactercount

-- 
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: 
unexpected end of data
https://bugs.launchpad.net/bugs/605543
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
