On 29Mar2015 21:49, bruce <badoug...@gmail.com> wrote:
Doing a quick/basic pycurl test on a site and trying to convert the returned page to pure ascii.
And if the page cannot be represented in ASCII?
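To make that question concrete, here is a minimal sketch of what an ASCII conversion does with characters that ASCII lacks. The sample text is made up for illustration and is not taken from the page in question.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text: an e-acute (U+00E9) and a
# non-breaking space (U+00A0), neither of which exists in ASCII.
text = u"caf\xe9\xa0menu"

# The default ("strict") error handling refuses to encode it.
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The caller has to decide what to do with such characters.
replaced = text.encode("ascii", "replace")  # each becomes "?"
ignored = text.encode("ascii", "ignore")    # each is silently dropped
```

Either way information is lost, so "pure ASCII" is a policy decision rather than a plain conversion.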
The page has the encoding line <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">. The test uses pycurl and StringIO to fetch the page into a str.
Which StringIO? StringIO.StringIO or io.StringIO? In Python 2 the former is effectively bytes (Python 2 str) and the latter is unicode (as it is in Python 3).
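The distinction can be seen with the io module alone, which exists in Python 2.6 and later. This is only a sketch: the sample bytes are invented, and io.BytesIO stands in here for a bytes-oriented buffer.

```python
import io

# Hypothetical raw bytes as they might arrive from the network;
# 0xa0 is a non-breaking space in ISO-8859-1.
page_bytes = b"price:\xa0100"

# A bytes buffer stores the data exactly as received.
byte_buf = io.BytesIO()
byte_buf.write(page_bytes)
raw = byte_buf.getvalue()

# A text buffer only accepts unicode text and rejects raw bytes.
text_buf = io.StringIO()
try:
    text_buf.write(page_bytes)
    accepted_bytes = True
except TypeError:
    accepted_bytes = False

# Decoding first, with the charset the page declares, works.
text_buf.write(page_bytes.decode("iso-8859-1"))
text = text_buf.getvalue()
```

Knowing which kind of buffer is in use tells you whether a decode step is still needed and where it belongs.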
pycurl stuff
foo=gg.getBuffer()
-at this point, foo has the page in a str buffer.

What's happening is that the test is getting the following kind of error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 20: invalid start byte
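That error message is what you would expect if ISO-8859-1 bytes were decoded as UTF-8: byte 0xa0 is a non-breaking space in ISO-8859-1 but is not a valid start byte in UTF-8. The following is a sketch under that assumption, using made-up bytes rather than the real page, since the actual code has not been shown.

```python
# Hypothetical stand-in for the fetched page contents.
raw = b"price:\xa0100"

# Decoding as UTF-8 fails at the 0xa0 byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
    bad_position = None
except UnicodeDecodeError as exc:
    utf8_ok = False
    bad_position = exc.start  # index of the offending byte

# Decoding with the charset the page declares succeeds.
text = raw.decode("iso-8859-1")

# One possible ASCII policy: turn non-breaking spaces into plain
# spaces, then encode. This still raises if other non-ASCII remains.
ascii_bytes = text.replace(u"\xa0", u" ").encode("ascii")
```

If the real page contains other non-ASCII characters, the final encode needs an explicit error handler or a fuller substitution table.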
Please show us more of the code, preferably a complete example as small as possible to reproduce the exception. We have no idea what "gg" is or how it was obtained.
The test is using python 2.6 on redhat. I've tried different decode functions based on different sites/articles/stackoverflow but can't quite seem to resolve the issue.
Flailing about on stackoverflow sounds a bit random. Have you consulted the PycURL documentation, especially this page:
  http://pycurl.sourceforge.net/doc/unicode.html
which looks like it ought to discuss your problem?

Cheers,
Cameron Simpson <c...@zip.com.au>