On Fri, 14 Nov 2008 14:57:42 +0100, Gilles Ganault wrote:

> On Fri, 14 Nov 2008 11:01:27 +0100, "Martin v. Löwis"
> <[EMAIL PROTECTED]> wrote:
>>Add
>>   print type(output)
>>here. If it says "unicode", reconsider the next line
>>
>>> print output.decode('utf-8')
>
> In case the string fetched from a web page turns out not to be Unicode
> and Python isn't happy, what is the right way to handle this, know what
> codepage is being used?

How do you fetch the data?  If you simply download it with `urllib` or
`urllib2` you never get `unicode` but ordinary `str`\s.  Then you have to
figure out the encoding by looking at the headers from the server and/or
looking at the fetched data if it contains hints.

And when ``print``\ing you should explicitly *encode* the data again
because sooner or later you will come across a `stdout` where Python can't
determine what the process at the other end expects, for example if output
is redirected to a file.

Ciao,
	Marc 'BlackJack' Rintsch
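
The steps described above can be sketched roughly like this.  It is only an
illustration, assuming Python 2.6 or later (it also runs on Python 3).  The
helper names `guess_charset`, `to_unicode` and `write_encoded`, the 2048 byte
search window and the `latin-1` fallback are made up for this example; none
of them is part of `urllib` or `urllib2`.

```python
import re
import sys

# Hypothetical helpers for illustration only.
CHARSET_IN_HEADER = re.compile(r'charset\s*=\s*["\']?([\w.:-]+)', re.I)
CHARSET_IN_BODY = re.compile(br'charset\s*=\s*["\']?([\w.:-]+)', re.I)


def guess_charset(content_type, body, default='latin-1'):
    """Return the charset from the header, else from the body, else default."""
    match = CHARSET_IN_HEADER.search(content_type or '')
    if match:
        return match.group(1)
    # Hints such as <meta ... charset=...> sit near the start of the data.
    match = CHARSET_IN_BODY.search(body[:2048])
    if match:
        return match.group(1).decode('ascii')
    return default


def to_unicode(content_type, body):
    """Decode the raw bytes received from the server."""
    charset = guess_charset(content_type, body)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: latin-1 never fails, but may be wrong.
        return body.decode('latin-1')


def write_encoded(text, stream=None, encoding='utf-8'):
    """Encode explicitly instead of relying on what Python guesses."""
    if stream is None:
        # Python 3 needs the underlying byte stream, Python 2 does not.
        stream = getattr(sys.stdout, 'buffer', sys.stdout)
    stream.write(text.encode(encoding))
```

With `urllib2` the inputs would come from something like
`response = urllib2.urlopen(url)`, then `response.read()` for the body and
`response.info().getheader('Content-Type')` for the header (`url` being
whatever page you fetch).  The regular expressions are a crude heuristic, and
a server or page can still announce the wrong encoding.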