On Dec 22, 9:05 pm, Christian Heimes <li...@cheimes.de> wrote:
> ajaksu wrote:
>
> > That said, a "decode to declared HTTP header encoding" version of
> > urlopen could be useful to give some users the output they want (text
> > from network io) or to make it clear why bytes is the safe way.
>
> Yeah, your idea sounds both useful and feasible. A patch is welcome! :)

Would monkeypatching what urlopen returns be good enough, or should we
aim at a cleaner implementation? Glenn, does this sketch work for you?

    import socket
    from urllib.request import urlopen

    def urlopen_text(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
        response = urlopen(url, data, timeout)
        # Keep the original bytes-returning methods around.
        _readline = response.readline
        _readlines = response.readlines
        _read = response.read
        # Charset declared in the Content-Type header; None if absent.
        # Falling back to ISO-8859-1 is my assumption, not a settled choice.
        charset = response.headers.get_charsets()[0] or 'iso-8859-1'

        # Caveat: a size-limited read can stop in the middle of a
        # multi-byte character, in which case decoding will fail.
        def readline(limit=-1):
            return str(_readline(limit), encoding=charset)
        response.readline = readline

        def readlines(hint=None):
            return [str(line, encoding=charset) for line in _readlines(hint)]
        response.readlines = readlines

        def read(n=None):
            # n=None means "read everything".
            return str(_read(n), encoding=charset)
        response.read = read

        return response

Any comments or suggestions are very welcome. I could use some help
from people who know urllib on the best way to get the charset. Maybe
after some sleep I can code it in a less awful way :)

Daniel
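
P.S. On the "cleaner implementation" question: one alternative to the monkeypatch above might be to wrap the response in io.TextIOWrapper and let it do the decoding, including characters that straddle read boundaries. This is only a rough sketch, assuming the response behaves like an ordinary binary stream. The names as_text, default_charset and FakeResponse are made up for illustration, and the ISO-8859-1 fallback is an assumption. FakeResponse stands in for what urlopen returns so the example runs without a network.

```python
import io
from email.message import Message


def as_text(response, default_charset="iso-8859-1"):
    """Wrap a binary response in a text-mode reader (hypothetical helper)."""
    # get_content_charset() returns None when no charset is declared,
    # so fall back to an assumed default in that case.
    charset = response.headers.get_content_charset() or default_charset
    return io.TextIOWrapper(response, encoding=charset)


class FakeResponse(io.BytesIO):
    """Offline stand-in for a urlopen result: a byte stream plus headers."""

    def __init__(self, body, content_type):
        super().__init__(body)
        self.headers = Message()
        self.headers["Content-Type"] = content_type


body = "na\u00efve caf\u00e9\nsecond line\n".encode("utf-8")
text = as_text(FakeResponse(body, "text/html; charset=UTF-8"))
first = text.readline()   # decoded str, not bytes
rest = text.read()        # remainder of the body as str
```

With a real response the call would presumably be as_text(urlopen(url)), and readline, readlines, read and plain iteration would all return str without patching each method by hand.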