Re: [Python-Dev] httplib and bad response chunking
[me, on 25 July]
> I have discovered other hypothetical cases of bad chunking that cause httplib to go into an infinite loop or block forever on socket.readline(). Should we worry about those cases as well, despite not having seen them happen in the wild? More annoying, I can reproduce the block forever case using a real socket, but not using the StringIO-based FakeSocket class in test_httplib.

[John J Lee]
> They have been seen in the wild :-)
>
> http://python.org/sf/1411097

Thanks -- that was really all the encouragement I needed to keep banging away at this bug.

Did you look at the crude attempt at testing for this bug that I hacked into test_httplib.py? I posted it to bug #1486335 here:

  http://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=186245&aid=1486335

The idea is simple: put various chunked responses into strings and then feed those strings to HTTPConnection. The trouble is that StringIO does not behave the same as a real socket: where HTTPResponse fails one way reading from a real socket (e.g. infinite loop), it fails differently (or not at all) reading from a StringIO. That makes testing with the FakeSocket class in test_httplib.py problematic.

Maybe the right way to test httplib is to spawn a server process (thread?) to listen on some random port, feed various HTTP responses at HTTPConnection/HTTPResponse, and see what happens. I'm not sure how to do that portably, though.

Well, I'll see if I can whip up a Unix-y solution and see if anyone knows how to make it portable.

Greg
-- 
Greg Ward [EMAIL PROTECTED] http://www.gerg.ca/
Be careful: sometimes, you're only standing on the shoulders of idiots.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
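The server-in-a-thread idea from this message can be sketched roughly as follows. This is only an illustration, written against Python 3's http.client (the successor to httplib) rather than the 2.4 branch under discussion. The helper names serve_once and fetch are hypothetical and not part of test_httplib. Binding to port 0 lets the OS pick a free port, and the client timeout keeps a hung read from blocking the test run forever.

```python
import http.client
import socket
import threading


def serve_once(raw_response: bytes):
    """Start a one-shot server on an OS-chosen port; return (port, thread)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def run():
        conn, _ = listener.accept()
        with conn:
            # Read the request headers and ignore them.
            request = b""
            while b"\r\n\r\n" not in request:
                data = conn.recv(4096)
                if not data:
                    break
                request += data
            conn.sendall(raw_response)  # send the canned bytes verbatim
        listener.close()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return port, thread


def fetch(raw_response: bytes, timeout: float = 5.0):
    """Return the decoded body, or the exception the client raised."""
    port, thread = serve_once(raw_response)
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
    try:
        client.request("GET", "/")
        return client.getresponse().read()
    except (http.client.HTTPException, OSError) as exc:
        return exc
    finally:
        client.close()
        thread.join(timeout)


HEADERS = b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"
GOOD = HEADERS + b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n0\r\n\r\n"
BAD = HEADERS + b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n"  # size of last chunk missing

if __name__ == "__main__":
    print(repr(fetch(GOOD)))
    print(repr(fetch(BAD)))
```

Because the response travels over a real loopback socket, the client sees real socket semantics (blocking reads, timeouts) instead of StringIO's immediate end-of-file. Whether the thread-based approach is portable enough for the test suite is exactly the open question raised above.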
Re: [Python-Dev] httplib and bad response chunking
On Tue, Jul 25, 2006 at 10:32:13PM -0400, Greg Ward wrote:
> what I discovered in the wild the other day was a response like this:
>
>   0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n
>
> i.e. the chunk-size for the terminating empty chunk was missing. This caused httplib.py to blow up with ValueError because it tried to call int(line, 16) assuming that 'line' contained a hex number, when in fact it was the empty string. Oops.
>
> IMHO the minimal fix is to turn ValueError into HTTPException (or a subclass thereof); httplib should not raise ValueError just because some server sends a bad response. (The server in question was Apache 2.0.52 running PHP 4.3.9 sending a big hairy error page because the database was down.)

IMNSHO httplib should be fixed and this shouldn't be an error at all, as it's in the wild and will only show up more and more in the future. Plus, file a bug with the Apache or PHP project as appropriate for sending a non-RFC-compliant response.

This is part of the good old network programming adage of being lenient in what you accept.

-g
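The two positions in this exchange (raise an HTTPException subclass, or quietly accept the missing size) can be put side by side in a small sketch. This is an illustration only, not the patch that went into httplib. BadChunkSize and parse_chunk_size are hypothetical names, and the code uses Python 3's http.client.HTTPException where the thread's httplib would have httplib.HTTPException.

```python
from http.client import HTTPException  # httplib.HTTPException in Python 2


class BadChunkSize(HTTPException):
    """Hypothetical exception name; not part of the standard library."""


def parse_chunk_size(line: bytes, lenient: bool = False) -> int:
    """Parse one chunk-size line such as b'0005\\r\\n'.

    Strict mode turns a malformed size into BadChunkSize instead of
    letting ValueError escape. Lenient mode additionally treats a blank
    size line as the terminating zero-length chunk.
    """
    field = line.split(b";", 1)[0].strip()  # drop any chunk extensions
    if not field and lenient:
        return 0  # missing size: assume the server meant "end of body"
    try:
        size = int(field, 16)
    except ValueError:
        raise BadChunkSize("bad chunk size: %r" % line) from None
    if size < 0:
        raise BadChunkSize("negative chunk size: %r" % line)
    return size
```

The lenient branch only covers the one malformation reported here, a blank line where the final chunk size should be. Any other garbage still raises, since guessing at arbitrary corruption risks silently returning a truncated body.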
[Python-Dev] httplib and bad response chunking
So I accidentally discovered the other day that httplib does not handle a particular type of mangled HTTP response very well. In particular, it tends to blow up with an undocumented ValueError when the server screws up chunked encoding. I'm not the first to discover this, either: see http://www.python.org/sf/1486335 .

<digression>
HTTP 1.1 response chunking allows clients to know how many bytes of response to expect for dynamic content, i.e. when it's not possible to include a Content-Length header. A chunked response might look like this:

  0005\r\nabcd\n\r\n0004\r\nabc\n\r\n0\r\n\r\n

which means:

  0x0005 bytes in first chunk, which is "abcd\n"
  0x0004 bytes in second chunk, which is "abc\n"

Each chunk size is terminated with \r\n; each chunk is terminated with \r\n; end of response is indicated by a chunk of 0 bytes, hence the 0\r\n\r\n at the end. Details in RFC 2616:

  http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1
</digression>

Anyways, what I discovered in the wild the other day was a response like this:

  0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n

i.e. the chunk-size for the terminating empty chunk was missing. This caused httplib.py to blow up with ValueError because it tried to call int(line, 16) assuming that 'line' contained a hex number, when in fact it was the empty string. Oops.

IMHO the minimal fix is to turn ValueError into HTTPException (or a subclass thereof); httplib should not raise ValueError just because some server sends a bad response. (The server in question was Apache 2.0.52 running PHP 4.3.9 sending a big hairy error page because the database was down.)

Where I'm getting hung up is how far to test this stuff. I have discovered other hypothetical cases of bad chunking that cause httplib to go into an infinite loop or block forever on socket.readline(). Should we worry about those cases as well, despite not having seen them happen in the wild?
More annoying, I can reproduce the block forever case using a real socket, but not using the StringIO-based FakeSocket class in test_httplib.

Anyways, I've cobbled together a crude hack to test_httplib.py that exposes the problem:

  http://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=186245&aid=1486335

Feedback welcome. (Fixing the inadvertent ValueError is trivial, so I'm concentrating on getting the tests right first.)

Oh yeah, my patch is relative to the 2.4 branch.

Greg
-- 
Greg Ward [EMAIL PROTECTED] http://www.gerg.ca/
I don't believe there really IS a GAS SHORTAGE.. I think it's all just a BIG HOAX on the part of the plastic sign salesmen -- to sell more numbers!!
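To make the chunked format and the failure concrete, here is a minimal decoder sketch. It is not httplib's implementation. It assumes the whole body is already in memory and ignores chunk extensions and trailers, and decode_chunked is a hypothetical name. Fed the two example bodies from this message, it decodes the well-formed one and hits the same int(..., 16) ValueError on the one with the missing final chunk size.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked HTTP body that is held entirely in memory."""
    pos = 0
    pieces = []
    while True:
        eol = body.index(b"\r\n", pos)   # end of the chunk-size line
        size = int(body[pos:eol], 16)    # ValueError if the size is missing
        pos = eol + 2
        if size == 0:                    # terminating zero-length chunk
            break
        pieces.append(body[pos:pos + size])
        pos += size + 2                  # skip the data and its trailing \r\n
    return b"".join(pieces)


GOOD = b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n0\r\n\r\n"
BAD = b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n"   # no size for the last chunk

if __name__ == "__main__":
    print(decode_chunked(GOOD))
    try:
        decode_chunked(BAD)
    except ValueError as exc:
        print("ValueError:", exc)
```

In the bad body, the decoder finishes the second chunk and then finds a bare \r\n where a hex size should be, so the size field it passes to int() is empty.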
Re: [Python-Dev] httplib and bad response chunking
On Tue, 25 Jul 2006, Greg Ward wrote:
[...]
> Where I'm getting hung up is how far to test this stuff.

Stop when you run out of time ;-)

> I have discovered other hypothetical cases of bad chunking that cause httplib to go into an infinite loop or block forever on socket.readline(). Should we worry about those cases as well, despite not having seen them happen in the wild? More annoying, I can reproduce the block forever case using a real socket, but not using the StringIO-based FakeSocket class in test_httplib.

They have been seen in the wild :-)

http://python.org/sf/1411097

The IP address referenced isn't under my control, and I don't know if it still provokes the error, but the problem is clear.

John
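One plausible reason the hang shows up only with a real socket is the difference in end-of-data behaviour, sketched below in Python 3. An in-memory file returns an empty result from readline() as soon as the data runs out, whereas a socket whose peer stays connected simply waits. The truncated body, the read_lines helper, and the 0.5 second timeout are illustrative assumptions, and socket.socketpair() stands in for a server that stops sending without closing the connection.

```python
import io
import socket

TRUNCATED = b"0005\r\nabcd\n\r\n"   # the stream stops before the next chunk size


def read_lines(fileobj, count):
    """Call readline() up to `count` times; record 'timeout' if a read stalls."""
    results = []
    for _ in range(count):
        try:
            results.append(fileobj.readline())
        except socket.timeout:
            results.append("timeout")
            break
    return results


# In-memory file, as with a StringIO-based fake socket: EOF arrives at once.
fake = read_lines(io.BytesIO(TRUNCATED), 4)

# Real socket whose peer stays open: the same fourth read stalls instead.
server, client = socket.socketpair()
client.settimeout(0.5)              # without a timeout this would block forever
server.sendall(TRUNCATED)
with client.makefile("rb") as stream:
    real = read_lines(stream, 4)
server.close()
client.close()

if __name__ == "__main__":
    print("in-memory:", fake)
    print("socket:   ", real)
```

The first three reads agree in both cases. Only the fourth differs, which is why a fake socket built on an in-memory file cannot reproduce a read that never returns.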