Edit report at http://bugs.php.net/bug.php?id=51330&edit=1
ID:               51330
Updated by:       [email protected]
Reported by:      liren dot chen at gmail dot com
Summary:          file_get_contents for url hangs
-Status:          Open
+Status:          Wont fix
Type:             Bug
Package:          HTTP related
Operating System: win and linux
PHP Version:      5.2.13

New Comment:

This looks like a problem with the Al-Jazeera Web serving infrastructure
rather than a PHP issue. Basically, it's responding to an HTTP 1.0 request
that has no Connection header with an HTTP 1.1 response carrying a
"Connection: keep-alive" header, which is counter to section 8.1.2.1 of
RFC 2616, which states:

"Clients and servers SHOULD NOT assume that a persistent connection is
maintained for HTTP versions less than 1.1 unless it is explicitly
signaled. See section 19.6.2 for more information on backward
compatibility with HTTP/1.0 clients."

It's a SHOULD NOT, not a MUST NOT, but nevertheless it doesn't line up
with normal HTTP server behaviour out there on the 'net, and there's no
support for handling keep-alive connections within the HTTP wrapper.

The problem can be worked around from within PHP by sending an explicit
Connection request header like so:

<?php
$options = array(
    'http' => array(
        'header' => 'Connection: close'
    )
);
$ctx = stream_context_create($options);
$content = file_get_contents('http://english.aljazeera.net/', false, $ctx);
?>

Since php_stream_url_wrap_http_ex() downloads the entire response from
the connection (and blocks until the socket is closed) before parsing the
HTTP response (including headers), it would require a considerable amount
of work to refactor the code to allow the Content-Length header to be
examined. This _might_ be something we could look at post-5.3, but I
think this is best closed for now.

Previous Comments:
------------------------------------------------------------------------
[2010-03-19 10:35:37] liren dot chen at gmail dot com

Description:
------------
file_get_contents("http://english.aljazeera.net/") hangs until timeout.
I debugged this and found that file_get_contents doesn't check the
"content-length" header. If you first call:

file_get_contents("http://english.aljazeera.net/", NULL, NULL, 0, 1);

then read "content-length" from the response header, and then call:

file_get_contents("http://english.aljazeera.net/", NULL, NULL, 0, the-value-from-content-length);

then it works. Apparently we should close the socket and return once we
have read content-length bytes from it.

Test script:
---------------
file_get_contents("http://english.aljazeera.net/")

------------------------------------------------------------------------
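
The two-step approach described in the report could be sketched roughly as follows. This is only an illustration, not code from the bug report or from PHP itself: the function names content_length_from_headers() and fetch_exact_length() are hypothetical, and the sketch assumes the HTTP wrapper populates $http_response_header in the calling scope after the first read, as the reporter's description implies. Whether the probe read returns promptly depends on the server and the PHP version.

```php
<?php
// Hypothetical helper: return the last Content-Length value found in a list
// of raw header lines (the shape of $http_response_header), or null if absent.
function content_length_from_headers(array $headers)
{
    $length = null;
    foreach ($headers as $line) {
        // Header names are case-insensitive. Later matches overwrite earlier
        // ones so the final response after any redirects takes precedence.
        if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $line, $m)) {
            $length = (int) $m[1];
        }
    }
    return $length;
}

// Hypothetical wrapper around the reporter's two-step approach.
function fetch_exact_length($url)
{
    // Step 1: read a single byte so the response headers become available.
    $probe = file_get_contents($url, false, null, 0, 1);
    if ($probe === false) {
        return false;
    }

    // Assumption: the wrapper has set $http_response_header in this scope.
    $length = content_length_from_headers($http_response_header);
    if ($length === null) {
        return false; // no Content-Length, so this approach does not apply
    }
    if ($length === 0) {
        return '';
    }

    // Step 2: request again and stop after exactly $length bytes.
    return file_get_contents($url, false, null, 0, $length);
}
```

Note that this issues two requests per fetch, so the "Connection: close" context option from the comment above is likely the simpler workaround when it is sufficient.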
