Simon Cross wrote:
Well, since the source for _read_chunked includes the comment

        # XXX This accumulates chunks by repeated string concatenation,
        # which is not efficient as the number or size of chunks gets big.

you might gain some speed improvement with minimal effort by gathering
the read data chunks into a list and then returning "".join(chunks) at
the end.
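
For illustration, here is a minimal sketch of that list-and-join idea. It is written for Python 3 (bytes rather than str), and read_chunked below is a simplified, hypothetical stand-in, not httplib's actual _read_chunked; it assumes fp is a blocking binary file-like object positioned at the start of a chunked body.

```python
def read_chunked(fp):
    """Decode an HTTP chunked body from a binary file-like object.

    Simplified sketch: no error handling for truncated input.
    """
    chunks = []
    while True:
        size_line = fp.readline()
        # Ignore any chunk extension after ';' and parse the hex size.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        chunks.append(fp.read(size))
        fp.read(2)  # discard the CRLF that follows each chunk's data
    # Skip optional trailer headers up to the terminating blank line.
    while True:
        line = fp.readline()
        if not line or line in (b"\r\n", b"\n"):
            break
    # One join at the end instead of repeated concatenation in the loop.
    return b"".join(chunks)
```

The only point of the sketch is the last line: each chunk is appended to a list, and the copy into one buffer happens once.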

True, I'll try that and report back. More interestingly, though, I did some analysis with Wireshark (only 200MB-odd of .pcap logs; that was fun ;-) to compare the two HTTP conversations, and noticed something else...

So, httplib does this:

GET /<blah> HTTP/1.1
Host: <blah>
Accept-Encoding: identity
Authorization: Basic <blah>

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:44:22 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Transfer-Encoding: chunked

While wget does this:

<snip 401 conversation>
GET /<blah> HTTP/1.0
User-Agent: Wget/1.11.4
Accept: */*
Host: <blah>
Connection: Keep-Alive
Authorization: Basic <blah>

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:35:19 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Connection: close

Interesting points:

- Apache in this instance responds with HTTP 1.1 even though the wget request was 1.0. Is that allowed?

- Apache responds with a chunked response only to httplib. Why is that?
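
One way to test whether the request version alone explains the difference (chunked transfer coding is only defined for HTTP/1.1, so a server should not send it to a 1.0 client) would be to make the Python client speak HTTP/1.0 as well. The following is a rough sketch for Python 3's http.client, the successor to httplib. It assumes the undocumented class attributes _http_vsn and _http_vsn_str control the request line, as they do in current CPython; HTTP10Connection and fetch are hypothetical names for this example.

```python
import http.client


class HTTP10Connection(http.client.HTTPConnection):
    # Assumption: these private attributes are not a documented API.
    # In CPython they select the version sent on the request line.
    _http_vsn = 10
    _http_vsn_str = "HTTP/1.0"


def fetch(conn_class, host, port, path="/", headers=None):
    """Return (response version, Transfer-Encoding header, body)."""
    conn = conn_class(host, port, timeout=30)
    try:
        conn.request("GET", path, headers=headers or {})
        resp = conn.getresponse()
        return resp.version, resp.getheader("Transfer-Encoding"), resp.read()
    finally:
        conn.close()
```

Calling fetch once with HTTP10Connection and once with http.client.HTTPConnection against the same host and path (passing the Authorization header in headers), then comparing the Transfer-Encoding values, should show whether the server chunks only for 1.1 requests.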

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
           - http://www.simplistix.co.uk