Alexis Marrero wrote:
The next test that I will run this against will be with an obscene amount of data for which this improvement helps a lot!

The dumb part is the boundary checking.

I'm using HTTP "chunked" encoding to access a raw tape device through HTTP with Python (it GETs or POSTs the raw data as the body; each chunk corresponds to a tape block). That blazes the data through at full network speed with hardly any CPU usage, while this HTTP upload code uses 100% CPU on my 3GHz box.
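For illustration, the tape streaming side can be done with just the stdlib, something along these lines (a sketch, not the actual code; the host, device path, and block size are made up):

    import http.client

    BLOCK_SIZE = 64 * 1024  # illustrative; use the real tape block size

    def tape_blocks(path):
        """Yield raw blocks from the device; each one becomes one HTTP chunk."""
        with open(path, 'rb') as dev:
            while True:
                block = dev.read(BLOCK_SIZE)
                if not block:
                    return
                yield block

    conn = http.client.HTTPConnection('tapehost')
    # Python 3.6+: with an iterable body and encode_chunked=True, http.client
    # adds Transfer-Encoding: chunked and frames each yielded block as one chunk.
    conn.request('POST', '/tape', body=tape_blocks('/dev/nst0'),
                 encode_chunked=True)
    print(conn.getresponse().status)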

Scanning for line ends and MIME boundaries is very inefficient compared to that. They oughta have put a "content-length" into every MIME part header, the way each HTTP chunk carries its size, and we wouldn't have had this problem in the first place.
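To show why a length field makes parsing cheap, this is roughly what a reader for HTTP's chunked framing looks like (a minimal sketch; trailer handling is skipped):

    def read_chunks(rfile):
        """Yield each chunk body from a binary file over the HTTP stream."""
        while True:
            size_line = rfile.readline()              # one cheap readline per chunk
            size = int(size_line.split(b';')[0], 16)  # hex length; ignore extensions
            if size == 0:
                # skip any trailer headers up to the final blank line
                while rfile.readline() not in (b'\r\n', b'\n', b''):
                    pass
                return
            yield rfile.read(size)                    # exact-length read, no scanning
            rfile.readline()                          # consume the CRLF after the data

The reader never has to inspect the payload bytes at all; it just reads exactly as many as the header announced.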

I think the only realistic way to improve performance is to read the client input in binary chunks and then look for '\r\n--boundary' strings in each chunk using standard string functions. Most of the CPU time is currently spent in the readline() call.
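Roughly like this (a sketch of the idea; the function name and chunk size are made up, and each part gets buffered in memory):

    def iter_mime_parts(stream, boundary, chunk_size=64 * 1024):
        """Split a multipart stream on CRLF + '--' + boundary using
        bytes.find() over large binary reads instead of readline()."""
        delim = b'\r\n--' + boundary
        buf = b''
        pos = 0  # no delimiter can start before this offset
        while True:
            i = buf.find(delim, pos)
            if i >= 0:
                yield buf[:i]            # one part (the preamble comes out first)
                buf = buf[i + len(delim):]
                pos = 0
                continue
            # Everything scanned so far is clean; only the last len(delim)-1
            # bytes could hold the start of a delimiter split across two reads.
            pos = max(0, len(buf) - len(delim) + 1)
            chunk = stream.read(chunk_size)
            if not chunk:
                if buf:
                    yield buf            # trailing data, incl. the '--' terminator
                return
            buf += chunk

Each part's own headers still need parsing afterwards, but the hot loop is just bytes.find() over big reads instead of a readline() per line.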

This also means revising all the MIME body parsing to cope with that... I doubt that will be worth the effort for anyone.
