> * Saying "turn on buffering" is, IMHO, a reasonable solution if you
>   can make buffering the default in PHP under httpd-2.0.  Otherwise,
>   you'll surprise a lot of users who have been running with the default
>   non-buffered output using 1.3 and find that all their applications
>   are far slower with 2.0.
We could turn on buffering for 2.0.  I just verified that this does
indeed create a single 1024-byte bucket for my 1024-byte file test
case.  And combined with compiling PHP non-threaded for the prefork
MPM, the result is:

Concurrency Level:      5
Time taken for tests:   115.406395 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      63250000 bytes
HTML transferred:       51200000 bytes
Requests per second:    433.25 [#/sec] (mean)
Time per request:       11.541 [ms] (mean)
Time per request:       2.308 [ms] (mean, across all concurrent requests)
Transfer rate:          535.21 [Kbytes/sec] received

That's up from 397 requests/second, but still nowhere near the 615
requests/second for Apache 1.3.  And doing this buffering internally in
PHP and then again in Apache doesn't seem efficient to me; the numbers
would seem to reflect this inefficiency.

> * A better solution, though, would be to have the PHP filter generate
>   flush buckets (in nonbuffered mode) only when it reaches a "<%" or
>   "%>".  I.e., if the input file has 20KB of static text before the
>   first embedded script, send that entire 20KB in a bucket, and don't
>   try to split it into 400-byte segments.  If mod_php is in nonbuffered
>   mode, send an apr_bucket_flush right after it.  (There's a precedent
>   for this approach: one of the ways in which we managed to get good
>   performance from mod_include in 2.0 was to stop trying to split large
>   static blocks into small chunks.  We were originally concerned about
>   the amount of time it would take for the mod_include lexer to run
>   through large blocks of static content, but it hasn't been a problem
>   in practice.)
>
> From a mod_php perspective, would either of those be a viable solution?

I think Andi is working on this.  But, just to test the theory, I
modified the PHP lexer to use larger chunks -- 1024 bytes in this case.
So the 1k.php test case, which looks like this:

<html>
<head><title>Test Document.</title>
<body>
<h1>Test Document.</h1>
<p>
<?='This is a 1024 byte HTML file.'?><br />
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa<br />
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb<br />
cccccccccccccccccccccccccccccc<br />
dddddddddddddddddddddddddddddd<br />
eeeeeeeeeeeeeeeeeeeeeeeeeeeeee<br />
ffffffffffffffffffffffffffffff<br />
gggggggggggggggggggggggggggggg<br />
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhh<br />
iiiiiiiiiiiiiiiiiiiiiiiiiiiiii<br />
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjj<br />
kkkkkkkkkkkkkkkkkkkkkkkkkkkkkk<br />
llllllllllllllllllllllllllllll<br />
mmmmmmmmmmmmmmmmmmmmmmmmmmmmmm<br />
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnn<br />
oooooooooooooooooooooooooooooo<br />
pppppppppppppppppppppppppppppp<br />
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqq<br />
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrr<br />
ssssssssssssssssssssssssssssss<br />
tttttttttttttttttttttttttttttt<br />
uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu<br />
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvv<br />
wwwwwwwwwwwwwwwwwwwwwwwwwwwwww<br />
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx<br />
</p>
</body>
</html>

was split up into 3 buckets:

1. 78 bytes containing:

   <html>
   <head><title>Test Document.</title>
   <body>
   <h1>Test Document.</h1>
   <p>

2. 30 bytes containing (because this part was dynamically generated):

   This is a 1024 byte HTML file.

3. A 916-byte bucket containing the rest of the static text.

Result:

Concurrency Level:      5
Time taken for tests:   124.456357 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      63250000 bytes
HTML transferred:       51200000 bytes
Requests per second:    401.75 [#/sec] (mean)
Time per request:       12.446 [ms] (mean)
Time per request:       2.489 [ms] (mean, across all concurrent requests)
Transfer rate:          496.29 [Kbytes/sec] received

So this is slower than the single 1024-byte bucket, and actually also
slower than the 400-byte case, which makes it an invalid test.  There
are probably some other memory-allocation changes I would need to make
before this is a valid test.
-Rasmus