On May 10, 2008, at 1:17 AM, Brian Eaton wrote:
(Warning: not a PHP user, likely to be speaking nonsense.) It sounds
like something you're doing is screwing up the kernel's I/O cache.
Well, the kernel file system cache is what makes it survivable in the
first place. Without putting the entire features structure on disk and
reading it back on each page request, it runs at about half the speed.
Luckily file_get_contents() (the operation used to read the cache
file) already uses memory mapping techniques, so the reading of the
information should be relatively optimal.
I've got a feeling it's mostly PHP being a bit stressed about having to
absorb so much information, parse the serialized data
(serialize(<some object>) creates a JSON-like structure with class
names and values) and ram it into its internal virtual machine so
many times a second. I'd currently guess that that is where the
bottleneck is.
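
To make that guess testable, here is a rough sketch that times the
file read and the unserialize() step separately. The Feature class,
the number of entries and the round count are all made up for
illustration; they are not Shindig's actual features structure, so
treat the numbers it prints as a hint at best.

```php
<?php
// Hypothetical stand-in for one entry of the features structure.
class Feature
{
    public $name;
    public $deps;

    public function __construct($name, array $deps)
    {
        $this->name = $name;
        $this->deps = $deps;
    }
}

// Build a made-up features structure and write it to a cache file.
$features = array();
for ($i = 0; $i < 200; $i++) {
    $features["feature$i"] = new Feature("feature$i", array('core'));
}
$cacheFile = tempnam(sys_get_temp_dir(), 'features');
file_put_contents($cacheFile, serialize($features));

// The serialized form carries class names, property names and values
// as text, which PHP has to parse again on every request.
$sample = serialize(new Feature('tabs', array()));
echo $sample, "\n";

$rounds = 1000;

// Cost of reading the cache file (mostly served from the kernel cache).
$start = microtime(true);
for ($i = 0; $i < $rounds; $i++) {
    $raw = file_get_contents($cacheFile);
}
$readTime = microtime(true) - $start;

// Cost of turning the serialized text back into objects.
$start = microtime(true);
for ($i = 0; $i < $rounds; $i++) {
    $restored = unserialize($raw);
}
$parseTime = microtime(true) - $start;

printf("read: %.4fs, unserialize: %.4fs over %d rounds\n",
    $readTime, $parseTime, $rounds);
unlink($cacheFile);
```

If the unserialize figure dominates, the bottleneck is the parsing
rather than the I/O, and a faster way of reading the file would not
help much.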
What you *really* want is for each process to read from a file and
stream to a socket. The kernel will do a good job of figuring out
that those files ought to be kept in memory, so the individual
processes don't pay for any disk I/O as they work. Because the files
are being streamed in chunks it doesn't take much memory per-process
either.
I was about to type that readfile() does exactly that, and that i'd give
it a shot to see if it makes a difference, when i saw Kevin's reply
come in saying the same, so i won't repeat that :)
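
For reference, a minimal sketch of the difference between buffering
the whole file in a PHP string and letting readfile() or a chunked
loop stream it. The cache file, its contents and the function names
are invented for the example. This only applies where the cached
bytes can be sent as they are; data that has to go through
unserialize() first still needs the full string in memory.

```php
<?php
// Hypothetical cache file with made-up content, just for this sketch.
$cacheFile = tempnam(sys_get_temp_dir(), 'cache');
file_put_contents($cacheFile, str_repeat("cached payload line\n", 1000));

// Variant 1: pull the whole file into a PHP string, then echo it.
function sendBuffered($path)
{
    $data = file_get_contents($path);
    echo $data;
    return strlen($data);
}

// Variant 2: readfile() writes the file to the output without handing
// its contents to the script; it returns the number of bytes read.
function sendStreamed($path)
{
    return readfile($path);
}

// Variant 3: explicit chunked loop, holding at most $chunkSize bytes
// in a PHP variable at any time.
function sendChunked($path, $chunkSize = 8192)
{
    $sent = 0;
    $fh = fopen($path, 'rb');
    while (!feof($fh)) {
        $chunk = fread($fh, $chunkSize);
        echo $chunk;
        $sent += strlen($chunk);
    }
    fclose($fh);
    return $sent;
}

// Capture the output here only so the demo stays quiet; a real
// request would let it go straight to the client.
ob_start();
$buffered = sendBuffered($cacheFile);
$streamed = sendStreamed($cacheFile);
$chunked  = sendChunked($cacheFile);
ob_end_clean();

printf("buffered: %d, streamed: %d, chunked: %d bytes\n",
    $buffered, $streamed, $chunked);
unlink($cacheFile);
```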
Optimally i would probably prefer to have a separate (multithreaded or
socket select()'ing and chunking) daemon for this that just had all
the features stuff in memory and go from there, but that would kind of
defeat the point of having a 'simple' php implementation :)
It is not uncommon for the kernel to do a better job of
caching data from disk than an HTTP server can do.
Something i rely on religiously (including in shindig, where all its
default caching is based on this very principle). Strangely enough, a
bunch of kernel developers writing an optimized algorithm in C do a
bit of a better job than something hacked together in PHP :-)
About the 420 pages/sec vs 630 pages/sec number... what are you
counting as a page?
That's purely a synthetic, localhost benchmark using apache bench
(ab). Hence the explicit 'synthetic' (since it doesn't deal with the
'real world' of the internet), but it still gives something of a
measuring point to compare different implementations' performance with :)
-- Chris