Take this script:
<? include '/home/rasmus/foo/u2.inc' ?>
So, a simple include with an absolute pathname to avoid any include_path
searching, and with all safe_mode and open_basedir checking turned off.
That checking, when enabled, is of course an absolute killer. Even so, we
still need a lot of system calls to handle this include:
getcwd("/home/rasmus", 4096) = 13
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/rasmus", {st_mode=S_IFDIR|0777, st_size=45056, ...}) = 0
lstat64("/home/rasmus/foo", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/rasmus/foo/u2.inc", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/home/rasmus/foo/u2.inc", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4002c000
_llseek(3, 0, [0], SEEK_CUR) = 0
lseek(3, 0, SEEK_SET) = 0
The four lstat64() calls here come from realpath(), which stats every
dir leading up to the filename to check whether it is a link, so that we
can normalize the filename before adding it to the included files list.
It's handy and nice to normalize the path here, but it is really really
expensive! And this overhead doesn't go away with opcode caches either,
as they still need to go through this step to determine whether the
included file has been cached or not. I also don't see the point of
those three fstat64() calls, which I guess come from the sanity checking
we do after the fopen.
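To make the cost concrete, here is a rough sketch of the prefix walk
described above. It is not the actual realpath() code; the helper name
count_prefix_lstats() is made up for illustration, and unlike a real
realpath() it ignores lstat() failures and does not follow links, so the
result depends only on how deep the path is.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() under strict -std modes */
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: lstat() every prefix of an absolute path, the way
 * a realpath()-style walk does, and return how many lstat() calls were
 * made. Failures are ignored, so the count reflects path depth only.
 * Returns -1 if the path does not fit the local buffer. */
int count_prefix_lstats(const char *abs_path)
{
    char prefix[4096];
    struct stat sb;
    size_t len, i;
    int calls = 0;

    assert(abs_path != NULL && abs_path[0] == '/');
    len = strlen(abs_path);
    if (len >= sizeof(prefix))
        return -1;

    for (i = 1; i <= len; i++) {
        /* A prefix ends at each '/' after the root and at end of string. */
        if (abs_path[i] != '/' && abs_path[i] != '\0')
            continue;
        if (abs_path[i - 1] == '/')
            continue;                  /* empty component: "//" or trailing "/" */
        memcpy(prefix, abs_path, i);
        prefix[i] = '\0';
        (void) lstat(prefix, &sb);     /* one system call per component */
        calls++;
    }
    return calls;
}
```

For /home/rasmus/foo/u2.inc this returns 4, matching the four lstat64()
lines in the trace; for a file sitting directly in / it returns 1.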
With our current implementation, every directory level in an include
path costs one lstat64(), so if you have a lot of includes you really
should put all your include files in / and you will see some impressive
improvements. This is annoying, but I am not sure how to fix it. A
couple of things crossed my mind:
1. Don't normalize the path names, but instead do a single stat on the
file and record its device and inode. Use that pair later to check
for multiple inclusion, and let the various opcode caches use it as
a cache key.
2. Add a fast_include directive, or something equally lame-sounding,
which skips the getcwd(), realpath() and sanity checks and just
includes the file.
3. For absolute paths, don't call realpath() to normalize, just strip
out multiple /'s and any .'s or whatever we think is necessary to
try to normalize it short of having to stat every single dir leading
up to the file.
4. Do some sort of caching on this info so it doesn't happen on every
include on every request.
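As a rough illustration of ideas 1 and 3 above, something along these
lines might work. This is only a sketch, not existing PHP code: struct
file_key, file_key_from_path(), file_key_equal() and
normalize_lexically() are names invented here. The lexical version
deliberately leaves ".." alone, since collapsing it without a stat gives
the wrong answer when the preceding component is a link.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Idea 1: identify a file by (device, inode) using a single stat(). */
struct file_key { dev_t dev; ino_t ino; };

int file_key_from_path(const char *path, struct file_key *key)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;                     /* file missing or not accessible */
    key->dev = sb.st_dev;
    key->ino = sb.st_ino;
    return 0;
}

int file_key_equal(const struct file_key *a, const struct file_key *b)
{
    return a->dev == b->dev && a->ino == b->ino;
}

/* Idea 3: collapse runs of '/' and drop "." components without touching
 * the filesystem. ".." components are copied through unchanged, and a
 * trailing '/' is dropped. Returns 0 on success, -1 if out is too small. */
int normalize_lexically(const char *abs_path, char *out, size_t out_size)
{
    const char *p = abs_path;
    size_t o = 0;

    assert(abs_path != NULL && abs_path[0] == '/');
    while (*p != '\0') {
        const char *start;
        size_t n;

        while (*p == '/')
            p++;                       /* skip a run of slashes */
        start = p;
        while (*p != '\0' && *p != '/')
            p++;                       /* scan one component */
        n = (size_t)(p - start);
        if (n == 0 || (n == 1 && start[0] == '.'))
            continue;                  /* empty or "." component */
        if (o + 1 + n + 1 > out_size)
            return -1;
        out[o++] = '/';
        memcpy(out + o, start, n);
        o += n;
    }
    if (o == 0) {                      /* nothing left: the path was the root */
        if (out_size < 2)
            return -1;
        out[o++] = '/';
    }
    out[o] = '\0';
    return 0;
}
```

With this, "/home//rasmus/./foo/u2.inc" normalizes to
"/home/rasmus/foo/u2.inc" with no system calls at all, while
file_key_from_path() costs exactly one stat() no matter how deep the
file is. Two different spellings of the same file, links included, end
up with equal keys.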
-Rasmus