Take this script:

 <? include '/home/rasmus/foo/u2.inc' ?>

This is a simple include with an absolute pathname, so there is no
include_path searching, and all safe_mode and open_basedir checking is
turned off.  (Those checks, of course, are an absolute killer.)  Even so,
we still need a lot of system calls to handle this include:

getcwd("/home/rasmus", 4096)            = 13
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/rasmus", {st_mode=S_IFDIR|0777, st_size=45056, ...}) = 0
lstat64("/home/rasmus/foo", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/rasmus/foo/u2.inc", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/home/rasmus/foo/u2.inc", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4002c000
_llseek(3, 0, [0], SEEK_CUR)            = 0
lseek(3, 0, SEEK_SET)                   = 0

The 4 lstat64() calls here come from realpath(), which stats every
directory leading up to the filename, and the file itself, to check
whether it is a symlink, so that we can normalize the filename before
adding it to the included files list.  That is one lstat64() per path
component, so the cost grows with the depth of the path.  It's handy and
nice to normalize the path here, but it is really expensive!  And this
overhead doesn't go away with opcode caches either, as they still need to
go through this step to determine whether the included file has been
cached or not.  I also don't see the point of those three fstat64()
calls, which I guess are a result of the sanity checking we do after the
fopen.
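
To make the per-component cost concrete, here is a minimal sketch in
Python (not the engine's C code) that lists the path prefixes a
realpath()-style walk has to check for symlinks.  The function name
components_to_lstat is made up for illustration.

```python
def components_to_lstat(path):
    """Return each prefix of an absolute path, shallowest first.

    A realpath()-style walk has to check every one of these prefixes
    to find out whether it is a symlink.
    """
    parts = [p for p in path.split("/") if p]
    prefixes = []
    current = ""
    for part in parts:
        current = current + "/" + part
        prefixes.append(current)
    return prefixes


# One line per lstat a realpath()-style walk would issue for this path.
for prefix in components_to_lstat("/home/rasmus/foo/u2.inc"):
    print("lstat", prefix)
```

For /home/rasmus/foo/u2.inc this yields four prefixes, matching the four
lstat64() calls in the trace; for a file directly under / it yields one.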

With our current implementation, if you have a lot of includes, you
really should put all your include files in /, where realpath() has only
a single path component to check, and you will see some impressive
improvements.  This is annoying, but I am not sure how to fix it.  A
couple of things crossed my mind:

 1. Don't normalize the path names.  Instead, do a single stat on the
    file, record its device and inode, and use that pair to match it up
    later when checking for multiple inclusion.  The various opcode
    caches could use the same pair as a cache key.

 2. Add a fast_include directive, or something equally lame-sounding, 
    which skips the getcwd(), realpath() and sanity checks and just
    includes the file.

 3. For absolute paths, don't call realpath() to normalize; just strip
    out multiple /'s and any .'s, or whatever else we think is
    necessary, to normalize it short of having to stat every single dir
    leading up to the file.

 4. Do some sort of caching on this info so it doesn't happen on every 
    include on every request.
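
Options 1, 3 and 4 above could look roughly like the following.  This is
a rough sketch in Python, assuming a POSIX filesystem, and not a proposal
for the actual engine code; inode_key, IncludeRegistry, lexical_normalize
and cached_realpath are all made-up names.  The lexical version
deliberately leaves ".." segments alone, since resolving them without
looking at the filesystem can give the wrong answer when symlinks are
involved.

```python
import functools
import os


def inode_key(path):
    """Option 1: identify a file by (device, inode) from a single stat()."""
    st = os.stat(path)  # follows symlinks, so aliases resolve to one key
    return (st.st_dev, st.st_ino)


class IncludeRegistry:
    """Tracks already-included files by (device, inode), not by path."""

    def __init__(self):
        self._seen = set()

    def include_once(self, path):
        """Return True the first time a file is seen, False afterwards."""
        key = inode_key(path)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def lexical_normalize(path):
    """Option 3: drop empty and '.' segments without touching the disk."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    parts = [p for p in path.split("/") if p not in ("", ".")]
    return "/" + "/".join(parts)


@functools.lru_cache(maxsize=1024)
def cached_realpath(path):
    """Option 4: remember the resolved path so the walk happens once."""
    return os.path.realpath(path)
```

Each approach has a trade-off: the (device, inode) key treats hard links
as the same file, the lexical version does not notice symlinks at all,
and the cache goes stale if a symlink is repointed while the process is
running.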

-Rasmus


-- 
PHP Development Mailing List <http://www.php.net/>
