Re: [PHP] include selectively or globally?

2012-08-28 Thread tamouse mailing lists
What do your performance measurements show? You need actual data to
compare before you can make a valid decision.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] include selectively or globally?

2012-08-28 Thread Matijn Woudt
On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete
 halukkaram...@gmail.com wrote:

 Now, the question is... should you use a global include that points to
 this library - across the board - so that ALL the pages ( including
 the 90% that do not need the library ) will get it, or should you
 selectively add that include reference only on the pages you need?


 Since searching for files is one of the most expensive (in time)
 operations, you're probably best off with only a single PHP file.

 Maybe I misinterpreted the question, but I don't think I agree.

 If you have a 50K PHP file that's only needed in only 10% of the
 pages, then, when solely considering performance, that file should
 only be included on the 10% of the pages that actually use the file.
 Now, there are reasons where you might want to include the file
 globally (maintenance purposes, etc.) Loading the 50K of PHP code
 requires building up all of the associated infrastructure (zvals,
 etc.) for the user code (even if APC is used, the cached opcode/PHP
 bytecode still has to be parsed and built up for the user-defined
 classes and functions per request, even if they're unused), is
 certainly going to perform more slowly than selectively including the
 library on only the pages that need the library.

 Adam


First of all, I believe PHP is smart enough to not generate bytecode
for functions that are not used in the current file. Think about the
fact that you can write a function with errors, which will run fine
until you call the function. (except for syntax errors).

The speed difference between loading 5K file or 50K file (assuming
continuous blocks) is extremely small. If you split this library, you
would have PHP files that require you to load maybe 3 or 4 different
files to have all their functions. This would require 3 or 4 more file
searches, first the file needs to be located in the file table, then
on the disk. If you compare the required time for those operations,
they are enormous compared to time needed for a bigger file.
Just for the facts, if you're on a high-end server drive (15,000 RPM
with 120 MB/s throughput), you would have an average access time of
7 ms (rotational and seek time). Loading 5K at 120 MB/s thereafter
only takes 0.04 ms; 50K would take 0.4 ms. So a single 5K include saves
you 0.36 ms compared to the 50K file, but if you need 2 includes, that
costs you 6.68 ms extra, 3 would cost 13.72 ms, etc. With a 3.8 GHz CPU,
there are approx 4,000,000 clock cycles in 1 ms, so by loading 2 files
instead of one you would lose approx 26,700,000 clock cycles. Think
about what PHP could do with all those clock cycles.
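
The arithmetic above can be reproduced with a small sketch. It assumes the simple model used in this message (one fixed access time per file plus a sequential transfer); the function name read_time_ms() is hypothetical. Because nothing is rounded along the way, the results differ slightly from the rounded figures quoted above (about 6.67 ms rather than 6.68 ms for two includes).

```php
<?php
// Time to read one file from a spinning disk: one access (seek plus
// rotation) followed by a sequential transfer. The defaults are the figures
// assumed in the message above, not measurements.
function read_time_ms($sizeKb, $accessMs = 7.0, $throughputMbPerSec = 120.0)
{
    // With 1 MB taken as 1000 KB, KB divided by MB/s gives milliseconds.
    return $accessMs + $sizeKb / $throughputMbPerSec;
}

$oneLargeFile = read_time_ms(50);

foreach (array(1, 2, 3) as $includes) {
    // Positive values mean the split 5K files are slower than one 50K file.
    $extraMs = $includes * read_time_ms(5) - $oneLargeFile;
    // 3.8 GHz is 3,800,000 cycles per millisecond.
    $extraCycles = $extraMs * 3800000;
    printf("%d include(s): %+.2f ms, %+.0f cycles\n", $includes, $extraMs, $extraCycles);
}
```

The model ignores the operating system's file cache, which would usually serve repeated reads of the same file from memory.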

- Matijn




Re: [PHP] include selectively or globally?

2012-08-28 Thread David Harkness
On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote:

 First of all, I believe [A] PHP is smart enough to not generate bytecode
 for functions that are not used in the current file. Think about the
 fact that you can write a function with errors, which will run fine
 until you call the function. [B] (except for syntax errors).


 [B] negates [A]. PHP must either parse the file into opcodes or load them
from APC and further execute the top-level opcodes. That means defining
functions (not calling them unless called directly), constants, global
variables, classes, etc. No amount of measuring is required to tell me that
doing X vs. not doing X in this case clearly takes longer.

Now, is that time significant enough to warrant the extra logic required?
In my case, absolutely. We organize our library into many classes in
multiple files. By using an autoloader, we simply don't need to think about
it. Include bootstrap.php which sets up the autoloader and include paths.
Done.
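
A minimal sketch of what such a bootstrap.php might look like, using spl_autoload_register(). The lib directory, the function name register_lib_autoloader() and the mapping from class names to file paths are assumptions made for illustration; the message does not describe the actual layout.

```php
<?php
// bootstrap.php (hypothetical): registers an autoloader so that a class file
// is only read when the class is first used.
function register_lib_autoloader($baseDir)
{
    spl_autoload_register(function ($class) use ($baseDir) {
        // Assumed convention: Foo_Bar or Foo\Bar lives in <baseDir>/Foo/Bar.php.
        $relative = str_replace(array('\\', '_'), DIRECTORY_SEPARATOR, $class) . '.php';
        $file = $baseDir . DIRECTORY_SEPARATOR . $relative;
        if (is_file($file)) {
            require $file;
        }
    });
}

// Assumed location of the library classes, relative to this file.
register_lib_autoloader(__DIR__ . DIRECTORY_SEPARATOR . 'lib');
```

With this in place, pages that never reference a library class never load its file, and pages that do need it require no explicit include.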

In the case with a single 50k library file that is used on 10% of the
pages, I'd absolutely require_once it only in the pages that need it
without measuring the performance. It's so trivial to maintain that single
include in individual pages that the gain on 90% of the pages is not worth
delving deeper.

Peace,
David


Re: [PHP] include selectively or globally?

2012-08-28 Thread Adam Richardson
On Tue, Aug 28, 2012 at 7:39 AM, Matijn Woudt tijn...@gmail.com wrote:
 On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete
 halukkaram...@gmail.com wrote:

 First of all, I believe PHP is smart enough to not generate bytecode
 for functions that are not used in the current file. Think about the
 fact that you can write a function with errors, which will run fine
 until you call the function. (except for syntax errors).

I believe this is untrue. PHP generates the bytecode and then parses
the bytecode per request to generate the userland infrastructure,
including classes and functions, for the entire include file. During
the generation of bytecode, PHP doesn't know a priori which functions
will be called at runtime. I suspect if you asked for confirmation of
this on the internals list, they'd confirm this. In terms of errors,
there are certainly different stages at which errors can occur, and what
you're referring to are runtime errors. Runtime errors don't
necessarily show up in every possible execution branch. That doesn't
mean that PHP didn't generate the code for the userland functionality.

 The speed difference between loading 5K file or 50K file (assuming
 continuous blocks) is extremely small. If you split this library, you
 would have PHP files that require you to load maybe 3 or 4 different
 files to have all their functions.

Here's where I believe we have a communication issue. I never spoke of
splitting up the library into 3 or 4, or any number of different
files. The opening post states that only 10% of the pages need the
library. I suggested that he only include the library in the 10% of
the pages that need the library. That said, it's possible I
misinterpreted him.

I will say that I do disagree with your analysis that the difference
between loading a 5K or 50K PHP file is extremely small. So I just put
this to the test.

I created a 5K file and a 50K file, both of which have the form:

function hello1(){
    echo "hello again";
}

function hello2(){
    echo "hello again";
}

etc.

I have Xdebug installed and APC running; I warmed the caches and then
ran the test a few times. The results all hover around the following:

Including the 5K file takes around 50 microseconds. Including the 50K
file takes around 180 microseconds. The point is that there is a
significant difference due to the work PHP has to do behind the
scenes, even when functions (or classes, etc.) are unused. And,
relevant to the dialog in this thread, not including an
unused 50K PHP file on 90% of the pages (the pages that don't need the
library) will lead to a real difference.
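
Test files of the form described above could be generated with a sketch like the following. The function name generate_hello_source(), the $prefix parameter and the output location (the system temp directory) are assumptions for illustration; the message does not say how the original files were produced.

```php
<?php
// Builds PHP source made of tiny functions (<prefix>1, <prefix>2, ...) until
// it reaches at least $targetBytes. The hypothetical $prefix keeps function
// names unique if several generated files are included in one request.
function generate_hello_source($targetBytes, $prefix = 'hello')
{
    $source = "<?php\n";
    $i = 1;
    while (strlen($source) < $targetBytes) {
        $source .= 'function ' . $prefix . $i . "(){\n    echo \"hello again\";\n}\n\n";
        $i++;
    }
    return $source;
}

$dir = sys_get_temp_dir();
file_put_contents($dir . DIRECTORY_SEPARATOR . '5k.php', generate_hello_source(5 * 1024, 'small'));
file_put_contents($dir . DIRECTORY_SEPARATOR . '50k.php', generate_hello_source(50 * 1024, 'large'));
```

Files built this way contain far more function declarations per kilobyte than typical library code, which affects how the timings generalize.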

Adam

-- 
Nephtali:  A simple, flexible, fast, and security-focused PHP framework
http://nephtaliproject.com




Re: [PHP] include selectively or globally?

2012-08-28 Thread Matijn Woudt
On Tue, Aug 28, 2012 at 6:55 PM, David Harkness
davi...@highgearmedia.com wrote:
 On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote:

 First of all, I believe [A] PHP is smart enough to not generate bytecode
 for functions that are not used in the current file. Think about the
 fact that you can write a function with errors, which will run fine
 until you call the function. [B] (except for syntax errors).


  [B] negates [A]. PHP must either parse the file into opcodes or load them
 from APC and further execute the top-level opcodes. That means defining
 functions (not calling them unless called directly), constants, global
 variables, classes, etc. No amount of measuring is required to tell me that
 doing X vs. not doing X in this case clearly takes longer.

[B] does not negate [A]. There's a difference between parsing the
syntax and defining functions, classes, constants and globals, and
generating bytecode. In a 'normal' file I guess syntax definitions are
only about 5% of the total contents, the rest can be ignored until
being called.


 Now, is that time significant enough to warrant the extra logic required? In
 my case, absolutely. We organize our library into many classes in multiple
 files. By using an autoloader, we simply don't need to think about it.
 Include bootstrap.php which sets up the autoloader and include paths. Done.

 In the case with a single 50k library file that is used on 10% of the pages,
 I'd absolutely require_once it only in the pages that need it without
 measuring the performance. It's so trivial to maintain that single include
 in individual pages that the gain on 90% of the pages is not worth delving
 deeper.

 Peace,
 David


Let me quote the OP, I think that suffices:
When answering this question, please approach the matter strictly from
a caching/performance point of view, not from a convenience point of
view just to avoid that the discussion shifts to a programming style
and the do's and don'ts.

- Matijn




Re: [PHP] include selectively or globally?

2012-08-28 Thread Matijn Woudt
On Tue, Aug 28, 2012 at 7:18 PM, Adam Richardson simples...@gmail.com wrote:
 On Tue, Aug 28, 2012 at 7:39 AM, Matijn Woudt tijn...@gmail.com wrote:
 On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com 
 wrote:
 On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete
 halukkaram...@gmail.com wrote:

 First of all, I believe PHP is smart enough to not generate bytecode
 for functions that are not used in the current file. Think about the
 fact that you can write a function with errors, which will run fine
 until you call the function. (except for syntax errors).

 I believe this is untrue. PHP generates the bytecode and then parses
 the bytecode per request to generate the userland infrastructure,
 including classes and functions, for the entire include file. During
 the generation of bytecode, PHP doesn't know a priori which functions
 will be called at runtime. I suspect if you asked for confirmation of
 this on the internals list, they'd confirm this. In terms of errors,
 there are certainly different stages that errors can occur, and what
 you're referring to are runtime errors. Runtime errors don't
 necessarily show up in every possible execution branch. That doesn't
 mean that PHP didn't generate the code for the userland functionality.

 The speed difference between loading 5K file or 50K file (assuming
 continuous blocks) is extremely small. If you split this library, you
 would have PHP files that require you to load maybe 3 or 4 different
 files to have all their functions.

 Here's where I believe we have a communication issue. I never spoke of
 splitting up the library into 3 or 4, or any number of different
 files. The opening post states that only 10% of the pages need the
 library. I suggested that he only include the library in the 10% of
 the pages that need the library. That said, it's possible I
 misinterpreted him.

I interpreted it as: I have a 50K library, and some files only use
10%, some use 20% and some 30%. To be able to include it separately,
you would need to split and some would need to include maybe 3 or 4
files.


 I will say that I do disagree with your analysis that difference
 between loading a 5K or 50K php file is extremely small. So I just put
 this to the test.

 I created a 5K file and a 50K file, both of which have the form:

 function hello1(){
     echo "hello again";
 }

 function hello2(){
     echo "hello again";
 }

 etc.

 I have Xdebug installed and APC running; I warmed the caches and then
 ran the test a few times. The results all hover around the following:

 Including the 5K file takes around 50 microseconds. Including the 50K
 file takes around 180 microseconds. The point is that there is a
 significant difference due to the work PHP has to do behind the
 scenes, even when functions (or classes, etc.) are unused. And,
 relevant to the dialog in this thread, not including an
 unused 50K PHP file on 90% of the pages (the pages that don't need the
 library) will lead to a real difference.

 Adam

Finally, you're the first one who has actually measured something.
You should redo your test with real-world files, because in the real
world functions aren't that small.
With longer functions (say ~100 lines per function), you'll see
a different ratio between 5K and 50K. In my tests it is:
- 5K: 22 ms
- 50K: 34 ms

When I create files that contain only 1 function, filled with
echo "Hello world"; lines until the file reaches 5K or 50K, the results are:
- 5K: 15 ms
- 50K: 17 ms


Cheers,

Matijn

Ps. Code used:
<?php

$time_start = microtime(true);

include '5k.php'; // 5k.php or 50k.php

$time_end = microtime(true);
echo ($time_end - $time_start) . "s";

?>

System specs:
Ubuntu 12.04 LTS with Apache 2.2.22 and PHP 5.3.10 default config with
no cache etc.
AMD Phenom X4 9550 (2.2GHz)
4 GB DDR2-800
Disk holding the PHP files: WD 500GB with an average read speed of 79.23
MB/s (as measured with hdparm)
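
A single timed include is noisy, so one possible refinement is to time several files and take the median. The sketch below is an assumption about how that could be done, not the method used in this thread; median_include_ms() and the generated file names are hypothetical. Each timed file has to declare differently named functions, because including two files that define the same function is a fatal error.

```php
<?php
// Includes each file once and returns the median include time in
// milliseconds, or null when no files are given.
function median_include_ms(array $files)
{
    $timings = array();
    foreach ($files as $file) {
        $start = microtime(true);
        include $file;
        $timings[] = (microtime(true) - $start) * 1000.0;
    }
    $n = count($timings);
    if ($n === 0) {
        return null;
    }
    sort($timings);
    $mid = (int) floor($n / 2);
    return ($n % 2) ? $timings[$mid] : ($timings[$mid - 1] + $timings[$mid]) / 2;
}

// Usage: five small generated copies, each with a unique function name.
$files = array();
for ($i = 0; $i < 5; $i++) {
    $path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'bench_copy_' . $i . '_' . uniqid() . '.php';
    file_put_contents($path, "<?php\nfunction bench_copy_" . $i . "_fn(){\n    echo \"Hello world\";\n}\n");
    $files[] = $path;
}
printf("median include time: %.3f ms\n", median_include_ms($files));
array_map('unlink', $files);
```

Freshly written files are likely to be served from the operating system's file cache, so this measures parse and compile cost more than disk access.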




Re: [PHP] include selectively or globally?

2012-08-28 Thread Adam Richardson
On Tue, Aug 28, 2012 at 3:28 PM, Matijn Woudt tijn...@gmail.com wrote:
 On Tue, Aug 28, 2012 at 7:18 PM, Adam Richardson simples...@gmail.com wrote:

 Finally, you're the first one that actually has measured something.
 You should redo your test with real world files, because in real world
 functions aren't that small.

In terms of redoing the test with real world files, that's an
entirely different debate (and one I won't enter into at this time,
though this list has discussed this topic before, most recently in a
post Ted made talking about screen height.)

The point is, there is a real difference. The question remains if the
difference is enough to act on in future code bases (and I would say
yes if my tests showed this difference, you may say no.)

 In functions with more lines (say ~100 lines per function), you'll see
 a different ratio between 5k and 50k. In my tests it is:
 - 5K: 22ms
 - 50K: 34 ms

Those trends/results depend significantly on the contents of the
functions, too. The overly simplistic example we've used both helps
and hurts the analysis (I'll admit my example likely has more
functions than other 5K/50K files, and I suspect most functions
require more complicated work behind the scenes to build up than echo
statements.)

The point I'd make here is that it's very difficult to have a priori
knowledge of how something will perform without testing it.

 When I create files that only contain 1 function, with just a number
 of echo "Hello world"; lines until 5k or 50k, the results are:
 - 5K: 15 ms
 - 50K: 17 ms

Ummm... sure. What did you say about real world before :)

Have a nice day!

Adam





Re: [PHP] include selectively or globally?

2012-08-28 Thread David Harkness
On Tue, Aug 28, 2012 at 12:11 PM, Matijn Woudt tijn...@gmail.com wrote:

 On Tue, Aug 28, 2012 at 6:55 PM, David Harkness
 davi...@highgearmedia.com wrote:
  On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote:
 
  First of all, I believe [A] PHP is smart enough to not generate bytecode
  for functions that are not used in the current file. Think about the
  fact that you can write a function with errors, which will run fine
  until you call the function. [B] (except for syntax errors).
 
   [B] negates [A]. PHP must either parse the file into opcodes or load
 them
  from APC and further execute the top-level opcodes. That means defining
  functions (not calling them unless called directly), constants, global
  variables, classes, etc.

 [B] does not negate [A]. There's a difference between parsing the
 syntax and defining functions, classes constants and globals, and
 generating bytecode. In a 'normal' file I guess syntax definitions are
 only about 5% of the total contents, the rest can be ignored until
 being called.


I won't claim a deep understanding of the PHP internals, but I have enough
experience with varied compiled and interpreted languages and using PHP and
APC that I'm confident that the process to include a file involves:

1. Load the opcodes
   A. Either read the file from disk and parse the PHP into opcodes, or
   B. Load the cached opcodes from APC.
2. Execute the top-level opcodes

Any syntax errors--even those in unreachable code blocks--will cause the
script to fail parsing. For example,

if (false) {
    function foo() {
        SYNTAX ERROR!
    }
}

will cause the parse to fail even though the function cannot logically be
defined. PHP doesn't even get that far.

PHP Parse error:  syntax error, unexpected T_STRING in php shell code
on line 3
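
The distinction between compile-time and runtime errors can be checked with a sketch like the one below. It assumes PHP 7 or later, where eval() throws a ParseError on invalid syntax (the PHP 5.3 discussed in this thread behaves differently), and the helper name compiles() and the *_demo function names are hypothetical.

```php
<?php
// Returns true when the given code compiles, false when it has a syntax
// error. Requires PHP 7+, where eval() throws ParseError.
function compiles($code)
{
    try {
        eval($code);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

// A syntax error inside a block that can never run: compilation still fails.
$unreachableSyntaxError = 'if (false) { function foo_demo() { SYNTAX ERROR! } }';

// A call to a nonexistent function inside a function that is never called:
// this compiles, and would only fail at runtime if bar_demo() were invoked.
$uncalledRuntimeError = 'function bar_demo() { return no_such_function_demo(); }';

var_dump(compiles($unreachableSyntaxError));
var_dump(compiles($uncalledRuntimeError));
```

The first snippet is rejected even though its function could never be defined, while the second is accepted because the missing function is only looked up when the call executes.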


 When answering this question, please approach the matter strictly from
 a caching/performance point of view, not from a convenience point of
 view just to avoid that the discussion shifts to a programming style
 and the do's and don'ts.


While out of convenience you might be tempted to include the file in every
script, when considering performance alone you should include the file only
in those scripts that will make use of its contents.

Peace,
David


[PHP] include selectively or globally?

2012-08-27 Thread Haluk Karamete
With this question, I aim to understand the inner workings of PHP a
little better.

Assume that you've got a 50K library. The library is loaded with a bunch
of handy functions that you use here and there. Also assume that these
functions are needed/used by say 10% of the pages of your site. But
your home page definitely needs it.

Now, the question is... should you use a global include that points to
this library - across the board - so that ALL the pages ( including
the 90% that do not need the library ) will get it, or should you
selectively add that include reference only on the pages you need?

Before answering this question, let me point out why I am asking it...

When you include that reference, PHP may be caching it. So the
performance hit I worry about may be a one-time deal, as opposed to an
every-time one. Once that one time is out of the way, subsequent loads may
not be as bad as one might think. That's all because of the smart caching
mechanisms that PHP deploys - which I do not have a deep knowledge of,
hence the question...

Since the front page needs that library anyway, the argument could be
why not keep that library warm and fresh in the memory and get it
served across the board?
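
Whether compiled code is kept between requests depends on an opcode cache being installed; without one, PHP compiles included files again on every request. A rough way to check is sketched below. It covers APC (the cache mentioned in this thread) and Zend OPcache (bundled with PHP 5.5 and later); the function name opcode_cache_status() is hypothetical.

```php
<?php
// Reports which opcode caches appear to be available in this PHP process.
function opcode_cache_status()
{
    return array(
        // APC, the opcode cache discussed in this thread.
        'apc_loaded' => extension_loaded('apc'),
        // Zend OPcache, bundled with PHP 5.5 and later.
        'opcache_loaded' => extension_loaded('Zend OPcache'),
        // ini_get() returns false when the setting does not exist.
        'opcache_enabled' => (bool) ini_get('opcache.enable'),
    );
}

print_r(opcode_cache_status());
```

On the command line OPcache is governed by the separate opcache.enable_cli setting, so the result under a web server may differ from a CLI run.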

When answering this question, please approach the matter strictly from
a caching/performance point of view, not from a convenience point of
view just to avoid that the discussion shifts to a programming style
and the do's and don'ts.

Thank you

http://stackoverflow.com/questions/12148966/include-selectively-or-globally




Re: [PHP] include selectively or globally?

2012-08-27 Thread Matijn Woudt
On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete
halukkaram...@gmail.com wrote:
 With this question, I aim to understand the inner workings of PHP a
 little better.

 Assume that you got a 50K library. The library is loaded with a bunch
 of handy functions that you use here and there. Also assume that these
 functions are needed/used by say 10% of the pages of your site. But
 your home page definitely needs it.

 Now, the question is... should you use a global include that points to
 this library - across the board - so that ALL the pages ( including
 the 90% that do not need the library ) will get it, or should you
 selectively add that include reference only on the pages you need?

 Before answering this question, let me point why I ask this question...

 When you include that reference, PHP may be caching it. So the
 performance hit I worry may be one time deal, as opposed to every
 time. Once that one time out of the way, subsequent loads may not be
 as bad as one might think. That's all because of the smart caching
 mechanisms that PHP deploys - which I do not have a deep knowledge of,
 hence the question...

 Since the front page needs that library anyway, the argument could be
 why not keep that library warm and fresh in the memory and get it
 served across the board?

 When answering this question, please approach the matter strictly from
 a caching/performance point of view, not from a convenience point of
 view just to avoid that the discussion shifts to a programming style
 and the do's and don'ts.

 Thank you

 http://stackoverflow.com/questions/12148966/include-selectively-or-globally

Since searching for files is one of the most expensive (in time)
operations, you're probably best off with only a single PHP file. PHP
parses a file pretty quickly at first (on load it only does a partial
syntax check), so unless you have a 100 MHz CPU with an SSD, I'd say
go with a single PHP file. If you make sure the file isn't fragmented
across your disk, it should load into memory pretty quickly. I'm not sure
whether PHP caches that much, but if you really care, take a look at
memcached or APC.

- Matijn




Re: [PHP] include selectively or globally?

2012-08-27 Thread Adam Richardson
On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote:
 On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete
 halukkaram...@gmail.com wrote:

 Now, the question is... should you use a global include that points to
 this library - across the board - so that ALL the pages ( including
 the 90% that do not need the library ) will get it, or should you
 selectively add that include reference only on the pages you need?


 Since searching for files is one of the most expensive (in time)
 operations, you're probably best off with only a single PHP file.

Maybe I misinterpreted the question, but I don't think I agree.

If you have a 50K PHP file that's needed in only 10% of the
pages, then, when solely considering performance, that file should
only be included on the 10% of the pages that actually use the file.
Now, there are reasons why you might want to include the file
globally (maintenance purposes, etc.). But loading the 50K of PHP code
requires building up all of the associated infrastructure (zvals,
etc.) for the user code (even if APC is used, the cached opcode/PHP
bytecode still has to be parsed and built up for the user-defined
classes and functions per request, even if they're unused), so it is
certainly going to perform more slowly than selectively including the
library on only the pages that need the library.

Adam

