Re: [PHP] include selectively or globally?
What do your performance measurements show, so that you have actual data comparisons on which to base a valid decision?

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com wrote: On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote: On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete halukkaram...@gmail.com wrote: Now, the question is... should you use a global include that points to this library - across the board - so that ALL the pages ( including the 90% that do not need the library ) will get it, or should you selectively add that include reference only on the pages you need? Since searching for files is one of the most expensive (in time) operations, you're probably best off with only a single PHP file. Maybe I misinterpreted the question, but I don't think I agree. If you have a 50K PHP file that's only needed in only 10% of the pages, then, when solely considering performance, that file should only be included on the 10% of the pages that actually use the file. Now, there are reasons where you might want to include the file globally (maintenance purposes, etc.) Loading the 50K of PHP code requires building up all of the associated infrastructure (zvals, etc.) for the user code (even if APC is used, the cached opcode/PHP bytecode still has to be parsed and built up for the user-defined classes and functions per request, even if they're unused), is certainly going to perform more slowly than selectively including the library on only the pages that need the library. Adam First of all, I believe PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. (except for syntax errors). The speed difference between loading 5K file or 50K file (assuming continuous blocks) is extremely small. If you split this library, you would have PHP files that require you to load maybe 3 or 4 different files to have all their functions. 
This would require 3 or 4 more file searches: first the file needs to be located in the file table, then on the disk. The time required for those operations is enormous compared to the extra time needed to read a bigger file.

Just for the facts: if you're on a high-end server drive (15000 RPM with 120 MB/s throughput), you would have an average access time of 7 ms (rotational and seek time). Loading 5k at 120 MB/s thereafter takes only 0.04 ms; 50k would take 0.4 ms. So splitting would save you 0.36 ms if a page needs only 1 include; if it needs 2, it would cost you 6.68 ms, 3 would cost 13.72 ms, etc. With a 3.8 GHz CPU there are approx. 3.800.000 clock cycles in 1 ms, so in this case loading 2 files instead of one would cost you approx. 25.000.000 clock cycles. Think about what PHP could do with all those clock cycles.

- Matijn
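The arithmetic in Matijn's estimate can be replayed in a few lines. This is a minimal sketch, not a measurement: the constants are the figures quoted in the post (7 ms access time, 120 MB/s, 3.8 GHz), the helper name `cold_read_ms` is made up for illustration, and the model assumes every include is read from a cold disk with no OS file cache and no opcode cache in play.

```php
<?php
// Figures taken from the post above; they are its assumptions, not measurements.
const ACCESS_MS = 7.0;            // average seek + rotational latency per file
const BYTES_PER_MS = 120000.0;    // 120 MB/s expressed per millisecond
const CYCLES_PER_MS = 3800000.0;  // 3.8 GHz CPU

// Hypothetical helper: time to locate and read $files files of $bytes each
// from a cold disk under the model above.
function cold_read_ms(int $files, int $bytes): float
{
    return $files * (ACCESS_MS + $bytes / BYTES_PER_MS);
}

$single = cold_read_ms(1, 50000); // one 50K library

foreach ([1, 2, 3] as $n) {
    // Extra cost of reading n separate 5K files instead of the single 50K file.
    $extra = cold_read_ms($n, 5000) - $single;
    printf(
        "%d x 5K vs 1 x 50K: %+.2f ms (%+.1f million cycles)\n",
        $n,
        $extra,
        $extra * CYCLES_PER_MS / 1e6
    );
}
```

Because the transfer times are not rounded to 0.04 ms and 0.4 ms here, the printed values land close to, but not exactly on, the 0.36, 6.68 and 13.72 ms quoted in the post. The model says nothing about parse or compile time, which is the cost the other posters are arguing about.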
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote: First of all, I believe [A] PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. [B] (except for syntax errors). [B] negates [A]. PHP must either parse the file into opcodes or load them from APC and further execute the top-level opcodes. That means defining functions (not calling them unless called directly), constants, global variables, classes, etc. No amount of measuring is required to tell me that doing X vs. not doing X in this case clearly takes longer. Now, is that time significant enough to warrant the extra logic required? In my case, absolutely. We organize our library into many classes in multiple files. By using an autoloader, we simply don't need to think about it. Include bootstrap.php which sets up the autoloader and include paths. Done. In the case with a single 50k library file that is used on 10% of the pages, I'd absolutely require_once it only in the pages that need it without measuring the performance. It's so trivial to maintain that single include in individual pages that the gain on 90% of the pages is not worth delving deeper. Peace, David
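The autoloader setup David describes might look roughly like the sketch below. Everything in it is assumed for illustration: the thread does not show his bootstrap.php, the class name `ReportHelper` and the temporary directory are invented so the snippet runs on its own, and the type declarations need PHP 7.1 or later (newer than the PHP 5.3 mentioned elsewhere in the thread). `spl_autoload_register` is the standard PHP hook for this.

```php
<?php
// Hypothetical layout: one class per file under a library directory.
// A temporary directory stands in for the real library path.
$libDir = sys_get_temp_dir() . '/autoload_demo_' . getmypid();
if (!is_dir($libDir)) {
    mkdir($libDir);
}
file_put_contents(
    $libDir . '/ReportHelper.php',
    '<?php class ReportHelper { public static function title(): string { return "report"; } }'
);

// What a bootstrap file might register: map a class name to a file and
// load it only when the class is first referenced.
$loaded = [];
spl_autoload_register(function (string $class) use ($libDir, &$loaded): void {
    $file = $libDir . '/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
        $loaded[] = $class; // record which classes were actually pulled in
    }
});

// Passing false avoids triggering the autoloader, so this only reports
// whether the class has already been loaded.
$before = class_exists('ReportHelper', false);

// First real use of the class triggers the autoloader.
$title = ReportHelper::title();

var_dump($before, $title, $loaded);
```

Pages that never reference `ReportHelper` never read or compile its file, which is the selective-include behaviour without a per-page `require_once`.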
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 7:39 AM, Matijn Woudt tijn...@gmail.com wrote: On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com wrote: On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote: On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete halukkaram...@gmail.com wrote: First of all, I believe PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. (except for syntax errors). I believe this is untrue. PHP generates the bytecode and then parses the bytecode per request to generate the userland infrastructure, including classes and functions, for the entire include file. During the generation of bytecode, PHP doesn't know apriori which functions will be called at runtime. I suspect if you asked for confirmation of this on the internals list, they'd confirm this. In terms of errors, there are certainly different stages that errors can occur, and what you're referring to are runtime errors. Runtime errors don't necessarily show up in every possible execution branch. That doesn't mean that PHP didn't generate the code for the userland functionality. The speed difference between loading 5K file or 50K file (assuming continuous blocks) is extremely small. If you split this library, you would have PHP files that require you to load maybe 3 or 4 different files to have all their functions. Here's where I believe we have a communication issue. I never spoke of splitting up the library into 3 or 4, or any number of different files. The opening post states that only 10% of the pages need the library. I suggested that he only include the library in the 10% of the pages that need the library. That said, it's possible I misinterpreted him. I will say that I do disagree with your analysis that difference between loading a 5K or 50K php file is extremely small. So I just put this to the test. 
I created a 5K file and a 50K file, both of which have the form:

    function hello1(){ echo "hello again"; }
    function hello2(){ echo "hello again"; }
    etc.

I have Xdebug installed, have APC running, warmed the caches, and then tested a few times. The results all hover around the following:

Including the 5K file requires around 50 microseconds.
Including the 50K file requires around 180 microseconds.

The point is that there is a significant difference due to the work PHP has to do behind the scenes, even when functions (or classes, etc.) are unused. And, relevant to the dialog for this current thread, avoiding including an unused 50K PHP file on 90% of the pages (the pages that don't need the library) will lead to a real difference.

Adam

--
Nephtali: A simple, flexible, fast, and security-focused PHP framework
http://nephtaliproject.com
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 6:55 PM, David Harkness davi...@highgearmedia.com wrote: On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote: First of all, I believe [A] PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. [B] (except for syntax errors). [B] negates [A]. PHP must either parse the file into opcodes or load them from APC and further execute the top-level opcodes. That means defining functions (not calling them unless called directly), constants, global variables, classes, etc. No amount of measuring is required to tell me that doing X vs. not doing X in this case clearly takes longer. [B] does not negate [A]. There's a difference between parsing the syntax and defining functions, classes constants and globals, and generating bytecode. In a 'normal' file I guess syntax definitions are only about 5% of the total contents, the rest can be ignored until being called. Now, is that time significant enough to warrant the extra logic required? In my case, absolutely. We organize our library into many classes in multiple files. By using an autoloader, we simply don't need to think about it. Include bootstrap.php which sets up the autoloader and include paths. Done. In the case with a single 50k library file that is used on 10% of the pages, I'd absolutely require_once it only in the pages that need it without measuring the performance. It's so trivial to maintain that single include in individual pages that the gain on 90% of the pages is not worth delving deeper. Peace, David Let me quote the OP, I think that suffices: When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view just to avoid that the discussion shifts to a programming style and the do's and don'ts. 
- Matijn
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 7:18 PM, Adam Richardson simples...@gmail.com wrote: On Tue, Aug 28, 2012 at 7:39 AM, Matijn Woudt tijn...@gmail.com wrote: On Tue, Aug 28, 2012 at 3:49 AM, Adam Richardson simples...@gmail.com wrote: On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote: On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete halukkaram...@gmail.com wrote: First of all, I believe PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. (except for syntax errors). I believe this is untrue. PHP generates the bytecode and then parses the bytecode per request to generate the userland infrastructure, including classes and functions, for the entire include file. During the generation of bytecode, PHP doesn't know apriori which functions will be called at runtime. I suspect if you asked for confirmation of this on the internals list, they'd confirm this. In terms of errors, there are certainly different stages that errors can occur, and what you're referring to are runtime errors. Runtime errors don't necessarily show up in every possible execution branch. That doesn't mean that PHP didn't generate the code for the userland functionality. The speed difference between loading 5K file or 50K file (assuming continuous blocks) is extremely small. If you split this library, you would have PHP files that require you to load maybe 3 or 4 different files to have all their functions. Here's where I believe we have a communication issue. I never spoke of splitting up the library into 3 or 4, or any number of different files. The opening post states that only 10% of the pages need the library. I suggested that he only include the library in the 10% of the pages that need the library. That said, it's possible I misinterpreted him. 
I interpreted it as: I have a 50K library, and some files only use 10%, some use 20% and some 30%. To be able to include it selectively, you would need to split it, and some pages would need to include maybe 3 or 4 files.

> I will say that I do disagree with your analysis that the difference between loading a 5K or 50K PHP file is extremely small. So I just put this to the test.
>
> I created a 5K file and a 50K file, both of which have the form:
>
>     function hello1(){ echo "hello again"; }
>     function hello2(){ echo "hello again"; }
>     etc.
>
> I have Xdebug installed, have APC running, warmed the caches, and then tested a few times. The results all hover around the following:
>
> Including the 5K file requires around 50 microseconds.
> Including the 50K file requires around 180 microseconds.
>
> The point is that there is a significant difference due to the work PHP has to do behind the scenes, even when functions (or classes, etc.) are unused. And, relevant to the dialog for this current thread, avoiding including an unused 50K PHP file on 90% of the pages (the pages that don't need the library) will lead to a real difference.
>
> Adam

Finally, you're the first one that actually has measured something. You should redo your test with real-world files, because in the real world functions aren't that small. In functions with more lines (say ~100 lines per function), you'll see a different ratio between 5k and 50k. In my tests it is:

- 5K: 22 ms
- 50K: 34 ms

When I create files that contain only 1 function, with just a number of echo "Hello world"; lines until 5k or 50k is reached, the results are:

- 5K: 15 ms
- 50K: 17 ms

Cheers,

Matijn

Ps. Code used:

    <?php
    $time_start = microtime(true);
    include '5k.php'; // 5k.php or 50k.php
    $time_end = microtime(true);
    echo ($time_end - $time_start) . "s";
    ?>

System specs:
Ubuntu 12.04 LTS with Apache 2.2.22 and PHP 5.3.10, default config with no cache etc.
AMD Phenom X4 9550 (2.2GHz)
4 GB DDR2-800
Disk where the PHP files are stored: WD 500GB with average read speed of 79.23 MB/s (as measured with hdparm)
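For anyone wanting to repeat the comparison, the sketch below generates two test files in the many-tiny-functions style Adam describes and times a single include of each, using the same microtime() approach as Matijn's snippet. It is a rough harness under stated assumptions, not either poster's actual test: the helper names `make_library` and `time_include`, the file locations and the function prefixes are all invented here, and a single timed run is noisy and depends heavily on whether an opcode cache is active.

```php
<?php
// Write a PHP file of at least $targetBytes consisting of tiny functions.
// $prefix keeps function names unique across generated files.
function make_library(string $path, int $targetBytes, string $prefix): int
{
    $code = "<?php\n";
    $count = 0;
    while (strlen($code) < $targetBytes) {
        $count++;
        $code .= "function {$prefix}_hello{$count}() { echo \"hello again\"; }\n";
    }
    file_put_contents($path, $code);
    return $count; // number of functions written
}

// Time one include of $path, in seconds. Each file can only be included
// once per process, because its functions would otherwise be redeclared.
function time_include(string $path): float
{
    $start = microtime(true);
    include $path;
    return microtime(true) - $start;
}

$dir = sys_get_temp_dir();
$small = $dir . '/bench_5k_' . getmypid() . '.php';
$large = $dir . '/bench_50k_' . getmypid() . '.php';

$smallCount = make_library($small, 5 * 1024, 'small');
$largeCount = make_library($large, 50 * 1024, 'large');

$smallTime = time_include($small);
$largeTime = time_include($large);

printf("5K  (%d functions): %.6f s\n", $smallCount, $smallTime);
printf("50K (%d functions): %.6f s\n", $largeCount, $largeTime);
```

To get numbers worth comparing, the script would need to be run many times in fresh processes, under the same server and cache configuration as production, with both orderings of the two includes.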
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 3:28 PM, Matijn Woudt tijn...@gmail.com wrote:

> On Tue, Aug 28, 2012 at 7:18 PM, Adam Richardson simples...@gmail.com wrote:
>
> Finally, you're the first one that actually has measured something. You should redo your test with real world files, because in real world functions aren't that small.

In terms of redoing the test with real-world files, that's an entirely different debate (and one I won't enter into at this time, though this list has discussed this topic before, most recently in a post Ted made talking about screen height). The point is, there is a real difference. The question remains whether the difference is enough to act on in future code bases (and I would say yes if my tests showed this difference; you may say no).

> In functions with more lines (say ~100 lines per function), you'll see a different ratio between 5k and 50k. In my tests it is:
> - 5K: 22 ms
> - 50K: 34 ms

Those trends/results depend significantly on the contents of the functions, too. The overly simplistic example we've used both helps and hurts the analysis. (I'll admit my example likely has more functions than other 5K/50K files, and I suspect most functions require more complicated work behind the scenes to build up than echo statements.) The point I'd make here is that it's very difficult to have a priori knowledge of how something will perform without testing it.

> When I create files that only contain 1 function, with just a number of echo "Hello world"; lines until 5k or 50k, the results are:
> - 5K: 15 ms
> - 50K: 17 ms

Ummm... sure. What did you say about real world before :)

Have a nice day!

Adam
Re: [PHP] include selectively or globally?
On Tue, Aug 28, 2012 at 12:11 PM, Matijn Woudt tijn...@gmail.com wrote: On Tue, Aug 28, 2012 at 6:55 PM, David Harkness davi...@highgearmedia.com wrote: On Tue, Aug 28, 2012 at 4:39 AM, Matijn Woudt tijn...@gmail.com wrote: First of all, I believe [A] PHP is smart enough to not generate bytecode for functions that are not used in the current file. Think about the fact that you can write a function with errors, which will run fine until you call the function. [B] (except for syntax errors). [B] negates [A]. PHP must either parse the file into opcodes or load them from APC and further execute the top-level opcodes. That means defining functions (not calling them unless called directly), constants, global variables, classes, etc. [B] does not negate [A]. There's a difference between parsing the syntax and defining functions, classes constants and globals, and generating bytecode. In a 'normal' file I guess syntax definitions are only about 5% of the total contents, the rest can be ignored until being called. I won't claim a deep understanding of the PHP internals, but I have enough experience with varied compiled and interpreted languages and using PHP and APC that I'm confident that the process to include a file involves: 1. Load the opcodes A. Either read the file from disk and parse the PHP into opcodes, or B. Load the cached opcodes from APC. 2. Execute the top-level opcodes Any syntax errors--even those in unreachable code blocks--will cause the script to fail parsing. For example, if (false) { function foo() { SYNTAX ERROR! } } will cause the parse to fail even though the function cannot logically be defined. PHP doesn't even get that far. PHP Parse error: syntax error, unexpected T_STRING in php shell code on line 3 When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view just to avoid that the discussion shifts to a programming style and the do's and don'ts. 
While out of convenience you might be tempted to include the file in every script, when considering performance alone you should include the file only in those scripts that will make use of its contents. Peace, David
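The distinction argued over in this exchange, that a runtime bug in an uncalled function goes unnoticed while a syntax error anywhere stops the whole file, can be checked directly. The sketch below is an illustration under assumptions: it relies on PHP 7 or later, where eval() reports a parse failure by throwing ParseError, and the helper `compiles` and the demo function names are invented for the example.

```php
<?php
// Assumes PHP 7+: eval() throws ParseError when the code cannot be parsed.
// Hypothetical helper: true if $code parses (and then runs), false otherwise.
function compiles(string $code): bool
{
    try {
        eval($code);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

// Runtime bug inside a function that is never called: the code parses and
// the function is defined; the undefined helper is only looked up on a call.
$runtimeBug = 'function never_called_demo() { return undefined_helper_demo(); }';

// Syntax error inside a block that can never execute: the parser still
// rejects the whole snippet, so nothing in it is defined.
$syntaxBug = 'if (false) { function unreachable_demo() { SYNTAX ERROR! } }';

var_dump(compiles($runtimeBug)); // bool(true)
var_dump(compiles($syntaxBug));  // bool(false)
```

On the PHP 5.3 used in the thread the second case is a fatal parse error rather than a catchable exception, as in David's shell output, but the point is the same: the entire file has to parse before any of it runs.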
[PHP] include selectively or globally?
With this question, I aim to understand the inner workings of PHP a little better.

Assume that you've got a 50K library. The library is loaded with a bunch of handy functions that you use here and there. Also assume that these functions are needed/used by, say, 10% of the pages of your site. But your home page definitely needs it.

Now, the question is... should you use a global include that points to this library - across the board - so that ALL the pages (including the 90% that do not need the library) will get it, or should you selectively add that include reference only on the pages that need it?

Before answering this question, let me point out why I ask it... When you include that reference, PHP may be caching it. So the performance hit I worry about may be a one-time deal, as opposed to every time. Once that one time is out of the way, subsequent loads may not be as bad as one might think. That's all because of the smart caching mechanisms that PHP deploys - which I do not have a deep knowledge of, hence the question...

Since the front page needs that library anyway, the argument could be: why not keep that library warm and fresh in memory and get it served across the board?

When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view, just to avoid the discussion shifting to programming style and the do's and don'ts.

Thank you

http://stackoverflow.com/questions/12148966/include-selectively-or-globally
Re: [PHP] include selectively or globally?
On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete halukkaram...@gmail.com wrote: With this question, I aim to understand the inner workings of PHP a little better. Assume that you got a 50K library. The library is loaded with a bunch of handy functions that you use here and there. Also assume that these functions are needed/used by say 10% of the pages of your site. But your home page definitely needs it. Now, the question is... should you use a global include that points to this library - across the board - so that ALL the pages ( including the 90% that do not need the library ) will get it, or should you selectively add that include reference only on the pages you need? Before answering this question, let me point why I ask this question... When you include that reference, PHP may be caching it. So the performance hit I worry may be one time deal, as opposed to every time. Once that one time out of the way, subsequent loads may not be as bad as one might think. That's all because of the smart caching mechanisms that PHP deploys - which I do not have a deep knowledge of, hence the question... Since the front page needs that library anyway, the argument could be why not keep that library warm and fresh in the memory and get it served across the board? When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view just to avoid that the discussion shifts to a programming style and the do's and don'ts. Thank you http://stackoverflow.com/questions/12148966/include-selectively-or-globally Since searching for files is one of the most expensive (in time) operations, you're probably best off with only a single PHP file. PHP parses a file initially pretty quickly (it's only checking syntax half on load), so unless you're having a 100MHz CPU with SSD drive, I'd say go with a single PHP file. If you make sure the file isn't fragmented over your disk, it should load pretty quick to memory. 
I'm not sure if PHP caches that much, but if you really care, take a look at memcached or APC.

- Matijn
Re: [PHP] include selectively or globally?
On Mon, Aug 27, 2012 at 6:54 PM, Matijn Woudt tijn...@gmail.com wrote:

> On Mon, Aug 27, 2012 at 10:56 PM, Haluk Karamete halukkaram...@gmail.com wrote:
>> Now, the question is... should you use a global include that points to this library - across the board - so that ALL the pages ( including the 90% that do not need the library ) will get it, or should you selectively add that include reference only on the pages you need?
>
> Since searching for files is one of the most expensive (in time) operations, you're probably best off with only a single PHP file.

Maybe I misinterpreted the question, but I don't think I agree. If you have a 50K PHP file that's needed in only 10% of the pages, then, when solely considering performance, that file should only be included on the 10% of the pages that actually use it. Now, there are reasons why you might want to include the file globally (maintenance purposes, etc.).

Loading the 50K of PHP code requires building up all of the associated infrastructure (zvals, etc.) for the user code. Even if APC is used, the cached opcodes (PHP bytecode) still have to be parsed and built up for the user-defined classes and functions on every request, even if they're unused. That is certainly going to perform more slowly than selectively including the library on only the pages that need it.

Adam
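Adam's claim that an include is processed in full even when nothing in it is called can be observed, at least at the level of PHP's function table, with a small check. This is a sketch with invented names (the temporary file and the three `lib_*` functions are made up); it shows that every function in the included file becomes defined, but it does not by itself measure what that costs.

```php
<?php
// Hypothetical three-function library written to a temporary file.
$lib = sys_get_temp_dir() . '/unused_lib_' . getmypid() . '.php';
file_put_contents(
    $lib,
    "<?php\nfunction lib_alpha() {}\nfunction lib_beta() {}\nfunction lib_gamma() {}\n"
);

// Snapshot the user-defined functions before and after the include.
$before = get_defined_functions()['user'];
include $lib;
$after = get_defined_functions()['user'];

// Functions that exist now only because of the include; none has been called.
$added = array_values(array_diff($after, $before));
sort($added);

print_r($added);
```

All three functions are registered as soon as the file is included, so the per-request work grows with the size of the library regardless of how much of it a given page uses.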