Re: [PHP-DEV] Re: Always set return_value_ptr?
On 31/08/13 14:13, Nikita Popov wrote: Is there any particular reason why we only pass return_value_ptr to internal functions if they have the ACC_RETURN_REFERENCE flag set? Why can't we always provide the retval ptr, even for functions that don't return by reference? This would allow returning zvals without having to copy them first (what RETVAL_ZVAL does). Motivation for this is the following SO question: http://stackoverflow.com/q/17844379/385378 Changes merged. Small benchmark to verify that this indeed avoids the copy: https://gist.github.com/nikic/6398090 :) Nikita, IMO, this is a material performance optimisation of the PHP internals, as it removes one of the most common unnecessary (expensive) copies. So thanks for this. It will be interesting to see the benefit on real apps such as MediaWiki. I'll pull a 5.5 snapshot and compare it to 5.5.2 :-) Regards Terry -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
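A rough illustration of the copy being discussed, as a toy model in plain C rather than real Zend code. The names `toy_val`, `toy_new`, `toy_release`, `return_by_copy` and `return_by_ptr` are all hypothetical. The sketch assumes only what the thread says: a function given just the return slot has to duplicate the payload, while one given a pointer to the slot can swap in the existing value and bump its reference count.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a refcounted value; NOT the real zval layout. */
typedef struct toy_val {
    int refcount;
    size_t len;
    char *data;
} toy_val;

toy_val *toy_new(const char *s)
{
    toy_val *v = malloc(sizeof *v);
    assert(v != NULL);
    v->refcount = 1;
    v->len = strlen(s);
    v->data = malloc(v->len + 1);
    assert(v->data != NULL);
    memcpy(v->data, s, v->len + 1);
    return v;
}

void toy_release(toy_val *v)
{
    if (--v->refcount == 0) {
        free(v->data);
        free(v);
    }
}

/* Copying return: only the slot itself is available, so the payload
 * of src has to be duplicated into it. */
void return_by_copy(toy_val *return_value, const toy_val *src)
{
    free(return_value->data);
    return_value->len = src->len;
    return_value->data = malloc(src->len + 1);
    assert(return_value->data != NULL);
    memcpy(return_value->data, src->data, src->len + 1);
}

/* Pointer return: the caller passes the address of its slot, so the
 * placeholder can be dropped and the existing value shared instead. */
void return_by_ptr(toy_val **return_value_ptr, toy_val *src)
{
    toy_release(*return_value_ptr);
    src->refcount++;
    *return_value_ptr = src;
}
```

In this model the second path costs one release and one increment regardless of payload size, which is the saving the benchmark in the message is meant to confirm.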
Re: [PHP-DEV] Re: Always set return_value_ptr?
On 30/08/13 10:43, Julien Pauli wrote: On Fri, Aug 30, 2013 at 2:30 AM, Terry Ellison ellison.te...@gmail.com wrote: There's another one in string.c, in PHP_FUNCTION(pathinfo), that could be applied as well, though there's little performance gain in avoiding the copy of a 4 element string array. BTW, looking at this pathinfo code, it doesn't do what the documentation says it does -- or at least the documentation states that the optional argument, if present, should be _one_ of PATHINFO_DIRNAME, PATHINFO_BASENAME, PATHINFO_EXTENSION or PATHINFO_FILENAME. However, if a bitmask is supplied then this function returns the element corresponding to the lowest bit value rather than an error return, for example: $ php -r 'echo pathinfo("/tmp/x.fred", PATHINFO_FILENAME|PATHINFO_EXTENSION),"\n";' fred This is bizarre behaviour. At a minimum the documentation should actually state what the function does. Or do we bother to raise a patch to fix this sort of thing, given that returning an empty string (or more consistently with other functions, NULL) in this case could create a BC break with existing buggy code? This is weird, yes. It's not the lowest bit value that is returned, but the first element put in the array (as zend_hash_get_current_data() is used with no HashPosition), which is even more confusing. How to explain that in the documentation? :| Yes I understand that, but the code processes the elements in dirname, basename, extension, filename order, so the two statements are equivalent in implementation. I am an experienced developer but a newbie-ish to the PHP developer community, and I come back to my Q. What do we typically do if we come across such weird functional behaviour outside the documented use of a standard function?
* Shrug our shoulders and say "That's PHP for you. BC rules."
* Fix the documentation to say what the code actually does.
* Fix the code at the next major release, say 5.6, to have sensible error behaviour.
Just interested in understanding the consensus policy here. Do I post a fix to the doc; post a fix to the code; or move on to other issues? Regards Terry
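The equivalence argued above can be sketched in standalone C. This is a minimal model, not the code in string.c: `struct path_parts`, `pick_first` and `pick_strict` are hypothetical names, and the build order (dirname, basename, extension, filename) is an assumption chosen because it matches the constants' numeric values and the `fred` output shown in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Numeric values of PHP's PATHINFO_* constants. */
enum {
    PATHINFO_DIRNAME   = 1,
    PATHINFO_BASENAME  = 2,
    PATHINFO_EXTENSION = 4,
    PATHINFO_FILENAME  = 8
};

/* Hypothetical stand-in for the parts pathinfo() would compute. */
struct path_parts {
    const char *dirname;
    const char *basename;
    const char *extension;
    const char *filename;
};

/* Lenient behaviour: return the first requested part in build order.
 * Because the build order follows the bit order, this is the same as
 * returning the part for the lowest set bit. */
const char *pick_first(const struct path_parts *p, long opt)
{
    assert(p != NULL);
    if (opt & PATHINFO_DIRNAME)   return p->dirname;
    if (opt & PATHINFO_BASENAME)  return p->basename;
    if (opt & PATHINFO_EXTENSION) return p->extension;
    if (opt & PATHINFO_FILENAME)  return p->filename;
    return NULL;
}

/* Stricter alternative: accept exactly one known flag, otherwise
 * return NULL as an error indication. */
const char *pick_strict(const struct path_parts *p, long opt)
{
    int single = opt > 0 && opt <= 15 && (opt & (opt - 1)) == 0;
    return single ? pick_first(p, opt) : NULL;
}
```

With parts for /tmp/x.fred, `pick_first` given `PATHINFO_FILENAME|PATHINFO_EXTENSION` yields the extension, while `pick_strict` rejects the mask; the second variant is the kind of change that would carry the BC risk raised in the message.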
Re: [PHP-DEV] Re: Always set return_value_ptr?
On 27/08/13 10:40, Nikita Popov wrote: On Sat, Aug 3, 2013 at 8:16 PM, Nikita Popov nikita@gmail.com wrote: Hi internals! Is there any particular reason why we only pass return_value_ptr to internal functions if they have the ACC_RETURN_REFERENCE flag set? Why can't we always provide the retval ptr, even for functions that don't return by reference? This would allow returning zvals without having to copy them first (what RETVAL_ZVAL does). Motivation for this is the following SO question: http://stackoverflow.com/q/17844379/385378 Patch for this change can be found here: https://github.com/php/php-src/pull/420 The patch also adds new macros to allow easy use of this feature called RETVAL_ZVAL_FAST/RETURN_ZVAL_FAST (anyone got a better name?) If no one objects I'll merge this sometime soon. +1 Though looking through the ext uses, most functions returning an array build it directly in return_value and thus avoid the copy. I also see that you've picked up all of the cases in ext/standard/array.c where these macros can be applied. There's another one in string.c, in PHP_FUNCTION(pathinfo), that could be applied as well, though there's little performance gain in avoiding the copy of a 4 element string array. BTW, looking at this pathinfo code, it doesn't do what the documentation says it does -- or at least this states that the optional argument if present should be _one_ of PATHINFO_DIRNAME, PATHINFO_BASENAME, PATHINFO_EXTENSION or PATHINFO_FILENAME. However, if a bitmask is supplied then this function returns the element corresponding to the lowest bit value rather than an error return, for example: $ php -r 'echo pathinfo("/tmp/x.fred", PATHINFO_FILENAME|PATHINFO_EXTENSION),"\n";' fred This is a bizarre behaviour. At a minimum the documentation should actually state what the function does. 
Or do we bother to raise a patch to fix this sort of thing, given that returning an empty string (or more consistently with other functions, NULL) in this case could create a BC break with existing buggy code? Regards Terry
Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?
Johannes, Thanks, but I'll make some responses. On Mon, 2013-08-19 at 17:05 +0100, Terry Ellison wrote: By way of a background. I've been doing a review of the existing code base looking at how to establish a roadmap to extend OPcache functionality across all supported OSes and SAPIs. And this raises a supplementary Q: which OSs and SAPIs should we be supporting for PHP 5.6 anyway? I would be interested in the views of the dev team on this. It would be good to agree a list of which OSs are to be supported at PHP The short version is quite simple: PHP supports everything and nothing. Our aim is to be portable and have it running anywhere somebody has a C compiler and the required libs. On the other hand, in open source spirit, we promise nothing. In reality I expect that most developers use Linux and we have an active Windows fraction. If we promise support for any platform it has two direct consequences: - We have to test and verify it - We immediately disappoint people who run PHP successfully on edge platforms And then more longterm consequences: - Mind new platforms - Continuous discussions about adding support for new platforms The current model otoh works quite well. I understand the nuances of what support means in the FLOSS world, but at some level we should also be able to look at ourselves in the mirror and say that our releases are at a standard that we can feel comfortable with. As you say, we have an established Linux base as well as some Windows. I would also add a solid BSD user base (FreeBSD, NetBSD, OSX, etc.). (Maybe I'd include Solaris, but it's on the way out given Oracle's position.) But what about all of the other obsolete platform code? We ship this with our PHP source, version after version, knowing that it's never being exercised or tested. Surely if and when we want to improve PHP's platform support and architecture, then this stuff is like chains dragging at our feet. 
For example, I know how to make OPcache work well for other SAPIs such as cli, mod_fcgid, etc. but only if I can refactor the necessary chunks and only for POSIX and Win32, which covers the platforms we've just discussed; worrying about all of the other flavours that I could never test just gives me too much brain ache, but I can't propose a code refactor if I don't know what to do with it. PHP (unlike some language alternatives) seems to be doing little to improve general performance, and the discussions related to performance on this DL are almost non-existent. 5.6, which SAPIs are supported, and a matrix of which SAPIs are supported on non-threaded and build TSRM variants. I myself would kill TSRM, but others have reasons to disagree ;-) In general: There are features which are dependent on operating system, 3rd party library or TSRM. This is fine. Based on my statement from above I claim (again, there are people who disagree for reasons I follow less than the general case above) that nobody who cares about performance uses TSRM, as such an opcode cache is not needed in such environments. Yes, if the Zend engine had first been developed for Windows, then it would have supported proper multi-threading from Day 1. It wasn't, so TSRM is a kludge to achieve this. Dmitry made the comment somewhere that enabling TSRM incurs a ~20% performance hit, which I can believe, but as far as I can see, WIN32 implementations rely on it for scaling. Looking at fpm, cgi, etc. all of the SAPIs which rely on master/child process hierarchies for scaling use fork, and they all have a big #ifndef ZEND_WIN32 around this code. OK, CreateProcess incurs more startup overhead than forking and there are other startup issues to address relating to acquisition of context, but I've written serious realtime systems for WIN32 in the past that performed well without threads. So I am not sure why this is the case. However the lack of OPcode caching on complex apps typically halves system throughput. 
Most if not all admins care about that sort of hit. Examples of what I am talking about are SAPIs with no clear evidence of active support (I've listed the last non-bulk change in brackets to give a measure of the level of support): aolserver (2008), caudium (2005), continuity (2004), nsapi (2011), phttpd (2002), pi3web (2003), roxen (2002), thttpd (2002), tux (2007), webjames (2006) I realise that some of these may still be actively used with a user community out there wanting to track current versions, and this is just a case of "if it ain't broke..." However, I do wonder when some of these were actively maintained and routinely tested against the current versions at release -- and if not then perhaps PHP 5.6 is the correct point to retire them from the source tarball and configure options? First thing to note is that the SAPI layer is one of the most stable ones. So old SAPIs most likely work. Secondly: Yes some of them almost certainly can go, when we discussed
Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?
On 20/08/13 16:50, Johannes Schlüter wrote: snip Terry Ellison wrote: PHP (unlike some language alternatives) seems to be doing little to improve general performance, and the discussions related to performance on this DL are almost non-existent. Looking at any benchmark from 5.2 to 5.3 to 5.4 and 5.5 shows notable improvements (5.4 to 5.5 maybe not as much as the others), so saying we do little is a bit misleading. But well, it is simpler to do these syntax sugar things we're bikeshedding about than doing actual core improvements. We have just very few people fully understanding the engine and being able to improve it. So such discussions gain no traction. I apologise if this sounded unreasonably critical, as this wasn't my intent. As it happens, my particular interest is in PHP performance and I've got a good understanding of the Zend Engine and opcache, but trying to work out how I can contribute effectively to this is difficult for me given this lack of traction. I also know that Dmitry and you guys made some fundamental improvements to the 5.4 engine that significantly dropped the op_array sizing and led to perhaps an overall 5-15% performance improvement. I discussed this in some depth in my OPcache documentation on this page: https://github.com/TerryE/opcache/wiki/The-Zend-Engine-and-opcode-caching#wiki-Comments_on_Zend_engine_performance However, I don't think that this is appreciated in the wider PHP community (for example I can't recall it ever being discussed or emphasised on StackOverflow). I feel that it got lost in the reaction to APC not working reliably with the early 5.4 dot releases. Regards Terry
[PHP-DEV] Which OSs and SAPI should PHP 5.6 support?
By way of a background. I've been doing a review of the existing code base looking at how to establish a roadmap to extend OPcache functionality across all supported OSes and SAPIs. And this raises a supplementary Q: which OSs and SAPIs should we be supporting for PHP 5.6 anyway? I would be interested in the views of the dev team on this. It would be good to agree a list of which OSs are to be supported at PHP 5.6, which SAPIs are supported, and a matrix of which SAPIs are supported on non-threaded and build TSRM variants. Examples of what I am talking about are SAPIs with no clear evidence of active support (I've listed the last non-bulk change in brackets to give a measure of the level of support): aolserver (2008), caudium (2005), continuity (2004), nsapi (2011), phttpd (2002), pi3web (2003), roxen (2002), thttpd (2002), tux (2007), webjames (2006) I realise that some of these may still be actively used with a user community out there wanting to track current versions, and this is just a case of "if it ain't broke..." However, I do wonder when some of these were actively maintained and routinely tested against the current versions at release -- and if not then perhaps PHP 5.6 is the correct point to retire them from the source tarball and configure options? Likewise in the Zend, TSRM, ext/opcache ... sources, there is conditional code dependent on BeOS, __sgi, __osf__, __IRIX__, NSAPI, PI3WEB, GNUPTH(*), OS_VXWORKS, etc. as well as obsolete BSD versions -- OSs that are no longer actively supported. Again I ask the Q: how and when are these tested, and if not then shouldn't we retire this support? Part of my reason for asking this is work in preparation for OPcache issue #118 -- Transparent SHM reuse. Doing this robustly with good performance characteristics -- for *all* currently referenced OSs -- is a pain. 
Reviewing a range of other best-of-breed packages which use shared SMA-based resources, it seems to me that the memcached approach is the cleanest: it uses the POSIX APIs and supports any OSes which support these APIs. If we limited TSRM and OPcache support at PHP 5.6 to two code variants, POSIX + WIN32, surely this would still cover all the major supported OSes? //Terry Ellison (*) GNU threads is still supported but it prevents utilisation of SMP systems and there is minimal performance difference from POSIX threads on a single processor system.
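To make the "one POSIX-style variant" idea concrete, here is a small sketch of a parent and a forked child sharing memory through `mmap`. It is not OPcache or memcached code. The function name `share_counter_with_child` is hypothetical, and the use of `MAP_ANONYMOUS` is an assumption: it is widely available on Linux and the BSDs but was not part of older POSIX editions, where `shm_open` followed by `mmap` would be the named route.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one shared int, fork a child that writes `value` into it, and
 * return what the parent reads back afterwards (-1 on any failure). */
int share_counter_with_child(int value)
{
    /* 0 is the initial content, so require something distinguishable. */
    assert(value != 0);

    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {              /* child: write and exit immediately */
        *shared = value;
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        munmap(shared, sizeof *shared);
        return -1;
    }

    int seen = *shared;          /* the child's write is visible here */
    munmap(shared, sizeof *shared);
    return seen;
}
```

The same master/child shape is what the fork-based SAPIs rely on; a WIN32 variant would need a different mechanism, which is why the message proposes exactly two code paths.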
Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?
Uwe Schindler wrote: I would update NSAPI as I always did, there were just no new bugs and code is very stable (to the extent that multithreaded SAPIs are stable). It is still also in use on some of my servers, so I would still help support it. Uwe, If it's used on some of your servers, and you are supporting it then it doesn't belong on my suggested list. :-) At the moment I did not follow recent commits to SAPI-related code, so I have to look into it more closely. Are there any RFCs related to changes coming in 5.6 for OPcache? Not currently. -Original Message- From: Terry Ellison [mailto:ellison.te...@gmail.com] Sent: Monday, August 19, 2013 6:05 PM To: internals@lists.php.net Subject: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support? By way of a background. I've been doing a review of the existing code base looking at how to establish a roadmap to extend OPcache functionality across all supported OSes and SAPIs. And this raises a supplementary Q: which OSs and SAPIs should we be supporting for PHP 5.6 anyway? I would be interested in the views of the dev team on this. It would be good to agree a list of which OSs are to be supported at PHP 5.6, which SAPIs are supported, and a matrix of which SAPIs are supported on non-threaded and build TSRM variants. Examples of what I am talking about are SAPIs with no clear evidence of active support (I've listed the last non-bulk change in brackets to give a measure of the level of support): aolserver (2008), caudium (2005), continuity (2004), nsapi (2011), phttpd (2002), pi3web (2003), roxen (2002), thttpd (2002), tux (2007), webjames (2006) I realise that some of these may still be actively used with a user community out there wanting to track current versions, and this is just a case of "if it ain't broke..." 
However, I do wonder when some of these were actively maintained and routinely tested against the current versions at release -- and if not then perhaps PHP 5.6 is the correct point to retire them from the source tarball and configure options? ...
Re: [PHP-DEV] execute compressed PHP command-line application
crankypuss wrote: ... I don't want to have to modify the interpreter at this point... Sorry, but this list is for just this purpose, so your post doesn't belong on the DL. Regards Terry PS. Read up on the PHAR extension and the use of streams. There's nothing stopping you specifying a phar or even a compress.zlib stream on the command line.
Re: [PHP-DEV] Moving PHP documentation to Git repository
On 25/06/13 07:46, Christian Stoller wrote: What do you think about moving the PHP documentation to a Git repository, mirrored on Github? Doing this would make it possible for everybody to extend the documentation easily by creating pull requests. Today one has to get an SVN account to edit the docu or you have to use https://edit.php.net/ which does not work as expected (at least for me when I tried to update some German documentation). My changes have not been integrated for some months (I had to write an email to somebody of the doc team to apply the changes). Symfony does it this way (see https://github.com/symfony/symfony-docs/) and I like it very much. It is really easy to extend/update parts of the docu which are not complete or outdated and I am sure that it is comfortable and timesaving for the doc team, too. +1 Regards Terry Ellison
Re: [PHP-DEV] Request for testers of and feedback on CGI-enabled OPcache
On 22/06/13 00:01, Martin Amps wrote: When do you expect to have support for 5.5? I’d be happy to test it on a few of our servers as soon as you are Martin Amps | CIO www.iCracked.com http://www.icracked.com/ iCracked | Redwood City, CA Martin, The latest commit supports PHP 5.5. Only had to change one #ifdef //Terry On Jun 21, 2013, at 3:52 AM, Terry Ellison ellison.te...@gmail.com wrote: The Multi-Level Cache (MLC) OPcache fork typically delivers 80% of the performance acceleration of standard OPcache for the CLI and CGI SAPI modes. (OPcache and other cache accelerators don't functionally support these modes). The last build is now pretty stable in that it runs the PHP test suite and MediaWiki under CGI happily. It also has greater savings for the I/O load associated with script compilation for these modes. (In other SAPI modes, it runs the standard OPcache functionality and therefore delivers 100% of the OPcache benefits). However, I now need others actively to evaluate this alpha code and give feedback on its performance and its configuration interface if we are going to move towards promoting the introduction of this or some variant thereof into the PHP core. So my request here is to those CGI SAPI mode users on these lists to help support this work. Many of you have complained about the poor performance of PHP in CGI and now is your opportunity to help address this. You will find an overview of MLC OPcache at: https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and can pull the latest code from: https://github.com/TerryE/opcache/archive/dev-filecache.zip If you would like to help then please download, build and try out this version, respond to https://github.com/TerryE/opcache/issues/3 and use the Github issues tracker for MLC-OPcache-specific discussion. Only use these mailing lists for comment that you feel has wider interest to the list subscribers. 
Thank-you and regards Terry Ellison Caveat: I've only tested this Alpha version on 64bit Linux configurations for PHP 5.3 and 5.4, and would therefore like to limit initial testing to these configurations at this stage.
[PHP-DEV] Request for testers of and feedback on CGI-enabled OPcache
The Multi-Level Cache (MLC) OPcache fork typically delivers 80% of the performance acceleration of standard OPcache for the CLI and CGI SAPI modes. (OPcache and other cache accelerators don't functionally support these modes). The last build is now pretty stable in that it runs the PHP test suite and MediaWiki under CGI happily. It also has greater savings for the I/O load associated with script compilation for these modes. (In other SAPI modes, it runs the standard OPcache functionality and therefore delivers 100% of the OPcache benefits). However, I now need others actively to evaluate this alpha code and give feedback on its performance and its configuration interface if we are going to move towards promoting the introduction of this or some variant thereof into the PHP core. So my request here is to those CGI SAPI mode users on these lists to help support this work. Many of you have complained about the poor performance of PHP in CGI and now is your opportunity to help address this. You will find an overview of MLC OPcache at: https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and can pull the latest code from: https://github.com/TerryE/opcache/archive/dev-filecache.zip If you would like to help then please download, build and try out this version, respond to https://github.com/TerryE/opcache/issues/3 and use the Github issues tracker for MLC-OPcache-specific discussion. Only use these mailing lists for comment that you feel has wider interest to the list subscribers. Thank-you and regards Terry Ellison Caveat: I've only tested this Alpha version on 64bit Linux configurations for PHP 5.3 and 5.4, and would therefore like to limit initial testing to these configurations at this stage.
Re: [PHP-DEV] Internal object orientation documentation available!
On 10/06/13 19:33, Nikita Popov wrote: We just published some rather extensive documentation on internal object orientation: http://www.phpinternalsbook.com/classes_objects.html This is part of a larger project aimed at documenting the engine and making it accessible to new contributors. This looks like an excellent beginning so thanks. A few general comments: 1) I notice that your book is "© Copyright 2013, Julien Pauli - Anthony Ferrara - Nikita Popov. All Rights Reserved" rather than GFDL or one of the CC variants of open document licences. The only issue that I see here is that I -- and possibly others -- might be a bit guarded in providing comment and input if that content was being transferred to the authors unconditionally. Also if you are reserving all rights then you will need to be careful to ensure that all the content is yours and not extracted from an open or other 3rd party source. Surely this is going to add to your authoring burden? 2) Wikipedia, for example, contains a lot of good in-depth explanation of CompSci concepts and standard patterns such as http://en.wikipedia.org/wiki/Hash_table. You might consider the content cut: when you include basic discussion of 101 principles (e.g. on HashTables); and when you limit your content to their PHP-specific implementation, with suitable references to the 101 stuff. Tending to the former will make the book a lot longer, albeit standalone. Your call, but I would have thought that the majority of the readership by nature will have some CompSci background and so want to skip the 101 stuff, or be referenced out to the appropriate in-depth WP or other reference. 3) What is your preferred markup format for feedback and contributions? E.g. do you maintain an ODF or Docbook XML under some accessible git repository, or is it a case of (for example) "hashtables/basic_structure.html para at line 138. Not quite true that the arBuckets array will never shrink down: you can not reduce a PHP array, you only can grow it. 
You can always implement your own resizer by realloc'ing the arBuckets array and then calling zend_hash_rehash() to do this. (This would be a good standard hash API function by the way.)" But good luck and this will be an extremely useful project to help those wishing to get to grips with PHP internals. Regards Terry (Resend including internals list)
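The resizer idea in point 3 can be sketched on a toy chained hash table. This is not Zend's HashTable: the `bucket` and `table` types and the functions `table_rehash` and `table_shrink` are hypothetical, and the model assumes only the general shape described in the message (a slot array indexed by `hash & (size - 1)` plus a linked list of all buckets that a rehash walks).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy bucket: linked both into a per-slot chain and a global list. */
typedef struct bucket {
    unsigned long h;             /* precomputed hash */
    struct bucket *chain_next;   /* next bucket in the same slot */
    struct bucket *list_next;    /* next bucket in the global list */
} bucket;

typedef struct table {
    size_t size;                 /* number of slots, a power of two */
    bucket **slots;              /* heap-allocated slot array */
    bucket *list_head;           /* every bucket, independent of slots */
} table;

/* Rebuild every slot chain from the global list, in the spirit of a
 * rehash: clear the slot array, then relink each bucket. */
void table_rehash(table *t)
{
    memset(t->slots, 0, t->size * sizeof *t->slots);
    for (bucket *b = t->list_head; b != NULL; b = b->list_next) {
        size_t i = b->h & (t->size - 1);
        b->chain_next = t->slots[i];
        t->slots[i] = b;
    }
}

/* Shrink (or grow) the slot array with realloc, then rehash.
 * Returns 0 on success, -1 if realloc fails (table left unchanged). */
int table_shrink(table *t, size_t new_size)
{
    assert(new_size > 0 && (new_size & (new_size - 1)) == 0);
    bucket **s = realloc(t->slots, new_size * sizeof *s);
    if (s == NULL)
        return -1;
    t->slots = s;
    t->size = new_size;
    table_rehash(t);
    return 0;
}
```

Because the buckets live on their own list, the slot array can be thrown away and rebuilt at any size; only the chains get longer as the array gets smaller.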
Re: [PHP-DEV] Re: Multi-level Caching Fork of OPcache -- update
On 10/06/13 09:20, Dmitry Stogov wrote: Sorry for slow response. I'm very busy with other work and have no time for MLC OPcache review. I don't think we can include it into main tree before 5.5.0 release anyway. But in general I think we may include your work in the future releases. Also, thanks for useful reports about problems you've found in OPcache :) Thanks. Dmitry. Dmitry, One useful side-effect of writing the MLC support is that I've really had to take apart the core OPcache code to understand how it works. It's probably the first in-depth review that this extension has had from someone _outside_ the Zend team, so it's only to be expected that anyone doing this would find a few issues. What I do think needs to be said is that you guys have done a fantastic job here in this development. 9 times out of 10 when I've initially thought "why didn't they do it this way?" when digging into the code, I've dug down in and discovered that you already had, or had approached it a better way. IMO, the whole OPcache approach is tighter and more sound than that of APC. Take one example of this: the 2-pass algo for compiled scripts which enables the storage for a compiled script to be allocated as a single storage unit. This has two major performance benefits at runtime: 1) The memory allocator overheads of preparing scripts for execution (and deallocation at rundown) are reduced by more than an order of magnitude. 2) The memory needed to execute the script is in a contiguous memory area, and this gives improved hardware (as in L1/L2/L3) caching which passes through to a runtime performance improvement. There are a couple of things that I would refactor if I had written OPcache. (I'll raise a couple of issues on these to discuss what I mean in more depth, and when the MLC work reaches a plateau, if you think it's worthwhile, I can cut a couple of branches to show you a possible solution.) A) The SMA startup bootstrap is just messy and needs refactoring. 
B) The simple dead-and-rebirth method of refreshing caches isn't going to scale well on real systems. Terry (Note the new email addr that I am using for php.net work, as this one isn't being blocked by the php.net server)
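The 2-pass idea praised above can be shown on something much simpler than a compiled script. The sketch below packs a set of strings into one allocation: the first pass only measures, the second allocates once and copies. The function names `packed_size` and `pack_strings` are hypothetical and this is an illustration of the pattern, not OPcache's persistence code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pass 1: total bytes needed to store n strings, including their NULs. */
size_t packed_size(const char *const *strs, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(strs[i]) + 1;
    return total;
}

/* Pass 2: make ONE allocation of exactly that size, copy each string
 * into it back to back, and record where each copy landed in out[].
 * The caller releases everything with a single free() of the result. */
char *pack_strings(const char *const *strs, size_t n, char **out)
{
    size_t total = packed_size(strs, n);
    char *block = malloc(total ? total : 1);
    if (block == NULL)
        return NULL;

    char *cursor = block;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strs[i]) + 1;
        memcpy(cursor, strs[i], len);
        out[i] = cursor;
        cursor += len;
    }
    /* The sizing pass and the copying pass must agree exactly. */
    assert((size_t)(cursor - block) == total);
    return block;
}
```

One allocation and one free replace n of each, and the copies sit contiguously in memory, which are the two benefits listed in the message.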
[PHP-DEV] Re: Multi-level Caching Fork of OPcache -- update
Dmitry, Hi Terry, I don't have time right now (on this week), but I'll definitely take a look into your patch later. Thanks. Dmitry. Hi and thanks for this. I won't have the full functionality in place for another month or so, though my pushes to my github repository should be fully functional on the main path and subject to caveats in the TODO, etc., so it's just more general guidance when you get time, e.g. "I would be happier if you approached X this way", or "don't forget to address issue Y which we've been burnt on in the past..." Also have a scan through the wiki pages for B/G design info. If you guys want, I could also do the equivalent for standard OPcache down the line, since I now have a pretty intimate knowledge of how it works; I would just need to know the target audience that you would like to address. Regards Terry On Sat, May 4, 2013 at 5:29 PM, Terry Ellison te...@ellisons.org.uk wrote: Please treat this email by way of request for feedback from the OPcache developers and anyone interested in influencing my next steps on my https://github.com/TerryE/opcache fork of OPcache and specifically on the dev-filecache branch. The most appropriate channel is probably https://github.com/TerryE/opcache/issues -- unless you think that the comments have wider applicability for either the PECL or DEV communities. My ultimate aim is to take this to a point where the OPcache developers feel sufficiently comfortable to consider merging a future version back into OPcache. I have added some detailed project wiki pages documenting my scope and progress and in particular on https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and a brief quote from the page: An indication of the potential performance benefits of OPcache CLI mode can be seen from a simple benchmark based on 100 executions of the MediaWiki runJobs.php maintenance batch script. This compiles some 44 PHP sources, comprising 45K lines and 1,312 Kbytes. 
The cached version reads a single runJobs.cache file of 1,013 Kbytes.

Time in mSec             Average   Stdev
Uncached Execution           179       7
Cached Execution              77       7
(Image Load Overhead)         18       3

In other words for this script, the MLC cache is delivering an approximate 60% runtime saving. Of course this is only a point test, and benefits will vary -- though I hope that switching to LZ4 compression will improve these figures further. But even this one point challenges what seems to be a core PHP development dogma: there's no point in using a file cache, because it makes no material performance difference. Even this build *does* deliver material benefits, and I suggest that there is merit in moving to including MLC cached modes to accelerate CLI and CGI SAPI modes using this or a similar approach. From an internals -- rather than PECL -- viewpoint what this would mean is that non-cached incremental compile-and-go execution modes would now be the exception rather than the norm -- largely negating the disadvantages of any compile-intensive optimization options. So thank-you in anticipation for your feedback. I will do my utmost to respond constructively to all comments. :-) Regards Terry Ellison PS. Apologies in advance: I am up country at my cottage on an Island in the north Aegean with the nearest Wifi some walk away, so my Internet access is limited at the moment, and I might take some time to respond.
[PHP-DEV] Multi-level Caching Fork of OPcache -- update
Please treat this email by way of request for feedback from the OPcache developers and anyone interested in influencing my next steps on my https://github.com/TerryE/opcache fork of OPcache and specifically on the dev-filecache branch. The most appropriate channel is probably https://github.com/TerryE/opcache/issues -- unless you think that the comments have wider applicability for either the PECL or DEV communities. My ultimate aim is to take this to a point where the OPcache developers feel sufficiently comfortable to consider merging a future version back into OPcache. I have added some detailed project wiki pages documenting my scope and progress and in particular on https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and a brief quote from the page: An indication of the potential performance benefits of OPcache CLI mode can be seen from a simple benchmark based on 100 executions of the MediaWiki runJobs.php maintenance batch script. This compiles some 44 PHP sources, comprising 45K lines and 1,312 Kbytes. The cached version reads a single runJobs.cache file of 1,013 Kbytes. Time in mSec Average Stdev Uncached Execution 179 7 Cached Execution77 7 (Image Load Overhead) 18 3 In other words for this script, the MLC cache is delivering an approximate 60% runtime saving. Of course this is only a point test, and benefits will vary -- though I hope that switching to LZ4 compression will improve these figures further. But even this one point challenges what seems to be a core PHP development dogma: there's no point in using a file cache, because it makes no material performance difference. Even this build *does* deliver material benefits , and I suggest that there is merit in moving to including MLC cached modes to accelerate CLI and GCI SAPI modes using this or a similar approach. 
From an internals -- rather than PECL -- viewpoint what this would mean is that non-cached incremental compile-and-go execution modes would now be the exception rather than the norm -- largely negating the disadvantages of any compile-intensive optimization options. So thank-you in anticipation for your feedback. I will do my utmost to respond constructively to all comments. :-) Regards Terry Ellison PS. Apologies in advance: I am up country at my cottage on an island in the north Aegean with the nearest Wifi some walk away, so my Internet access is limited at the moment, and I might take some time to respond.
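For anyone wanting to check the "approximately 60%" figure, here is a minimal sketch of the arithmetic using only the averages quoted in the table. The function name is mine, and the second calculation rests on an assumption the message does not state: that the image load overhead is a fixed cost common to both runs and can be subtracted from each.

```php
<?php
// Averages copied from the benchmark table in the message (milliseconds).
$uncachedMs  = 179.0;
$cachedMs    = 77.0;
$imageLoadMs = 18.0;

/** Percentage saving of $after relative to $before (hypothetical helper). */
function runtime_saving_percent(float $before, float $after): float
{
    return ($before - $after) / $before * 100.0;
}

// Raw wall-clock saving: (179 - 77) / 179.
printf("Raw saving:      %.1f%%\n", runtime_saving_percent($uncachedMs, $cachedMs));

// ASSUMPTION: image load overhead is common to both runs, so remove it first.
printf("Net of overhead: %.1f%%\n", runtime_saving_percent(
    $uncachedMs - $imageLoadMs,
    $cachedMs - $imageLoadMs
));
```

The raw figure comes out a little under 60% and the overhead-adjusted one a little over, so "approximately 60%" is a fair summary either way.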
Re: [PHP-DEV] Continuous Integration Atomic Deploys and PHP 5.5
Rasmus,

snip
- Request 1 starts before the deploy and loads script A, B
- Deploy to a separate directory and the docroot symlink now points to here
- Request 2 starts and loads A, B, C
- Request 1 was a bit slow and gets to load C now

The issues that you raise about introducing atomic versioning in the script namespace do need to be addressed to avoid material service disruption during application version upgrade. However, surely another facet of the O+ architecture also frustrates this deployment model. My reading is that O+ processes each new (cache-miss) compile request by first sizing the memory requirements for the compiled source and then allocating a single brick from (one of) the SMA at its high water mark. Stale cache entries are marked as corrupt and their storage is then allocated to wasted_shared_memory with no attempt to reuse it. SMA exhaustion or the % wastage exceeding a threshold ultimately triggers a process shutdown cascade. This strategy is lean and fast, but as far as I understand it, it ultimately uses a process death cascade and population rebirth to implement garbage collection. Wouldn't your non-stop models require a more stable reuse architecture which recycles wasted memory without the death cascade? Perhaps one of the Zend team could correct my inference if I've got it wrong again :-( Regards Terry
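To make the concern concrete, here is a toy model of the allocation strategy as I read it above. It is a sketch only: the class, its method names and the threshold value are hypothetical, and it is not taken from the O+ source. It shows why redeploying the same scripts repeatedly can only ever push the arena towards a restart.

```php
<?php
// Toy model of the strategy described in the message; NOT the real O+ allocator.
final class ToySharedArena
{
    private int $size;                 // total bytes in the arena
    private float $maxWastedFraction;  // hypothetical restart threshold
    private int $highWater = 0;        // bump pointer: next free byte
    private int $wasted = 0;           // bytes owned by stale entries
    /** @var array<string,int> script name => size of its live brick */
    private array $live = [];

    public function __construct(int $size, float $maxWastedFraction)
    {
        $this->size = $size;
        $this->maxWastedFraction = $maxWastedFraction;
    }

    /** Cache a (re)compiled script; returns false when the arena is exhausted. */
    public function store(string $script, int $bytes): bool
    {
        if (isset($this->live[$script])) {
            // The stale brick is only counted as waste, never reused.
            $this->wasted += $this->live[$script];
            unset($this->live[$script]);
        }
        if ($this->highWater + $bytes > $this->size) {
            return false;
        }
        $this->highWater += $bytes;    // allocate at the high water mark
        $this->live[$script] = $bytes;
        return true;
    }

    public function wastedFraction(): float
    {
        return $this->wasted / $this->size;
    }

    /** In this model a restart is the only way to reclaim wasted bytes. */
    public function needsRestart(): bool
    {
        return $this->wastedFraction() > $this->maxWastedFraction;
    }
}

$arena = new ToySharedArena(1000, 0.05);
$arena->store('a.php', 100);       // first deploy
$arena->store('a.php', 100);       // redeploy: the old 100 bytes become waste
var_dump($arena->needsRestart());
```

A reuse architecture would replace the "count it as waste" branch with a free list or compaction step, which is exactly the extra complexity that the lean design avoids.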
[PHP-DEV] Re: [PECL-DEV] [Proposal] New Extension Yac (a user data cache base on shared memory without locks)
On 23/03/13 06:29, Laruence wrote: since Zend O+ has bundled into PHP since 5.5, and O+ is really a bit faster than APC, so people may want to migrate to O+, but there is no User Data Cache in O+ ... Laruence, you are correct that O+ doesn't provide data caching, but what about memcached and the PECL packages that support it? http://pecl.php.net/package/memcache and http://pecl.php.net/package/memcached Regards Terry
Re: [PHP-DEV] Re: [PECL-DEV] [Proposal] New Extension Yac (a user data cache base on shared memory without locks)
On 23/03/13 09:46, Matīss Roberts Treinis wrote: Memcached is distributed caching system, where as APC's user data cache is not. Memcached requires separate server instance (memcached) to operate. APC does not. Yes, but there is nothing to stop an admin of an application-dedicated system or VM configuring and using an in-server memcached. Also, APC's user cache is 5+ times faster than memcached. If some extension is to provide this functionality, it has to be as close as possible in possibilities and speed as APC's implementation has. Memcached is not and never hasn't been an alternative for APC, they are meant for two different jobs. I also agree that memcache is slower because it is out of process and that for some usecases the relative speed differences due to these context switches will impact application performance. Yes, they have different sweet-spots and operational characteristics, but for many usecases the relative impact will be immaterial, and memcached can be a perfectly acceptable substitute. Applications which are closely coupled to high APC data cache usage will probably stay with APC for the foreseeable future. An SMA-based data cache would be a useful adjunct to O+, so I will be interested in this, but I just don't see this filling a show-stopper gap that must be addressed as a priority. snip Laurence, you are correct that O+ doesn't provide data caching, but what about memcached and the PECL packages that support it? http://pecl.php.net/package/memcache and http://pecl.php.net/package/memcached
Re: [PHP-DEV] O+ support for PHP 5.x (Was current Status of O+ on Windows)
On 02/03/13 08:39, Zeev Suraski wrote: The current vote that's going on right now deals with putting the extension into PHP itself. If that happens (which seems awfully likely at this point), why do we need it in PECL? My response to your Q is that there is probably going to be quite a lot of interest in an O+ package that is usable with PHP 5.3 and 5.4. Surely a PECL package will have a quicker uptake in terms of getting it out into the wider PHP developer community and into production, especially if the main Linux distros add a precompiled php5-optimizer-plus package (or whatever their naming convention is). Would you see such O+ support for the existing supported versions best done through the PECL route or swept up into a maintenance dot release? Regards Terry
[PHP-DEV] Optimizer+ bugreps
At what point is O+ reporting going to be possible through https://bugs.php.net/ ? I realize that this is a bit of a catch-22, but surely it would be better to allow properly tracked open bug reporting sooner rather than later? Regards Terry
Re: [PHP-DEV] Optimizer+ bugreps
On 02/03/13 09:34, Pierre Joye wrote: Having it in PECL right now allows that. But as of now it is not a PHP.net project so it makes little sense to have it listed there. On Mar 2, 2013 10:33 AM, Terry Ellison te...@ellisons.org.uk wrote: At what point is O+ reporting going to be possible through https://bugs.php.net/ ? I realize that this is a bit of a catch-22, but surely it would be better to allow properly tracked open bug reporting sooner rather than later? Thanks Pierre, I understand and that's why I mentioned catch-22. AFAIK, there's no open bug and issue reporting available prior to its formal adoption, even though we all realize that adoption is going to be pretty much inevitable -- for compelling reasons -- and by the time it is adopted the first release will be a fait accompli.
Re: [PHP-DEV] Optimizer+ bugreps
On 02/03/13 17:42, Christopher Jones wrote: I realize that this is a bit of a catch-22, but surely it would be better to allow properly tracked open bug reporting sooner rather than later? Bugs can (and have been) reported via https://github.com/zend-dev/ZendOptimizerPlus/issues I'm sure email reports will also do fine in the interim. I guess this is a case of Duh, my bad. Zeev gave the github URI in his initial announcement. I should have done a 1+1 ... Thanks Chris, sometimes an =2 is very useful :oops; //Terry
Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()
On 03/02/13 15:27, Hans-Juergen Petrich wrote: In this example (using php-5.4.11 on Linux) the memory will grow non-stop:

    for ( $fp = fopen('/dev/urandom', 'rb'); true;) {
        eval ('$ano_fnc = function() {$x = "'.bin2hex(fread($fp, mt_rand(1, 1))).'";};');
        echo "Mem usage: ".memory_get_usage()."\n";
    }

But in this example not:

    for ( $fp = fopen('/dev/urandom', 'rb'); true;) {
        eval ('$ano_fnc = function() {$x = "'.bin2hex(fread($fp, 1)).'";};');
        echo "Mem usage: ".memory_get_usage()."\n";
    }

Hans-Juergen, I've raised a bugrep https://bugs.php.net/bug.php?id=64291 which you might want to review and add any appropriate comments. I had to think about this one. It's worthwhile observing that this second example is the only occasion, as far as I know, that PHP does any garbage collection of code objects before request shutdown. For example create_function() objects are given the name \0LambdaN where N is the count of the number of created functions so far in this request. They are registered in the function table and persist until request shutdown. That's the way PHP handles them by design. As I said in the bugrep, the normal behaviour of persistence is what you want, because if you think about the normal use of the anonymous function, say

    while (!feof($fp)) {
        $line = preg_replace_callback(
            '|p\s*\w|',
            function($matches) {return strtolower($matches[0]);},
            fgets($fp)
        );
        echo $line;
    }

then the anonymous function is compiled once and rebound to the closure object which is passed as the second argument for the callback each time through the loop. OK, doing the closure CTOR/DTOR once per loop is not the cleverest of ideas, and this is the sort of thing that would be hoisted out of the loop in a language which did such optimization, but PHP doesn't. It's a LOT better than compiling a new function each loop (which is how Example #1 on http://php.net/manual/en/function.preg-replace-callback.php does it!) This is what you want to happen.
It's just too complicated for PHP to work out whether or not the function might be rebound to. I suspect the bug here is really the implicit assumption that the magic function name generated by the eval ('$ano_fnc = function() { ... }'); is unique, but as your examples show, thanks to garbage collection and reuse of memory, sometimes it isn't. In these circumstances, thanks to the use of a hash update and the table DTOR, the old one is deleted. So assume that what you are doing is exploiting a bug; my advice is not to do this. It might be fixed in a future release. Regards Terry
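As a sketch of the hoisting mentioned above, the closure can be created once outside the loop by hand, so that the compiled function is bound to a single closure object for the whole run. The function name and the pattern in the usage line are my own choices for illustration; they are not from the original example.

```php
<?php
/**
 * Lower-case every match of $pattern in each line.
 * Hypothetical helper: the closure is built once, not once per iteration.
 */
function lowercase_matches(array $lines, string $pattern): array
{
    // Created a single time, then reused for every line.
    $toLower = function (array $matches): string {
        return strtolower($matches[0]);
    };

    $out = [];
    foreach ($lines as $line) {
        $out[] = preg_replace_callback($pattern, $toLower, $line);
    }
    return $out;
}

// Usage: lower-case each run of capital letters.
print_r(lowercase_matches(['Hello WORLD'], '/[A-Z]+/'));
```

Either way the function body is compiled only once; what the manual hoisting saves is the per-iteration closure construction, which is small next to the cost of an eval() that compiles fresh code on every pass.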
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
Rasmus, thanks for your detailed response. NFS is so common for sharing files that ... This is simply not true. I do have a fair bit of experience in this field, and I don't know of any major sites that do this and I have worked with a good chunk of the largest sites out there. Eh??? Fortune 500 enterprises and governmental departments are pretty conservative. NAS and SAN based iSCSI and FCoE based elastic block storage give great performance for server-specific file-systems, but Brendon is right: for distributed file systems, NFS and CIFS still dominate. By major I meant traffic-wise, not Fortune-500, although there are some of those on the list too. I mostly work with medium-to-large scale Internet companies. Think Yahoo, Facebook, Flickr, Digg, Etsy, WePay, Room77. These types of companies would never consider serving all their Web traffic from NFS. Yes, Yahoo had a ton of Netapp filers as well, but this was for shared data storage, they would never consider putting their application logic on them. Now I agree with you: for this sector of Internet B2C companies, their business is centred around a small number of apps that dominate their revenue streams, so of course they are free to design their infrastructure architecture to optimize the performance of these apps. I also accept that this sector was and is directly or indirectly the major funder of PHP development effort. However, my counterpoint is that this is no longer the only infrastructure usecase for PHP. Now mature, it has entered other sectors, and Brendon's and Daniel's posts highlight two of them: * Enterprise use as Brendon raises. Enterprises have moved to use internet based technologies to automate internal business processes. These apps work on the company intranet, not on the internet. So when you book a car or go to your bank or order a part from a manufacturer, the assistant may well be sitting in front of a PHP app that never sees the internet but is still core to that business.
Thanks partly to the flexibility of cloud resources, CIOs and CTOs are increasingly looking at open technologies such as PHP to replace MS ones. Incidentally IMO, it's this sort of business stream that will provide hard funding to value-add companies such as Zend. * The hosting service providers as Daniel raises. In terms of sheer numbers this is the largest community of PHP users, who buy their +/- $100pa service from a hosting provider. They still care about performance. The providers care about the efficiency of their infrastructure. They (initially) use PHP because Wordpress, Mediawiki, ... are written in it. But this is also a major entry vehicle for a new generation of PHP developers to get an initial internet presence. If PHP runs 3x slower than language X, and X is just as flexible, then we are putting up unnecessary barriers to their entry and turning away that new cadre. This is also something that has been like this for 10+ years and nobody has stepped up to fix it so far. It shouldn't be news to anyone that stats and opens over NFS are slow. I am not sure why it should suddenly be an urgent problem for us at this point. But like I said, we may get to it. It's not suddenly urgent, but perhaps this is more a question of hitting a tipping point where it might now be wise to address this issue. If the integrated opcode cache happens it becomes easier to manage the flow between the compiler, the cache and the executor and we can probably optimize some things there. +1 And as I mentioned in another thread, let's see some RFCs proposing how to fix some of these things rather than simply posting I wish the PHP devs would do this.. type of messages. These go over really badly with most of the longtime contributors here and they even tend to have the opposite of the desired effect. As I have posted separately, I forked and then rewrote APC to address this sweet spot.
OK, my LPC is very much bug-ridden alpha code that fails 10% of the PHP test suite, largely due to extension interoperability issues, and I've had other things to do this last month -- including deciding whether to switch to a proper O+ delta. However, my aim was for me to use this as an evaluation test bed, not a serious production contender. Now that I've written an opcode cache which runs Mediawiki under php-cgi (with ~ 5% of the NFS getattrs, BTW), the next step is rolling some key tweaks into the Zend compiler, execution environment and APC -- which I understand well, so this should be straightforward -- or O+ -- which I don't as yet. My challenge is deciding (i) do I work on PHP 5.6 / 5.7 and the corresponding beta APC version, which at current rates of adoption might begin to have an impact in the community sometime in the next 5 years, or (ii) work on a performance patch to the stable APC version which is typically installed with PHP 5.3, which these guys could apply within a few months.
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 22/02/13 11:20, Ferenc Kovacs wrote: My challenge is deciding (i) do I work on PHP 5.6 / 5.7 and the corresponding beta APC version which at current rates of adoption might have begin to have an impact in the community sometime in the next 5 years, or (ii) work on a performance patch to the stable APC version which is typically installed with PHP 5.3 which these guys could apply within a few months. or contribute those patches back and integrate them into the vanilla apc? Humm. I think that we are sort of saying the same thing, but at cross purposes. Of course I should offer up any patches for mainstream APC, and at best these will go into 3.1.14 or 3.1.15 and may then get adopted for production systems at some point -- and that's only if the release of a core O+ doesn't drop APC into legacy status. However Ubuntu 12.04-LTS is a good example of a stable production stack and this uses PHP 5.3.10 and APC 3.1.7. Debian Squeeze is even further behind and it runs PHP 5.3.3 and APC 3.1.3. A performance patch could also be made available based on the last stable version of APC, say 3.1.9 -- that is, before the attempts to support the new PHP 5.4 features destabilised it. With this patch, at least individual system admins would have the option to download a stable version from PECL and patch it for use with their production stacks within the next 3-6 months. Regards Terry
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 19/02/13 01:30, Kevin Yung wrote: In our environment, we use NFS for shared storage, we are using APC as well with stat=0. In our setting, we also experiencing high number of stat() calls on our file system. My initial finding of this problem is we enabled the open_basedir setting. And there is already a bug report for this, https://bugs.php.net/bug.php?id=52312 We tested the issue in 5.2.x, 5.3.x and 5.4.x, all of them experiencing same issue. Kevin, I've just walked through this in 5.3 and 5.4 and updated this bugrep. In short there is some silly coding here which should be addressed. Even if we accept that PHP should comply with http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-5178 if open_basedir is set, the cache should only be ignored on the actual open itself, as this is the only one that is exploitable, but let's have this debate on the bugrep. Let me think about the security and other NFRs and propose a patch.
Re: [PHP-DEV] Give the Language a Rest motion (fwd)
Here is a counterpoint to that expressed by Lars. Many if not most shared hosting providers don't offer PHP 5.4 yet. Ditto many enterprises have yet to adopt it. The main reason? I think it's that old Backwards Compatibility issue that has been discussed heavily on this DL. When major apps like MediaWiki break with a new release of PHP (see http://www.mediawiki.org/wiki/Compatibility, and this is quite typical), upgrading PHP versions represents a major headache for both hosting providers and larger enterprises that want to maintain standard infrastructure build templates, as each non-BC PHP upgrade represents a major cost in either loss of customer satisfaction or IT investment for little or no tangible business benefit. New features are often nice for the app developer, so the result is that they then get used by apps development teams, and the provider or infrastructure team then has to manage the ripple effects on a complex infrastructure permitted configuration matrix across hundreds or thousands of apps. I am not saying that the PHP dev team should freeze PHP, but what I am suggesting is that the PHP team should also consider the compatibility impacts across versions so that enterprises and hosting providers who have adopted PHP can control their through-life maintenance costs. There are things that the PHP team could do to help mitigate this issue -- for example producing standard templates so that, say, PHP 5.3, 5.4 and 5.5 based apps can coexist and perform (e.g. with APC or O+) on the *same* Apache2 (or nginx, ...) stack. Change is good, but too much change too fast without regard to the cost consequence will ultimately alienate the CIOs and CTOs who set platform policies. On 20/02/13 23:15, Lars Strojny wrote: As a general reply: I’d like to disagree, and here is why. Yes, we should not let half baked features in but we need to add more and more features, also syntax wise.
For three reasons: - Parity/expectations/me too: so you can do that in PHP as well - Expressiveness: allow better ways to express the same idea in more concise ways - Innovation: bring unexpected features to the language people didn’t even expect Let’s recall a few of the latest additions: - 5.3: namespaces. Provided the foundation for awesome stuff like PSR-1, which in turn provides the foundation for the even more awesome stuff composer is. - 5.3: goto. A good thing we can do it. I'm not sure for what exactly but I am sure there is somebody out there :) - 5.3: Closures, huge thing for us, a matter of parity to other languages. Really changes the face of a lot of APIs (see e.g. Doctrine transactional(), the whole micro framework movement, React) - 5.4: Closures 2.0 with $this binding. Honestly, without it, Closures are a little meh. But it was good we waited and got it right. - 5.4: Short array syntax. A parity/culture issue. - 5.4: Traits, I am happy we got horizontal reuse right - 5.4: array dereferencing. Very small but useful. To me it feels more like a bugfix - 5.4: callable type hint. Small change with a huge impact - 5.5: Generators, also a matter of parity and a matter of awesomeness - 5.5: ClassName::class syntax. A really good improvement to the overall usability of namespaces. Just imagine how much shorter unit test setUp() methods will become What we have on our list that, from my perspective, will sooner or later hit us: - Property accessors in some form or another: a lot of people seem to like it. - Annotation support: we would have a lot of customers for it. - Autoboxing for primitives. Allows us to fix a lot of problems in ext/standard. - Unicode. Obviously. - Named parameters. A recurring topic might be a topic worth digging deeper. - I'm positive the Generics discussion will arise at some point again. 
… and these are just the changes on a syntax/semantics level, I'm not talking about all the awesome technologies (one of which you are working on) we need to integrate tighter and eventually bundle with core. I don’t believe we should let our users outgrow the language, quite the opposite, we should grow with our users and the broader web community, otherwise we will fail. PHP is nowadays used for tasks it never was intended to be used but that’s a good thing. We need to continuously adapt. What’s true for software projects is true for languages: stop improving actually reduces its value over time. cu, Lars
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 21/02/13 23:38, Rasmus Lerdorf wrote: On 02/21/2013 03:15 PM, Brendon Colby wrote: NFS is so common for sharing files that saying Wow, people are still serving web files over NFS? is like saying Wow, people are still using the ls command to list directory contents on Linux? I think NFS is still very widely used, even for sharing web files. This is simply not true. I do have a fair bit of experience in this field, and I don't know of any major sites that do this and I have worked with a good chunk of the largest sites out there. Eh??? Fortune 500 enterprises and governmental departments are pretty conservative. NAS and SAN based iSCSI and FCoE based elastic block storage give great performance for server-specific file-systems, but Brendon is right: for distributed file systems, NFS and CIFS still dominate. I don't think the appropriate answer is don't use NFS because this is ridiculous as a long term solution (NFS is common, and people are going to use it or something similar). I think the appropriate answer is to update PHP to use stat vs. open+fstat or doing something similar that would be optimized for both local AND shared file systems (I would be writing a patch instead of this email if I could). If it is of such importance to you and you are not able to do it yourself, then hire someone to do it. We may or may not get around to it, but like most things in PHP, we work on what we need ourselves and I don't think anybody here would even consider putting all their PHP files on an NFS share when performance was important. Again wrong. Apps developers don't do this because they want to; they do it because the IT services group that runs the production infrastructure has mandated standard templates for live deployment to keep the through-life cost of providing the application infrastructure manageable, with all sorts of bureaucratic exception processes if you think that your app is a special case.
If you are lucky then they offer a range of EC2-like standard VM templates so that you can deploy an EBS-based approach, but most are still in catch-up mode compared to Amazon's offerings. If you are a GM or an American Airlines, or the NIH for example, then you will have 1,000s of applications, customer-facing and internal, and you've got to adopt this approach for 90+% of these applications. What Brendon is asking for is reasonable, sensible and in relative terms easy to implement. However, I agree that it may be more sensible to use community effort to achieve this.
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 20/02/13 08:26, Stas Malyshev wrote: That depends of what your error handlers do. Some may write to log files, etc. if not configured properly (since error_reporting setting doesn't have to be considered in it). IIRC, for most of the cases O+ should be able to resolve all includes/requires on cached files without syscalls - but file_exists is a different matter since caching it in generic case can be very dangerous. I am not suggesting caching file_exists but rather encouraging coding patterns which avoid its use /if/ the application is intended to give good cached performance -- e.g. apps like mwiki, wordpress and drupal. I guess that I should bite the bullet and switch to 5.5. I've been working on an evaluator fork of APC optimized for CLI/CGI which tackles a lot of these issues head on and performs reasonably well. I realise that this is a dead-end and will never get deployed, but I am currently considering regressing some of this technology into 5.5 and O+. Are you interested in a version of O+ which supports all SAPIs? I think right now O+ can support CLI (provided enable_cli is set) but for most cases it's kind of useless since scripts are rarely re-included in common CLI scenarios. So if you know how to improve common CLI scenarios it may be interesting, though I imagine it's not the most common use case. But if it adds there without problems for anything else, why not. Let me get a dev stack for 5.5 and O+ up and then I can comment on this. As far as APC's support for CLI goes -- the cache doesn't connect to the apache2 SMA even if you run the cli process in the same UID, and since the SMA gets released when the process count goes to zero, caching doesn't really do anything, as the cache is discarded between cli executions, so in practice you always run uncached.
As to the frequency of the use case, maybe not in the case of cli, but cgi is still tremendously common, as shared hosting providers still need to implement UID-based separation of scripting environments for different user accounts. There are still significant scaling issues with php-fpm in this usecase, so only a few SHPs offer FastCGI support and then only as a premium option. BTW, I saw that you recommended in one of your blog posts that anyone who cares about performance should use a bytecode cache. However, a lot of app maintainers who use a shared hosting service do care about performance but don't have this as an option :-( For example with my cli-based LPC opcode cache, I can do a series of make test runs (1) with the opcache disabled; (2) with it enabled but the cache empty and (3) with it enabled and the cache(s) primed. Ideally all three (*) should give the same test results, and this gives you confidence that the cache is working properly. I would like to do this with APC or O+, but in practice AFAIK, I can't do (3), so how do I or the developers know what PHP and extension features aren't being supported / working properly when the code is cached? (*) With the tests as currently written, some fail with case (3) because the runs are missing compiler warnings which are only generated if the code isn't cached (and in the case of my cache at its current build quite a few fail because it still doesn't play nicely with other extensions like phar.)
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 20/02/13 23:52, Brendon Colby wrote: Terry, Thanks for your detailed input. This is essentially what I did in my test setup. APC is definitely critical to the performance of our site. I'm just very curious to see how much impact the high number of getattr operations coming from the Apache/PHP servers is having on the filer. It looks like we might be able to move our PHP files to local storage to at least measure the impact this is having. Brendon Brendon, I think that PHP is a great platform; it's just that PHP's sweet spot is a little off the optimum for typical scalable enterprise infrastructures. IMO, the main issue that you should consider and discuss with your infrastructure support team is how to mitigate the impacts for such high-throughput business critical applications. As you (and Rasmus previously, IIRC) mention, one key issue is what is on local and what is on network attached storage. On one system that I got hauled in to troubleshoot, it turned out that the main problem was that the PHP session data was being written to a directory in shared storage, causing write-through overload that ended up hitting half a dozen key apps. It was a trivial configuration change to move it to local storage, transforming performance. (I apologise if I am teaching you how to suck eggs, but this is really for others tracking this thread.) The main issue here is that the file hierarchies that must have some integrity over the application tier need to be on the NAS; everything else is a matter of convenience vs performance, so for example having a cron job to rsync the app's PHP hierarchy onto local storage might well transform your app performance and give an indirect boost to cohosted apps. Happy hunting :)
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
The point that this thread highlights is that apps developers / administrators at both ends of the scale -- the enterprise and the shared service user -- normally have little say over the infrastructure architecture on which their application runs. In both these cases the infrastructure will be hosting 100s if not 1000s of apps, and the environment cannot be tailored to any one app. So issues like storage architecture and security architecture (e.g. UID-based enforcement of app separation) are a given as NFRs (non-functional requirements). Use of NFS with server / storage separation is still a standard implementation design pattern, when enterprises require simple-to-manage scalability on these tiers. We aren't going to change this, and neither will the O/P. Surely PHP will achieve better penetration and the greater acceptance if it can offer more robust performance in this sort of environment? Sure, but then you can go with something like Redis. But, again, if you go back to the original question, this has nothing to do with often-changing data in a couple of PHP include files: We have several Apache 2.2 / PHP 5.4 / APC 3.1.13 servers all serving mostly PHP over NFS (we have separate servers for static content). So they are serving up all their PHP over NFS for some reason.
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 19/02/13 09:36, Terry Ellison wrote: The point that this thread highlights is that apps developers / administrators at both ends of the scale -- the enterprise and the shared service user -- normally have little say over the infrastructure architecture on which their application runs. In both these cases the infrastructure will be hosting 100s if not 1000s of apps, and the environment cannot be tailored to any one app. So issues like storage architecture and security architecture (e.g. UID-based enforcement of app separation) are a given as NFRs (non-functional requirements). Use of NFS with server / storage separation is still a standard implementation design pattern, when enterprises require simple-to-manage scalability on these tiers. We aren't going to change this, and neither will the O/P. Surely PHP will achieve better penetration and the greater acceptance if it can offer more robust performance in this sort of environment? Sure, but then you can go with something like Redis. But, again, if you go back to the original question, this has nothing to do with often-changing data in a couple of PHP include files: We have several Apache 2.2 / PHP 5.4 / APC 3.1.13 servers all serving mostly PHP over NFS (we have separate servers for static content). So they are serving up all their PHP over NFS for some reason. Brendon, Just to follow up with a bit more detail, apart from the obvious NFS tuning with things like the actimeo mount parameters, you can get a better idea of what is going on if you use a local copy of one of your apps running under a local linux apache server. This is an example from my Ubuntu laptop, but it works just as well on a test VM if you still use WinXX on your PC. 
    trace_wget () {
        sleep 1
        coproc sudo strace -p $(ps -C apache2 --no-headers -o pid) -tt -o /tmp/strace$1.log
        sleep 1; ST_PID=$(ps -C strace --no-headers -o pid)
        wget "http://localhost/$2" -O /dev/null -o /dev/null
        sleep 2; sudo kill $ST_PID
    }
    # start Apache in debug mode
    sudo service apache2 stop
    sudo bash -c ". /etc/apache2/envvars; coproc apache2 -X"
    # trace three gets
    trace_wget 0 phpinfo.php
    trace_wget 1 mwiki/index.php?title=Main_Page
    trace_wget 2 mwiki/index.php?title=Main_Page
    # restart normally
    sudo service apache2 stop
    sudo service apache2 start
    sudo chmod 777 /tmp/strace?.log
    grep -c "open(" /tmp/strace?.log
    grep -c "stat(" /tmp/strace?.log

The first get is just to load and prime the mod_php5 thread (Ubuntu has a crazy localtime implementation), the second loads and compiles the PHP MediaWiki modules needed to render a page, and the third repeats this with the code now fully cached in APC. In this case, we have:

                        open()   fstat()/lstat()
    /tmp/strace1.log     108          650        (Priming the APC cache)
    /tmp/strace2.log      27          209        (Using the APC cache)

So APC *does* materially reduce the I/O calls, and (if you look at the traces) it removes most of the mmaps and compilation. The MWiki autoloader uses require() to load classes, but in the code as a whole the most common method of loading modules is require_once() (414 uses as opposed to 44 in total of the other include or require functions) and, as I said in my previous post, this I/O is avoidable by recoding the ZEND_INCLUDE_OR_EVAL handler to work cooperatively with the opcode cache when present. Note that even when the cache does this, it still can't optimize coding patterns like:

    if ( file_exists( "$IP/AdminSettings.php" ) ) {
        require( "$IP/AdminSettings.php" );
    }

and IIRC mwiki does 4 of these on a page render. Here a pattern based on (@include($file) == 1) is more cache-friendly.
Now this isn't a material performance issue when the code tree is mounted on local storage (as is invariably the case for a developer test configuration) as the stat / open overhead of a pre-(file) cached file is microseconds, but it is material on scalable production infrastructure which uses network attached NFS or NTFS file systems. So in summary, this is largely avoidable but unfortunately I don't think that we can persuade the dev team to address this issue, even if I gave them the solution as a patch :-( Regards Terry Ellison
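To make the file_exists() point concrete, here is a minimal C sketch of the two access patterns at the system-call level. It is only an analogue: the helper names and the fs_calls counter are invented for illustration, and it does not model APC or the PHP stream layer. The point it shows is that "check, then open" touches the filesystem twice for a file that exists, while "just try to open" touches it once, and on NFS each touch can become a round trip to the server.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Counts filesystem calls made by the helpers below (illustrative only). */
static int fs_calls = 0;

/* Pattern 1: test for existence, then open. Two calls when the file exists. */
static int load_checked(const char *path)
{
    struct stat st;

    assert(path != NULL);
    fs_calls++;
    if (stat(path, &st) != 0)
        return 0;               /* missing: skip quietly */
    fs_calls++;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Pattern 2: try to open and treat failure as "missing". Always one call. */
static int load_direct(const char *path)
{
    assert(path != NULL);
    fs_calls++;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

Both helpers return 1 when the file could be opened and 0 otherwise, so a caller can swap one for the other without changing its logic; only the number of filesystem touches differs.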
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 19/02/13 20:32, Stas Malyshev wrote: Hi! and IIRC mwiki does 4 of these on a page render. Here a pattern based on (@include($file) == 1) is more cache-friendly. This goes back to the fact that @ does not really disable errors - it only disables reporting of the errors, but the whole message is generated and goes all the cycle up to the actual error reporting before being suppressed. I've tried to address this problem a number of times but looks like there's really no way to do it without sacrificing some parts of current error reporting such as track_errors. Yup, in this scenario we're trading off two things. What the pattern is saying is "if the source file exists then compile it". However, if the file exists in, say, 90+% of the cases, and the code is often cached, then the file_exists check generates I/O that isn't needed if the file is already APC (or O+ / xcache) cached. Yup, if the file is missing then the error is generated and trapped in the reporting cycle, but the final outcome is that the result is false and the false==1 test fails. Other than the wasted cycles in running up this stack, there aren't any other side-effects that I am aware of. Are there any? Yes, this is an overhead, but it is small beer compared to doing a getattr() RPC across the server room fabric to a NAS server backend. So in summary, this is largely avoidable but unfortunately I don't think that we can persuade the dev team to address this issue, even if I gave them the solution as a patch :-( Could you explain what you're talking about here? The fact that you are engaging in this dialogue is great and I really appreciate it. What I am still trying to work out is how to interact with you guys in a way that is mutually productive. I did make the mistake of choosing the mainstay stable version in use on live prod systems -- 5.3 -- to take this whole issue apart, and I guess that the dev team regards this as history. I guess that I should bite the bullet and switch to 5.5.
I've been working on an evaluator fork of APC optimized for CLI/CGI which tackles a lot of these issues head on and performs reasonably well. I realise that this is a dead end and will never get deployed, but I am currently considering porting some of this technology into 5.5 and O+. Are you interested in a version of O+ which supports all SAPIs? In architectural terms I feel that having a universal cache option is important. It changes the mindset from something which is only used in niche performance use cases to a standard option. It also means that the cache can be stress tested by the entire php test suite. However, to do some of this you don't start with a patch, but with an RFC informed by evidence, and that's my real reason for doing this demonstrator. //Terry
Re: [PHP-DEV] PHP causing high number of NFS getattr operations?
On 18/02/13 21:47, Brendon Colby wrote: On Mon, Feb 18, 2013 at 4:32 PM, Damien Tournoud d...@damz.org wrote: Assuming that those are relative includes, can you try with: apc.canonicalize=0 apc.stat=0 Paths are absolute. stat=0 (and canonicalize=0 just to try it) produced the same result. Brendon Brendon, are your scripts doing a lot of include_once / require_once calls? If you look at ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() in Zend/zend_vm_execute.h then you will see that this Zend handler does zend_resolve_path() for any xxx_ONCE instructions and zend_stream_open() on the first request of the same. Yes, APC rehooks this handler with a wrapper, but this is only for lazy loading. When it honors the xxx_once instructions, it will still open the streams even if the code itself is fully cached in APC and the I/O is entirely nugatory. I suspect that this could generate the NFS traffic that you are discussing. This would be easy to avoid, but it would require replacing this handler entirely or doing dynamic code patching, neither of which APC currently does, I believe. Incidentally, because this is a Zend feature and nothing directly to do with APC, O+ will also have this runtime characteristic. //Terry
Re: [PHP-DEV] Zend Optimizer+ Source Code now available
On 15/02/13 01:59, Stas Malyshev wrote: (A) The op-code optimization should be integrated into the core compiler and enabled through a GC(compiler_option) to be available to *any* opcode cache -- or to the application designer (by exposing these options through an INI directive). Most optimizations would not give perceivable benefit unless the optimized code is run many times. So enabling it without opcode cache would not produce very big benefit. But yes, in theory these code parts can live separately and they don't actually need each other. My point here is that APC and the other opcode caches would also benefit from these optimizations; they really belong to the compiler and not to an opcode accelerator. Also yes, this type of optimisation usually has no net benefit unless #runs / compile is much greater than one; however, there are scenarios (e.g. some nasty iteration-intensive batch process) where taking the hit on optimisation still produces a net runtime saving. that support it). A Zend opcode cache belongs firmly in the Zend world and shouldn't be a PHP extension. I must say I don't understand this conclusion. Put simply, PHP extensions should only reference the APIs exposed in the php headers. Zend has its own interface and extensions, and since a Zend opcode cache is SO intimately coupled with the Zend environment it makes sense to use a Zend extension to implement this. The whole idea of opcode caching just isn't relevant to a Hiphop environment, or even for that matter a CLR or Java one, since these do their own internal caching anyway. I also note some interesting difference in approaches between O+ and APC, say for example: 1) in the handling of the ZEND_INCLUDE_OR_EVAL instruction where APC patches the op handler and O+ does peephole opcode examination. Both these workarounds could be avoided by tidying up the scope of work in the instruction code. Could you explain this a bit more?
As I posted in a separate note to this list, I am developing my own LPC extension, a fork of APC optimized for CLI and CGI use. The dogma here is that there is no point in opcode caching because performance isn't a driver. Well, my experience is that performance is always a driver. For example, just because your application is running on a shared service where UID-enforced security must be applied, and therefore some form of scalable per-UID execution environment must be used, doesn't mean that you don't care about performance. LPC still gives a ~2x throughput improvement on MediaWiki, for example. I forked and rewrote this cache extension because I couldn't stretch the APC code to do what I want in a performant manner without breaking it. It's a demonstrator to help me understand the real issues here and to get to grips with the minutiae of PHP caching technologies. I just don't see this ever getting to production, but I must admit that porting some of this technology onto O+ is an option that interests me, because this does hold out the potential of a standard optimiser extension that supports all mainstream SAPIs. Now to ZEND_INCLUDE_OR_EVAL. My rationale here is that if the admin has specified a stat=0 (or equiv) option then this is a statement that the cache's content and metadata can be trusted, so there is no point in examining or reading source files. However, in the case of the xxx_once variants of this instruction, the ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() does path resolution and in the case of first load opens the source stream. Why? Why not just leave examination of the source stream to the compile function? LPC (like APC, though for different reasons) has to rehook the ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() handler with a wrapper that does dynamic code modification to prevent this access occurring. O+ adopts a different strategy.
LPC or APC having to walk through and essentially patch over (private) static constant op_handler vectors is just plain horrible. No, IMO the Zend architecture should be designed to support caching; it should present a proper clean interface that extensions such as O+ and APC implement; and the entire PHP test suite should be capable of being executed with a cache extension in place, and a good cache should not introduce any test failures. Sorry for the rant and I hope that I answered your Q :)
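The stat=0 rationale above can be sketched as a cache-first lookup. This is a toy model under stated assumptions, not Zend, APC or O+ code: struct op_cache, trust_cache and include_file() are hypothetical names, and "opening the source" is reduced to incrementing a counter. It only illustrates the ordering being argued for, namely that when the cache is trusted and already holds the script, no path resolution or stream open should be needed.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 8

/* Hypothetical in-memory cache: remembers which paths are already compiled. */
struct op_cache {
    const char *paths[CACHE_SLOTS]; /* caller keeps these strings alive */
    int used;
    int trust_cache;    /* models a stat=0 style setting */
    int source_opens;   /* how often we fell back to the source file */
};

static int cache_has(const struct op_cache *c, const char *path)
{
    for (int i = 0; i < c->used; i++)
        if (strcmp(c->paths[i], path) == 0)
            return 1;
    return 0;
}

/* Returns 1 if served from the cache, 0 if the source had to be opened. */
static int include_file(struct op_cache *c, const char *path)
{
    if (c->trust_cache && cache_has(c, path))
        return 1;               /* no path resolution, no stream open */

    c->source_opens++;          /* stands in for resolve + open + compile */
    if (!cache_has(c, path)) {
        assert(c->used < CACHE_SLOTS);
        c->paths[c->used++] = path;
    }
    return 0;
}
```

With trust_cache set, a second include of the same path leaves source_opens unchanged; with it cleared, every include goes back to the source even though the path is cached, which is the behaviour the message objects to.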
Re: [PHP-DEV] Zend Optimizer+ Source Code now available
On 14/02/13 18:24, Stas Malyshev wrote: Are optimizations documented? Not yet AFAIK. No, but they are pretty self-explanatory. O+ is a _Zend_ extension rather than a _PHP_ extension and this enables it to exploit extra hooks (see the tail of ZendAccelerator.c) and specifically follow through accel_op_array_handler() and the routines in the Optimizer subdirectory. Essentially this hook is invoked as an epilogue to the generating of any op_array. What this does is a number of peephole optimizations to simplify and NOP out instruction sequences, and the last pass compresses the code, removing dead NOPs to shrink the op_array -- this is typical of the sorts of things that the optimization passes of a compiler would do. And this is a segue into one architectural issue that immediately struck me on scanning the code: surely there is a natural domain separation between compilation, image startup/rundown, and execution. (i) is optimally done once per S/W version, (ii) per request, (iii) per instruction executed. Surely O+ is currently a hybrid of (i) and (ii), and whilst this might have occurred for understandable historical reasons, I question this rationale going forward. (A) The op-code optimization should be integrated into the core compiler and enabled through a GC(compiler_option) to be available to *any* opcode cache -- or to the application designer (by exposing these options through an INI directive). (B) The O+ opcode cache itself is logically quite separate. It makes great sense to keep this as a Zend extension (given the desire from some of the dev team to maintain a clear logical separation between the upper PHP environment and the Zend, Hiphop, ... execution environments that support it). A Zend opcode cache belongs firmly in the Zend world and shouldn't be a PHP extension.
I also note some interesting differences in approach between O+ and APC, say for example: 1) in the handling of the ZEND_INCLUDE_OR_EVAL instruction, where APC patches the op handler and O+ does peephole opcode examination. Both these workarounds could be avoided by tidying up the scope of work in the instruction code. 2) in the treatment of early binding class inheritance: APC includes some reasonably complex logic to back this out; O+ sets a compiler option to disable this inheritance occurring in the first place, an approach that APC might want to copy.
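As an illustration of the "compress out dead NOPs" pass described above, here is a small C sketch. It is not the O+ implementation: the opcode set, struct op and compact_nops() are made up for the example, and a real op_array has many more operand kinds to fix up. What it does show is the one step that makes such a pass more than a simple filter: jump targets have to be remapped to the new instruction indexes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 64

enum opcode { OP_NOP, OP_ADD, OP_ECHO, OP_JMP };

struct op {
    enum opcode code;
    size_t target;      /* index of the jump destination; used by OP_JMP only */
};

/* Removes OP_NOP entries in place and returns the new instruction count. */
static size_t compact_nops(struct op *ops, size_t n)
{
    size_t new_index[MAX_OPS];
    size_t out = 0;

    assert(n <= MAX_OPS);

    /* Pass 1: where will each old instruction land? A NOP maps to the
     * index of the next surviving instruction. */
    for (size_t i = 0; i < n; i++) {
        new_index[i] = out;
        if (ops[i].code != OP_NOP)
            out++;
    }

    /* Pass 2: copy survivors down and rewrite jump targets. */
    out = 0;
    for (size_t i = 0; i < n; i++) {
        struct op o = ops[i];
        if (o.code == OP_NOP)
            continue;
        if (o.code == OP_JMP) {
            assert(o.target < n);
            o.target = new_index[o.target];
        }
        ops[out++] = o;
    }
    return out;
}
```

A jump whose target was itself a NOP ends up pointing at the instruction that followed it, which keeps the control flow equivalent in this simplified model.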
Re: [PHP-DEV] double val to long val conversion issue
On 10/02/13 06:50, Stas Malyshev wrote: isn't the case with visualC, and PHP internal data structures compiled with visualC and gcc are significantly different; for example hash keys are 32 bits long on Windows and 64bits on *nix. Why aren't they 32bits, Yes, they are different, because long size is different, This is my point: programmers from the *nix world tend to assume that longs are longer than ints. In the MS world they are synonymous. This is one reason why code that has been developed in the *nix world can be difficult to get working reliably in the MS world. and PHP uses long (more specific, ulong) to store hash values. This is because numeric values are long, I don't follow this reasoning. Numeric value in this context is a PHP (application space) concept. Hashes are internal to the Zend EE and are never exposed to the PHP programmer, so this is a case of comparing apples and pears. All that having a 64-bit hash does (when used % nTableSize, which is smaller than maxint) is to add 8 bytes to every Bucket entry stored by the EE for no practical benefit. (It's 8 bytes because "ulong h; uint nKeyLength; long boundary" takes 16 bytes but "uint32 h; uint nKeyLength; long boundary" takes 8 because of data alignment.) and it's easier to use the same type for both than bother with converting back and forth. As you noted, the difference for hashing is minimal since the value is anyway brought to hash size, and hashes of size more than 32-bit LONG_MAX aren't very practical. However, it matters in other parts of the code.
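The alignment argument about the Bucket header can be checked with a few lines of C. The two structs below are simplified stand-ins invented for this sketch, not the real Zend Bucket, and the 8-byte figure assumes a typical 64-bit ABI where pointers are 8 bytes and 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the head of a hash bucket (hypothetical). */
struct head_hash64 {
    uint64_t h;            /* 64-bit hash, as with ulong on LP64 */
    uint32_t nKeyLength;
    void *pData;           /* next member with pointer ("long") alignment */
};

struct head_hash32 {
    uint32_t h;            /* 32-bit hash */
    uint32_t nKeyLength;
    void *pData;
};

static_assert(sizeof(struct head_hash32) <= sizeof(struct head_hash64),
              "a narrower hash field never grows the header");

/* Bytes saved per bucket by narrowing the hash field. */
static size_t bytes_saved_per_bucket(void)
{
    return sizeof(struct head_hash64) - sizeof(struct head_hash32);
}
```

On such an ABI the first struct pads nKeyLength out to the next 8-byte boundary, so the header before pData occupies 16 bytes, while the second packs both 32-bit fields into 8 bytes; bytes_saved_per_bucket() then reports the 8 bytes mentioned in the message.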
Re: [PHP-DEV] [RFC] Integrating Zend Optimizer+ into the PHP distribution
Following the discussion at the end of last week, I prepared a draft RFC for the inclusion of Optimizer+ in PHP. In parallel we’re in the process of prepping the source code for independent public consumption, which I hope we can be done with by the end of next week, hopefully sooner.
Re: [PHP-DEV] [RFC] Integrating Zend Optimizer+ into the PHP distribution
On 29/01/13 08:03, Zeev Suraski wrote: Following the discussion at the end of last week, I prepared a draft RFC for the inclusion of Optimizer+ in PHP. In parallel we’re in the process of prepping the source code for independent public consumption, which I hope we can be done with by the end of next week, hopefully sooner.

It's great news that Zend Technologies has decided to open-source Optimizer+, and given that it is now the end of next week I look forward to seeing this code any day. So thanks to you for this decision. But now to specific comments on your RFC.

1. Scope of the RFC. IMO, the RFC covers four separate issues that would be easier to review, refine and agree if they were kept separate:
   a. Zend's decision to open-source O+. This is entirely within Zend Technologies and outside the scope of any RFC.
   b. The establishment and proper architecture and support of an opcode-cache interface within the Zend Execution Engine (EE). I will discuss this below.
   c. The decision to include Optimizer+ as a core extension within the PHP project. However, as at the time of this draft only Zend employees -- and selected Zend-approved 2nd parties who have signed the appropriate NDAs -- have access to the Optimizer+ source and are therefore able to review its content. Surely such open access is a precondition, and it makes no sense to issue an RFC to inform this decision until at least a few months after the source has been made widely available for review.
   d. The project decision to give any specific opcode-cache extension a preferred status over the alternative opcode-caches. Such a decision is going to be contentious and -- unless carefully, transparently and fairly managed -- could lead to conflict within the project. Not good.
   So I would suggest that the RFC limit itself to non-contentious claims relating to one optimizer's performance over another.

2. The Detailed Content. The Introduction will need redrafting depending on the proposed / revised scope of this RFC. Some form of definition / description of both a PHP opcode-cache and PHP data-cache needs including in the PHP wiki, but this would sit better under the https://wiki.php.net/internals hierarchy. This RFC should simply wiki-link to this page on the first use of [[opcode cache]]. The "Interaction with other extensions and plugins" section is surely a general statement of requirement that should apply to _any_ opcode cache and not just Optimizer+, so again this content belongs in a separate Wiki document with a wiki-link here. The "Alternatives" section is really a comparison of APC and Optimizer+ and I suggest that some points are contentious. The same point applies to the remaining sections. Surely this sort of comparison only becomes necessary when we've reached a stage where we are asking voters to choose a preferred cache, and in that case wouldn't it be more appropriate to agree the selection / assessment criteria first before carrying out a selection exercise?

3. Why do I suggest an Opcode-Cache interface RFC? The current Zend 2.x engines provide some hooks which enable the main opcode caches -- including Optimizer+ and APC -- to deliver accelerated performance for many application usecases. However, some aspects of hooking an opcode cache into the Zend EE remain somewhat of a compromise. These include:
   a. The management of early vs. late binding and the work-arounds that opcode caches must do to back out unwanted early binding.
   b. Some essential functions that the caches must hook into are not exposed as hooks (like zend_compile_file) and are sometimes implemented using static functions, leading to the cache needing to reimplement chunks of zend code.
   c. There should be a clear scoping separation of what the (cached) compile does and what the EE does. An example of where this is mixed is in the ZEND_INCLUDE_OR_EVAL_xxx_HANDLER functions which resolve paths and open source files in the case of the xxx_once functions. This file access is usually unnecessary in the case of cached files as the op-code cache has already cached the relevant information.

Given that opcode caches are now core to PHP performance, it should be possible to implement a cache using hooks and interfaces exported through a Zend header file and without recoding bits of the engine. Optimizer+ should be an exemplar of such an approach. Regards Terry Ellison
Re: [PHP-DEV] double val to long val conversion issue
On 09/02/13 15:47, Pierre Joye wrote: hi Remi On Sat, Feb 9, 2013 at 4:10 PM, Remi Collet r...@fedoraproject.org wrote: About http://git.php.net/?p=php-src.git;a=commitdiff;h=79956330fe17cfd5f60de456497541b21a89bddf (For now, I have reverted this fix) Here some explanations. LONG_MAX is 9223372036854775807 (0x7fff) double representation of LONG_MAX is 9223372036854775808 (d > LONG_MAX) is evaluated in double space. So it is false for a double which has the same value as (double)LONG_MAX. So, for (double)LONG_MAX the cast used is (long)d 9223372036854775807 on ppc64 9223372036854775808 on x86_64 (gcc without optimization) 9223372036854775807 on x86_64 (gcc -O2) PHP expected value is 9223372036854775808 (Btw, I don't understand why PHP, built on x86_64 with -O2, gives the good result, some environment mystery) Obviously, we could have different results on different platforms, compilers and architectures. I will be very interested by results on other platforms (mac, windows), compilers (Visual C), architectures. If we switch to the unsigned cast: (long)(unsigned long)d; Any comments ? IIRC, on windows/visualC, no matter if it is x86 or x64, long is always 32bits, so it won't change the size of long. See http://en.wikipedia.org/wiki/LLP64#64-bit_data_models for a good description of this mess. AFAIK many packages that target both 32 and 64 bit environments, MS and *nix, explicitly adopt XXX_int32, XXX_uint32, XXX_int64, XXX_uint64, ... datatypes and use wrappers to map these onto the appropriate visualC / gcc types. As far as I can see, PHP doesn't, and seems to use long and int almost interchangeably, which causes problems as LP64/I32LP64 and LLP64/IL32P64 are very different. This is one reason for 64-bit support on Windows being problematic. It would be good for PHP to have a road map to remove data model-specific potholes, say by 5.6 or 5.7. //Terry
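The comparison problem described above, where (double)LONG_MAX rounds up to 2^63 so that a check of the form d > LONG_MAX misses it, can be shown with fixed-width types. The sketch below uses int64_t so that it does not depend on the platform's long; the function names and the clamping policy are assumptions made for illustration and are not a statement of what PHP does or should do. In C the out-of-range cast itself is undefined behaviour, which is consistent with the differing results reported per compiler and platform.

```c
#include <assert.h>
#include <stdint.h>

/* 2^63 as a double: the value that (double)INT64_MAX rounds up to. */
#define TWO_POW_63 9223372036854775808.0

/* True only if d can be cast to int64_t without leaving its range. */
static int double_fits_int64(double d)
{
    /* NaN fails the first comparison and is rejected. */
    return d >= -TWO_POW_63 && d < TWO_POW_63;
}

/* Saturating conversion (an illustrative policy, chosen for the sketch). */
static int64_t double_to_int64_clamped(double d)
{
    if (d != d)
        return 0;                   /* NaN */
    if (d >= TWO_POW_63)
        return INT64_MAX;
    if (d < -TWO_POW_63)
        return INT64_MIN;
    return (int64_t)d;              /* in range: the cast is well defined */
}
```

The key detail is that the upper bound is tested with >= against 2^63 as a double, rather than > against the integer maximum, because the integer maximum is not exactly representable as a double.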
Re: [PHP-DEV] double val to long val conversion issue
On 10/02/13 03:25, Stas Malyshev wrote: these onto the appropriate visualC / gcc types. As far as I can see, PHP doesn't and seems to use long and int almost interchangeably which PHP indeed does not use fixed-size types in zvals, etc. but it definitely does not use long and int almost interchangeably. In almost any place where int is used instead of long or vice versa (unless it is a specific small value that is nowhere near limits of either int or long and used in very restricted context) - it is a bug and should be fixed. If you know of such places, please name them or even better, submit a bug report pointing them out. Stas, you are right to correct me. Sorry. However, I still feel that the implicit assumption is that sizeof(long) == 2*sizeof(int), and this isn't the case with visualC, and PHP internal data structures compiled with visualC and gcc are significantly different; for example hash keys are 32 bits long on Windows and 64bits on *nix. Why aren't they 32bits, say, on both? (as there is no performance benefit in having 64bit hash keys when the maximum size of a hash table is an int).
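The claim that a 64-bit hash buys nothing for bucket selection can be sketched as follows, assuming a power-of-two table indexed with a mask (both function names are hypothetical). Under that assumption only the low bits of the hash take part in choosing a slot, so the upper 32 bits are discarded either way. The sketch says nothing about other uses of the hash field, such as holding a numeric key.

```c
#include <assert.h>
#include <stdint.h>

/* Slot for a power-of-two table; mask must be table_size - 1. */
static uint32_t slot_from_hash64(uint64_t h, uint32_t mask)
{
    assert((mask & (mask + 1u)) == 0);  /* mask has the form 2^k - 1 */
    return (uint32_t)(h & mask);
}

/* Same computation starting from a hash already truncated to 32 bits. */
static uint32_t slot_from_hash32(uint32_t h, uint32_t mask)
{
    assert((mask & (mask + 1u)) == 0);
    return h & mask;
}
```

For any mask that fits in 32 bits the two functions agree on the low 32 bits of the same hash, which is the sense in which the extra width is unused when picking a bucket.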
[PHP-DEV] A quick intro for Terry Ellison and my extension work
I've posted a few times to this list and may do so some more, so as a courtesy I thought that I should give a short intro on me and what I am doing here. By way of a personal background, I retired early from HP five years ago due to ill-health, though I have since recovered. I was a Distinguished SE for what that was worth, and have been programming C/C++ and PHP for a LONG time. Now I am a gentleman of leisure (AKA old fart) and now do it for pleasure only. One thing about PHP that has always puzzled me was that we've never developed a good opcode caching solution for CGI and CLI use, as the various opcode caches are add-on extensions rather than PHP core and really targeted at high-volume single-UID environments. I see two main problems with this: 1. Multi-account service providers must employ UID-based mandatory separation of processes, and no other shared approach meets even minimal security requirements. 2. Whilst opcode caching only applies to, and is only an option for, some PHP runtime environments and not all, it is not properly architected into the PHP compile and run environment, and hence the opcode caches -- by necessity -- seem to include quite a few clumsy workarounds to be able to run. Opcode caching, or at least a clear interface to such caches, should be a PHP core feature for all execution modes. What I've been doing for the last six months is to develop a Lightweight Program Cache (LPC) extension optimised for CGI and CLI use, to get a better understanding of the problems and possible approaches to address these. LPC started life as a fork of APC, but has turned into a stripped-down rewrite. I see this primarily as a demonstrator and a vehicle for my really getting to grips with the internals of the PHP compiler and execution engine.
If anyone is interested or wants to get an overview of how PHP code caches work, have a read of: https://github.com/TerryE/php-extensions/blob/master/lpc/TECHNOTES.txt I doubt that it will ever be anything more than beta code, so I don't need a VCS account or any other karmas. Nonetheless it does essentially work -- it roughly doubles the throughput of MediaWiki with no memory leaks. However, it is a long way away from being at a level where I would suggest anyone pull the repository and play with it. I am still at Zend 2.3 and I still need to roll in the extra 2.4 functional changes. It fails the PHP test suite for some extensions, and I am currently tracking down a bunch of issues from the php5/Zend/test failures (yup -- one advantage of running in CLI mode is that I can test the extension against the entire PHP test suite in two passes -- one to build the per-script caches and a repeat to run against the cached version). Anyway, regards to you all, Terry Ellison
Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()
On 04/02/13 10:57, Ángel González wrote: snip The memory will stop growing (on my machine) at ~2491584 bytes and the loop is able to run forever, creating each eval() furthermore unique ano-function's but not endless-filling Zend-internal tables. but this still leaves the function record itself in the function_table hash so with a non-zero reference count and this doesn't get DTORed until request shutdown Not familiar with the Zend-internals but just about so i was imaging and expecting it. That why i [still] also confused/wondering why in the 2nd example the memory will not grow *endless*. It seems that the function records in the function_table will be DTORed (or similar cleaned up) before request-shutdown at some point... Could this be the case? As you are reassigning $ano_fnc, the old closure is being destructed. Had you used create_function(), it wouldn't happen. Now the question is, if it is correctly freeing the functions (and it is good that it does so), why is it not doing it when they have different lengths? It's a bug. The Closure class DTOR does not delete the dereferenced function from the CG(function_table). If you did the eval at line 20 in, say, /tmp/x.php then the INCLUDE_OR_EVAL instruction calls the Zend compiler with the args: (1) the source to be compiled and (2) the title "/tmp/x.php(20) : eval()'d code". The compiler then gives the closure function a magic name: "\0{closure}/tmp/x.php(20) : eval()'d code0x" where 0x is the hex address of the function substring in the evaluated string. The compiler uses a zend_hash_update to insert this into the CG(function_table). What happens if you use a fixed-length string, replacing another string of the same length and dropping its refcount to 0, is that the allocator is clever and will tend to reallocate the old one; hence the address of the string is the same, the address of the offset of the function substring is the same, and so it regenerates the same magic name -- pretty much as an accidental side-effect.
When this happens, it's this hash update function that calls the DTOR on any pre-existing function with this name. I simply put a breakpoint on the relevant line in the zend_do_begin_function_declaration() code and if you used a fixed offset into the same string you only got one {closure} entry. If the allocation ended up randomizing the address, then the {closure} entries grew until memory exhaustion. As I said -- interesting. Need to think about the consequences before I submit a bugrep. Regards Terry
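The mechanism described above, a key built from the eval title plus a source address, with a hash update replacing any entry that has the same key, can be modelled in a few lines of C. This is a toy under stated assumptions and not Zend code: the fixed-size table, make_key() and table_update() are invented for the sketch, and the leading NUL byte of the real magic name is omitted so that the key can be handled as an ordinary C string.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_FUNCS 16
#define KEY_LEN 128

/* Toy stand-in for a function table keyed by name. */
struct func_table {
    char keys[MAX_FUNCS][KEY_LEN];
    size_t count;
};

/* Builds "{closure}" + title + hex address of the source text. */
static void make_key(char *out, size_t len, const char *title, uintptr_t addr)
{
    snprintf(out, len, "{closure}%s0x%llx", title, (unsigned long long)addr);
}

/* Update semantics: same key replaces, new key grows the table.
 * Returns the number of entries after the update. */
static size_t table_update(struct func_table *t, const char *title,
                           uintptr_t addr)
{
    char key[KEY_LEN];

    make_key(key, sizeof key, title, addr);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->keys[i], key) == 0)
            return t->count;        /* old entry replaced in place */

    assert(t->count < MAX_FUNCS);
    strcpy(t->keys[t->count++], key);
    return t->count;
}
```

If the allocator hands back the same address on every iteration, the key repeats and the table stays at one entry; if the address varies, every eval adds a new entry, which matches the growth seen in the first test case.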
Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()
Hi Terry and all thank you very much for your response. The only thing that confused me about what you say that the second *doesn't* grow Yes, about that i was [and am still :-)] also confused... why the 2nd one won't grow *non-stop* so I checked and it does -- just the same as the first. Right, it grows, but not non-stop as in the 1st one. The memory will stop growing (on my machine) at ~2491584 bytes and the loop is able to run forever, creating each eval() furthermore unique ano-function's but not endless-filling Zend-internal tables. but this still leaves the function record itself in the function_table hash so with a non-zero reference count and this doesn't get DTORed until request shutdown Not familiar with the Zend-internals but just about so i was imaging and expecting it. That why i [still] also confused/wondering why in the 2nd example the memory will not grow *endless*. It seems that the function records in the function_table will be DTORed (or similar cleaned up) before request-shutdown at some point... Could this be the case? OK, Hans-Jürgen, this one has got me interested. I am developing a fork of APC optimized for cgi and cli use -- but that's a different topic -- though understanding the DTOR processes for compiler objects interests me because of this. I'll go through the code, especially for Closure objects, to understand why. However, thinking through this logically: 1. The fact that the second does stop growing means that the reassignment of the global $ano sets the RC of the previous closure object to zero, triggering DTOR of the lambda function. 2. There is something pathological about the first case which is frustrating garbage collection on the lambda function DTOR. I replaced your inner loop by:

    $len = $argv[1] & 1 ? $argv[2] : mt_rand(1, $argv[2]);
    $str = "'" . bin2hex(fread($fp, $len)) . "'";
    if ($argv[1] & 2) $str = "function() {\$y = $str; };";
    eval ("\$x = $str;");
    echo "Mem usage: ".memory_get_usage()."\n";

to allow me to use arg1 to select one of the four test cases 0..3 and arg2 is the (max) string size, n say. This clearly shows the fact that the memory explosion only occurs if the string is allocated *inside* the lambda function. 4. If you substitute n15 then memory growth rapidly stabilises for PHP 5.3.17 0, but still explodes for n14 5. In the case of PHP 5.4.6 a similar effect occurs except that explosion occurs at n11. 6. The fact that 5.3 and 5.4 are different is notable -- however, the fact that 5.4 is still (eventually) stable for n12 means that this isn't a string interning issue. Interesting. Merits more research :-)
Re: [PHP-DEV] [RFC] Integrating Zend Optimizer+ into the PHP distribution
On 30/01/13 00:54, Rasmus Lerdorf wrote: On 01/29/2013 04:47 PM, Stas Malyshev wrote: Hi! which shows the dreaded zend_optimizerplus.inherited_hack which mimics APC's autofilter hack. I'd love to get rid of this particular bit of confusion/code complexity on the integration. Ohh, this one. IIRC that has to do with conditional definition of classes and the fact that script may be compiled in one environment but loaded in another, which may create difference in class tables, especially combined with early binding for inherited classes. Getting rid of it is not that easy until people stop writing code like: if($foo) return; class Foo extends Bar {} which would work differently depending on if Bar is defined or not. Yes, I am quite familiar with it since we had to handle this in APC too. But I don't think getting rid of it is that hard. It obviously can't be done in the opcode cache because by the time the compiler hands us the op_array we have already lost the FETCH_CLASS opcode which we may or may not need. We need to look at whether that MAKE_NOP() call in the compiler is actually a worthwhile optimization in a future where most people will be running an opcode cache by default. This is one of the prime examples of making the compiler more opcode cache friendly. Yes, it may be at the slight expense of non-opcode cache performance, but with a bundled opcode cache implementation that should be less of a worry. +1. This one makes no sense to me as it simply hoists the zend_do_inheritance() from runtime binding to compile-time, and this binding has to be backed out by any opcode cache to work properly. It might save a few microseconds per class declaration in non-cached performance, but add factors more to cached performance. Why do this?