Re: [PHP-DEV] Re: Always set return_value_ptr?

2013-08-31 Thread Terry Ellison

On 31/08/13 14:13, Nikita Popov wrote:

Is there any particular reason why we only pass return_value_ptr to internal 
functions if they have the ACC_RETURN_REFERENCE flag set?

Why can't we always provide the retval ptr, even for functions that don't 
return by reference? This would allow returning zvals without having to copy 
them first (what RETVAL_ZVAL does).

Motivation for this is the following SO question: 
http://stackoverflow.com/q/17844379/385378


Changes merged. Small benchmark to verify that this indeed avoids the copy: 
https://gist.github.com/nikic/6398090 :)


Nikita,  IMO, this is a material performance optimisation of the PHP 
internals, as it removes one of the most common unnecessary (expensive) 
copies. So thanks for this.


It will be interesting to see the benefit on real apps such as 
MediaWiki.  I'll pull a 5.5 snapshot and compare it to 5.5.2 :-)


Regards
Terry



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Re: Always set return_value_ptr?

2013-08-30 Thread Terry Ellison

On 30/08/13 10:43, Julien Pauli wrote:
On Fri, Aug 30, 2013 at 2:30 AM, Terry Ellison 
ellison.te...@gmail.com wrote:



There's another one in string.c, in PHP_FUNCTION(pathinfo), that
could be applied as well, though there's little performance gain
in avoiding the copy of a 4 element string array.

BTW, looking at this pathinfo code, it doesn't do what the
documentation says it does -- or at least the documentation states that
the optional argument, if present, should be _one_ of PATHINFO_DIRNAME,
PATHINFO_BASENAME, PATHINFO_EXTENSION or PATHINFO_FILENAME.
However, if a bitmask is supplied then this function returns the
element corresponding to the lowest bit value rather than an error
return, for example:

$ php -r 'echo pathinfo("/tmp/x.fred",
PATHINFO_FILENAME|PATHINFO_EXTENSION), "\n";'
fred

This is a bizarre behaviour.   At a minimum the documentation
should actually state what the function does. Or do we bother to
raise a patch to fix this sort of thing, given that returning an
empty string (or more consistently with other functions, NULL) in
this case could create a BC break with existing buggy code?


This is weird, yes.
It's not the lowest bit value that is returned, but the first element 
put in the array (as zend_hash_get_current_data() is used with no 
HashPosition), which is even more confusing.


How to explain that in the documentation ? :|


Yes, I understand that, but the code processes the elements in the 
dirname, basename, filename, extension order, so the two statements are 
equivalent in implementation.


I am an experienced developer but newbie-ish to the PHP developer 
community, and I come back to my Q.  What do we typically do if we come 
across such weird functional behaviour outside the documented use of a 
standard function?


* Shrug our shoulders and say "That's PHP for you.  BC rules."
* Fix the documentation to say what the code actually does
* Fix the code at the next major release, say 5.6 to have sensible error 
behaviour.


Just interested in understanding the consensus policy here.  Do I post a 
fix to the doc; post a fix to the code; or move on to other issues?


Regards Terry


Re: [PHP-DEV] Re: Always set return_value_ptr?

2013-08-29 Thread Terry Ellison

On 27/08/13 10:40, Nikita Popov wrote:

On Sat, Aug 3, 2013 at 8:16 PM, Nikita Popov nikita@gmail.com wrote:


Hi internals!

Is there any particular reason why we only pass return_value_ptr to
internal functions if they have the ACC_RETURN_REFERENCE flag set?

Why can't we always provide the retval ptr, even for functions that don't
return by reference? This would allow returning zvals without having to
copy them first (what RETVAL_ZVAL does).

Motivation for this is the following SO question:
http://stackoverflow.com/q/17844379/385378


Patch for this change can be found here:
https://github.com/php/php-src/pull/420

The patch also adds new macros to allow easy use of this feature called
RETVAL_ZVAL_FAST/RETURN_ZVAL_FAST (anyone got a better name?)

If no one objects I'll merge this sometime soon.
+1.  Though looking through the ext uses, most functions returning an 
array build it directly in return_value and thus avoid the copy.  I also 
see that you've picked up all of the cases in ext/standard/array.c where 
these macros can be applied.


There's another one in string.c, in PHP_FUNCTION(pathinfo), that could 
be applied as well, though there's little performance gain in avoiding 
the copy of a 4 element string array.


BTW, looking at this pathinfo code, it doesn't do what the documentation 
says it does -- or at least the documentation states that the optional 
argument, if present, should be _one_ of PATHINFO_DIRNAME, PATHINFO_BASENAME, 
PATHINFO_EXTENSION or PATHINFO_FILENAME. However, if a bitmask is 
supplied then this function returns the element corresponding to the 
lowest bit value rather than an error return, for example:


$ php -r 'echo pathinfo("/tmp/x.fred", 
PATHINFO_FILENAME|PATHINFO_EXTENSION), "\n";'

fred

This is a bizarre behaviour.   At a minimum the documentation should 
actually state what the function does. Or do we bother to raise a patch 
to fix this sort of thing, given that returning an empty string (or more 
consistently with other functions, NULL) in this case could create a BC 
break with existing buggy code?
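
By way of illustration, here is a minimal sketch of the current behaviour 
and of a safer calling pattern that sidesteps it:

    <?php
    // A bitmask of PATHINFO_* flags is not rejected; a single element comes
    // back rather than an error, as described above.
    var_dump(pathinfo('/tmp/x.fred', PATHINFO_FILENAME | PATHINFO_EXTENSION));
    // string(4) "fred"

    // Safer pattern: take the full associative array and pick out the parts
    // that are actually needed.
    $parts = pathinfo('/tmp/x.fred');
    var_dump($parts['filename'], $parts['extension']);   // "x", "fred"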


Regards
Terry

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?

2013-08-20 Thread Terry Ellison

Johannes,
Thanks, but I'll make some responses

On Mon, 2013-08-19 at 17:05 +0100, Terry Ellison wrote:

By way of background, I've been doing a review of the existing code
base, looking at how to establish a roadmap to extend OPcache
functionality across all supported OSes and SAPIs.  And this raises a
supplementary Q: which OSs and SAPIs should we be supporting for PHP 5.6
anyway?  I would be interested in the views of the dev team on this.

It would be good to agree a list of which OSs are to be supported at PHP

The short version is quite simple: PHP supports everything and nothing.

Our aim is to be portable and have it running anywhere somebody has a C 
compiler and the required libs. On the other hand, in open source spirit, we 
promise nothing.

In reality I expect that most developers use Linux and we have an active Windows 
fraction.

If we promise support for any platform it has two direct consequences:
  - We have to test and verify it
  - We immediately disappoint people who run PHP successfully on edge
platforms

And then more longterm consequences:
  - Mind new platforms
  - Continuous discussions about adding support for new platforms

The current model otoh works quite well.
I understand the nuances of what "support" means in the FLOSS world, but 
at some level we should also be able to look at ourselves in the mirror and 
say that our releases are at a standard that we can feel comfortable with.  
As you say, we have an established Linux base as well as some Windows.  
I would also add a solid BSD user base (FreeBSD, NetBSD, OSX, etc.).   
(Maybe I'd include Solaris, but it's on the way out given Oracle's 
position.)


But what about all of the other obsolete platform code?  We ship this 
with our PHP source, version after version, knowing that it's never being 
exercised or tested.  Surely if and when we want to improve PHP's 
platform support and architecture, then this stuff is like chains 
dragging at our feet.  For example, I know how to make OPcache work well 
for other SAPIs such as cli, mod_fcgid, etc., but only if I can refactor 
the necessary chunks, and only for POSIX and Win32, which covers the 
platforms we've just discussed; worrying about all of the other flavours 
that I could never test just gives me too much brain ache, but I can't 
propose a code refactor if I don't know what to do with it.


PHP (unlike some language alternatives) seems to be doing little to 
improve general performance, and the discussions related to performance 
on this DL are almost non-existent.

5.6, which SAPIs are supported, and a matrix of which SAPIs are
supported on non-threaded and TSRM build variants.

I myself would kill TSRM, but others have reasons to disagree ;-)

In general: There are features which are dependent on operating system, 3rd 
party library or TSRM. This is fine. Based on my statement from above I claim 
(again, there are people who disagree for reasons I follow less than the 
general case above) that nobody who cares about performance uses TSRM, as such 
an opcode cache is not needed in such environments.
Yes, if the Zend engine had first been developed for Windows, then it 
would have supported proper multi-threading from Day 1.  It wasn't, so 
TSRM is a kludge to achieve this.  Dmitry made the comment somewhere 
that enabling TSRM incurs a ~20% performance hit, which I can believe, 
but as far as I can see, WIN32 implementations rely on it for scaling.


Looking at fpm, cgi, etc., all of the SAPIs which rely on a master/child 
process hierarchy for scaling use fork, and they all have a big 
#ifndef ZEND_WIN32 around this code.  OK, CreateProcess incurs more 
startup overhead than forking and there are other startup issues to 
address relating to acquisition of context, but I've written serious 
realtime systems for WIN32 in the past that performed well without 
threads. So I am not sure why this is the case.


However, the lack of opcode caching on complex apps typically halves 
system throughput.  Most if not all admins care about that sort of hit.

Examples of what I am talking about are SAPIs with no clear evidence of
active support (I've listed the last non-bulk change in brackets to give
a measure of the level of support):
  aolserver (2008), caudium (2005), continuity (2004), nsapi (2011),
  phttpd (2002), pi3web (2003), roxon (2002), thttpd (2002),
  tux (2007), webjames (2006)
I realise that some of these may still be actively used with a user
community out there wanting to track current versions, and this is just
a case of "if it ain't broke...".  However, I do wonder when some of these
were actively maintained and routinely tested against the current
versions at release -- and if not then perhaps PHP 5.6 is the correct
point to retire them from the source tarball and configure options?

First thing to note is that the SAPI layer is one of the most stable ones. So 
old SAPIs most likely work.
Secondly: Yes some of them almost certainly can go, when we discussed

Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?

2013-08-20 Thread Terry Ellison

On 20/08/13 16:50, Johannes Schlüter wrote:

[snip] Terry Ellison wrote:

PHP (unlike some language alternatives) seems to be doing little to
improve general performance, and the discussions related to
performance on this DL are almost non-existent.

Looking at any benchmark from 5.2 to 5.3 to 5.4 and 5.5 shows notable
improvements (5.4 to 5.5 maybe not as much as the others); saying "we do
little" is a bit misleading.

But well, it is simpler to do these syntax sugar things we're
bikeshedding about than doing actual core improvements. We have just
very few people fully understanding the engine and being able to improve
it. So such discussions gain no traction.
I apologise if this sounded unreasonably critical, as this wasn't my 
intent.  As it happens, my particular interest is in PHP performance and 
I've got a good understanding of the Zend Engine and opcache, but trying 
to work out how I can contribute effectively to this is difficult for me 
given this lack of traction.


I also know that Dmitry and you guys made some fundamental improvements 
to the 5.4 engine that significantly dropped the op_array sizing and 
led to perhaps an overall 5-15% performance improvement.  I discussed 
this in some depth on my OPcache documentation on this page:


https://github.com/TerryE/opcache/wiki/The-Zend-Engine-and-opcode-caching#wiki-Comments_on_Zend_engine_performance

However, I don't think that this is appreciated in the wider PHP 
community (for example I can't recall it ever being discussed or 
emphasised on StackOverflow).  I feel that it got lost in the reaction 
to APC not working reliably with the early 5.4 dot releases.


Regards Terry


[PHP-DEV] Which OSs and SAPI should PHP 5.6 support?

2013-08-19 Thread Terry Ellison
By way of background, I've been doing a review of the existing code 
base, looking at how to establish a roadmap to extend OPcache 
functionality across all supported OSes and SAPIs.  And this raises a 
supplementary Q: which OSs and SAPIs should we be supporting for PHP 5.6 
anyway?  I would be interested in the views of the dev team on this.


It would be good to agree a list of which OSs are to be supported at PHP 
5.6, which SAPIs are supported, and a matrix of which SAPIs are 
supported on non-threaded and TSRM build variants.


Examples of what I am talking about are SAPIs with no clear evidence of 
active support (I've listed the last non-bulk change in brackets to give 
a measure of the level of support):

aolserver (2008), caudium (2005), continuity (2004), nsapi (2011),
phttpd (2002), pi3web (2003), roxon (2002), thttpd (2002),
tux (2007), webjames (2006)
I realise that some of these may still be actively used with a user 
community out there wanting to track current versions, and this is just 
a case of "if it ain't broke...".  However, I do wonder when some of these 
were actively maintained and routinely tested against the current 
versions at release -- and if not then perhaps PHP 5.6 is the correct 
point to retire them from the source tarball and configure options?


Likewise in the Zend, TSRM, ext/opcache ... sources, there is 
conditional code dependent on BeOS, __sgi, __osf__,  __IRIX__, NSAPI, 
PI3WEB, GNUPTH(*), OS_VXWORKS,  etc. as well as obsolete BSD versions -- 
OSs that are no longer actively supported.  Again I ask the Q: how and 
when are these tested, and if they're not, then shouldn't we retire this support?


Part of my reasons for asking this is work in preparation for OPcache 
issue #118 -- Transparent SHM reuse.  Doing this robustly with good 
performance characteristics -- for *all* currently referenced OSs -- is 
a pain.  Reviewing a range of other best-of-breed packages which use 
shared SMA-based resources, it seems to me that the memcached approach 
is the cleanest:  it uses the POSIX APIs and supports any OSes which 
support these APIs.  If we limited TSRM and OPcache support at PHP 5.6 
to two code variants, POSIX + WIN32, surely this would still cover all 
the major supported OSes?


//Terry Ellison

(*) GNU Pth is still supported, but it prevents utilisation of SMP 
systems and there is minimal performance difference from POSIX 
threads on a single-processor system.


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?

2013-08-19 Thread Terry Ellison

Uwe Schindler wrote:

I would update NSAPI as I always did, there were just no new bugs and code is very stable 
(to the extent of the stability of multithreaded SAPIs). It is still also in use 
on some of my servers, so I would still help support it.
Uwe, if it's used on some of your servers and you are supporting it, then 
it doesn't belong on my suggested list.  :-)

At the moment I have not followed recent commits to SAPI-related code, so I have 
to take a closer look into it. Are there any RFCs related to changes coming in 5.6 for 
OPcache?

Not currently.

-Original Message-
From: Terry Ellison [mailto:ellison.te...@gmail.com]
Sent: Monday, August 19, 2013 6:05 PM
To: internals@lists.php.net
Subject: [PHP-DEV] Which OSs and SAPI should PHP 5.6 support?

By way of background, I've been doing a review of the existing code base
looking at how to establish a roadmap to extend OPcache functionality
across all supported OSes and SAPIs.  And this raises a supplementary Q:
which OSs and SAPIs should we be supporting for PHP 5.6 anyway?  I would
be interested in the views of the dev team on this.

It would be good to agree a list of which OSs are to be supported at PHP 5.6,
which SAPIs are supported, and a matrix of which SAPIs are supported on
non-threaded and TSRM build variants.

Examples of what I am talking about are SAPIs with no clear evidence of
active support (I've listed the last non-bulk change in brackets to give a
measure of the level of support):
  aolserver (2008), caudium (2005), continuity (2004), nsapi (2011),
  phttpd (2002), pi3web (2003), roxon (2002), thttpd (2002),
  tux (2007), webjames (2006)
I realise that some of these may still be actively used with a user community
out there wanting to track current versions, and this is just a case of "if it 
ain't broke...".  However, I do wonder when some of these were actively
maintained and routinely tested against the current versions at release -- and
if not then perhaps PHP 5.6 is the correct point to retire them from the
source tarball and configure options? ...




Re: [PHP-DEV] execute compressed PHP command-line application

2013-07-18 Thread Terry Ellison

crankypuss wrote:

... I don't want to have to modify the interpreter at this point...

Sorry, but this list is for just this purpose, so your post doesn't belong 
on the DL.


Regards Terry

PS. Read up on the PHAR extension and the use of streams.  There's nothing 
stopping you specifying a phar or even a compress.zlib stream on the 
command line.
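
For instance, a thin launcher script is one way to use those wrappers (a 
sketch only; the archive and file names here are hypothetical):

    <?php
    // run.php -- minimal launcher; "app.phar" and "app.php.gz" are made up.
    // A script packed inside a phar archive:
    require 'phar:///path/to/app.phar/cli.php';

    // Or a gzip-compressed script via the compress.zlib wrapper:
    // require 'compress.zlib:///path/to/app.php.gz';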


Re: [PHP-DEV] Moving PHP documentation to Git repository

2013-06-25 Thread Terry Ellison

On 25/06/13 07:46, Christian Stoller wrote:

What do you think about moving the PHP documentation to a Git repository, 
mirrored on Github? Doing this would make it possible for everybody to extend 
the documentation easily by creating pull requests.

Today one has to get an SVN account to edit the docu or you have to use 
https://edit.php.net/ which does not work as expected (at least for me when I 
tried to update some German documentation). My changes have not been integrated 
for some months (I had to write an email to somebody of the doc team to apply 
the changes).

Symfony does it this way (see https://github.com/symfony/symfony-docs/) and I 
like it very much. It is really easy to extend/update parts of the docu which 
are not complete or outdated and I am sure that it is comfortable and 
timesaving for the doc team, too.


+1

Regards Terry Ellison


Re: [PHP-DEV] Request for testers of and feedback on CGI-enabled OPcache

2013-06-22 Thread Terry Ellison

On 22/06/13 00:01, Martin Amps wrote:
When do you expect to have support for 5.5? I’d be happy to test it on 
a few of our servers as soon as you are


Martin Amps | CIO
www.iCracked.com http://www.icracked.com/
iCracked | Redwood City, CA

Martin,
The latest commit supports PHP 5.5.  Only had to change one #ifdef
//Terry


On Jun 21, 2013, at 3:52 AM, Terry Ellison ellison.te...@gmail.com wrote:


The Multi-Level Cache (MLC) OPcache fork typically delivers 80% of 
the performance acceleration of standard OPcache for the CLI and CGI 
SAPI modes. (OPcache and other cache accelerators don't functionally 
support these modes.) The last build is now pretty stable in that it 
runs the PHP test suite and MediaWiki under CGI happily. It also 
has greater savings for the I/O load associated with script 
compilation for these modes. (In other SAPI modes, it runs the 
standard OPcache functionality and therefore delivers 100% of the 
OPcache benefits.)


However, I now need others actively to evaluate this alpha code and 
give feedback on its performance and its configuration interface if 
we are going to move towards promoting the introduction of this or 
some variant thereof into the PHP core. So my request here is to 
those CGI SAPI mode users on these lists to help support this work. 
Many of you have complained about the poor performance of PHP in CGI 
and now is your opportunity to help address this. You will find an 
overview of MLC OPcache at:


https://github.com/TerryE/opcache/wiki/MLC-OPcache-details

and can pull the latest code from:

https://github.com/TerryE/opcache/archive/dev-filecache.zip

If you would like to help then please download, build and try out 
this version, respond to https://github.com/TerryE/opcache/issues/3 and 
use the Github issues tracker for MLC-OPcache-specific discussion. 
Only use these mailing lists for comment that you feel has wider 
interest to the list subscribers.


Thank-you and regards
Terry Ellison

Caveat: I've only tested this Alpha version on 64bit Linux 
configurations for PHP 5.3 and 5.4, and would therefore like to limit 
initial testing to these configurations at this stage.


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php





--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Request for testers of and feedback on CGI-enabled OPcache

2013-06-21 Thread Terry Ellison
The Multi-Level Cache (MLC) OPcache fork typically delivers 80% of the 
performance acceleration of standard OPcache for the CLI and CGI SAPI 
modes. (OPcache and other cache accelerators don't functionally support 
these modes.) The last build is now pretty stable in that it runs 
the PHP test suite and MediaWiki under CGI happily. It also has greater 
savings for the I/O load associated with script compilation for these 
modes. (In other SAPI modes, it runs the standard OPcache functionality 
and therefore delivers 100% of the OPcache benefits.)


However, I now need others actively to evaluate this alpha code and give 
feedback on its performance and its configuration interface if we are 
going to move towards promoting the introduction of this or some variant 
thereof into the PHP core. So my request here is to those CGI SAPI mode 
users on these lists to help support this work. Many of you have 
complained about the poor performance of PHP in CGI and now is your 
opportunity to help address this. You will find an overview of MLC 
OPcache at:


https://github.com/TerryE/opcache/wiki/MLC-OPcache-details

and can pull the latest code from:

https://github.com/TerryE/opcache/archive/dev-filecache.zip

If you would like to help then please download, build and try out this 
version, respond to https://github.com/TerryE/opcache/issues/3 and use 
the Github issues tracker for MLC-OPcache-specific discussion. Only use 
these mailing lists for comment that you feel has wider interest to the 
list subscribers.


Thank-you and regards
Terry Ellison

Caveat: I've only tested this Alpha version on 64bit Linux 
configurations for PHP 5.3 and 5.4, and would therefore like to limit 
initial testing to these configurations at this stage.


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Internal object orientation documentation available!

2013-06-12 Thread Terry Ellison

On 10/06/13 19:33, Nikita Popov wrote:

We just published some rather extensive documentation on internal object
orientation:

 http://www.phpinternalsbook.com/classes_objects.html

This is part of a larger project aimed at documenting the engine and making
it accessible to new contributors.

This looks like an excellent beginning so thanks.  A few general comments:

1)  I notice that your book is "© Copyright 2013, Julien Pauli - 
Anthony Ferrara - Nikita Popov.  All Rights Reserved" rather than GFDL 
or one of the CC variants of open document licences.  The only issue 
that I see here is that I -- and possibly others -- might be a bit 
guarded in providing comment and input if that content was being 
transferred to the authors unconditionally.  Also if you are reserving 
all rights then you will need to be careful to ensure that all the 
content is yours and not extracted from an open or other 3rd party 
source.  Surely this is going to add to your authoring burden?


2)  Wikipedia, for example, contains a lot of good in-depth explanation 
of CompSci concepts and standard patterns such as 
http://en.wikipedia.org/wiki/Hash_table. You might consider where to 
make the content cut: whether you include basic discussion of 101 
principles (e.g. on HashTables), or whether you limit your content to 
their PHP-specific implementation, with suitable references to the 101 
stuff. Tending to the former will make the book a lot longer, albeit 
standalone.  Your call, but I would have thought that the majority of 
the readership by nature will have some CompSci background and so want 
to skip the 101 stuff, or be referenced out to the appropriate in-depth 
WP or other reference.


3) What is your preferred markup format for feedback and contributions?  
E.g. do you maintain an ODF or Docbook XML source under some accessible git 
repository, or is it a case of (for example)


   hashtables/basic_structure.html para at line 138.  Not quite true that
   the arBucket array will never shrink down: you cannot reduce a PHP
   array, you can only grow it.  You can always implement your own
   resizer by realloc'ing the arBucket array and then calling
   zend_hash_rehash() to do this. (This would be a good standard hash
   API function, by the way.)

But good luck and this will be an extremely useful project to help those 
wishing to get to grips with PHP internals.


Regards
Terry

(Resend including internals list)

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Re: Multi-level Caching Fork of OPcache -- update

2013-06-11 Thread Terry Ellison

On 10/06/13 09:20, Dmitry Stogov wrote:

Sorry for slow response.
I'm very busy with other work and have no time for MLC OPcache review.
I don't think we can include it into main tree before 5.5.0 release 
anyway.

But in general I think we may include your work in the future releases.

Also, thanks for useful reports about problems you've found in OPcache :)

Thanks. Dmitry.


Dmitry,

One useful side-effect of writing the MLC support is that I've really 
had to take apart the core OPcache code to understand how it works.  
It's probably the first in-depth review that this extension has had from 
someone _outside_ the Zend team, so it's only to be expected that anyone 
doing this would find a few issues.


What I do think needs to be said is that you guys have done 
a fantastic job here in this development.  9 times out of 10 when I've 
initially thought "why didn't they do it this way?" while digging into 
the code, I've dug down and discovered that you already had, or had 
approached it in a better way.  IMO, the whole OPcache approach is tighter 
and more sound than that of APC.  Take one example of this: the 2-pass 
algo for compiled scripts which enables the storage for a compiled 
script to be allocated as a single storage unit.  This has two major 
performance benefits at runtime:


1)  The memory allocator overheads of preparing scripts for execution 
(and deallocation at rundown) are reduced by more than an order of 
magnitude.
2)  The memory needed to execute the script is in a contiguous memory 
area, and this gives improved hardware (as in L1/L2/L3) caching which 
passes through to a runtime performance improvement.


There are a couple of things that I would refactor if I had written 
OPcache.  (I'll raise a couple of issues on these to discuss what I mean 
in more depth, and when the MLC work reaches a plateau, if you think it's 
worthwhile, I can cut a couple of branches to show you a possible solution.)

A)  The SMA startup bootstrap is just messy and needs refactoring.
B)  The simple death-and-rebirth method of refreshing caches isn't going 
to scale well on real systems.


Terry
(Note the new email addr that I am using for php.net work, as this one 
isn't being blocked by the php.net server)


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: Multi-level Caching Fork of OPcache -- update

2013-05-08 Thread Terry Ellison

Dmitry,

Hi Terry,

I don't have time right now (on this week), but I'll definitely take a 
look into your patch later.


Thanks. Dmitry.


Hi and thanks for this.  I won't have the full functionality in place 
for another month or so, though my pushes to my github repository should 
be fully functional on the main path and subject to caveats in the TODO, 
etc., so it's just more general guidance when you get time, e.g. "I would 
be happier if you approached X this way", or "don't forget to address 
issue Y which we've been burnt on in the past"...


Also have a scan through the wiki pages for B/G design info.  If you 
guys want, I could also do the equivalent for standard OPcache down the 
line, since I now have a pretty intimate knowledge of how it works; I 
would just need to know the target audience that you would like to address.


Regards
Terry



On Sat, May 4, 2013 at 5:29 PM, Terry Ellison te...@ellisons.org.uk wrote:


Please treat this email by way of request for feedback from the
OPcache developers and anyone interested in influencing my next
steps on my https://github.com/TerryE/opcache fork of OPcache and
specifically on the dev-filecache branch.  The most appropriate
channel is probably https://github.com/TerryE/opcache/issues --
unless you think that the comments have wider applicability for
either the PECL or DEV communities.

My ultimate aim is to take this to a point where the OPcache
developers feel sufficiently comfortable to consider merging a
future version back into OPcache.  I have added some detailed
project wiki pages documenting my scope and progress and in
particular on
https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and a
brief quote from the page:

An indication of the potential performance benefits of OPcache
CLI mode can be seen from a simple benchmark based on 100
executions of the MediaWiki runJobs.php maintenance batch
script. This compiles some 44 PHP sources, comprising 45K
lines and 1,312 Kbytes. The cached version reads a single
runJobs.cache file of 1,013 Kbytes.

     Time in mSec            Average   Stdev
     Uncached Execution         179       7
     Cached Execution            77       7
     (Image Load Overhead)       18       3


In other words for this script, the MLC cache is delivering an
approximate 60% runtime saving.  Of course this is only a point
test, and benefits will vary -- though I hope that switching to
LZ4 compression will improve these figures further.  But even this
one point challenges what seems to be a core PHP development
dogma: "there's no point in using a file cache, because it makes
no material performance difference."  Even this build *does*
deliver material benefits, and I suggest that there is merit in
moving to include MLC cached modes to accelerate CLI and CGI
SAPI modes using this or a similar approach.

From an internals -- rather than PECL -- viewpoint what this
would mean is that non-cached incremental compile-and-go execution
modes would now be the exception rather than the norm -- largely negating
the disadvantages of any compile-intensive optimization options.

So thank-you in anticipation for your feedback.  I will do my
utmost to respond constructively to all comments. :-)
Regards Terry Ellison

PS.  Apologies in advance:  I am up country at my cottage on an
Island in the north Aegean with the nearest Wifi some walk away,
so my Internet access is limited at the moment, and I might take
some time to respond.






[PHP-DEV] Multi-level Caching Fork of OPcache -- update

2013-05-04 Thread Terry Ellison
Please treat this email by way of request for feedback from the OPcache 
developers and anyone interested in influencing my next steps on my 
https://github.com/TerryE/opcache fork of OPcache and specifically on 
the dev-filecache branch.  The most appropriate channel is probably 
https://github.com/TerryE/opcache/issues -- unless you think that the 
comments have wider applicability for either the PECL or DEV communities.


My ultimate aim is to take this to a point where the OPcache developers 
feel sufficiently comfortable to consider merging a future version back 
into OPcache.  I have added some detailed project wiki pages documenting 
my scope and progress and in particular on 
https://github.com/TerryE/opcache/wiki/MLC-OPcache-details and a brief 
quote from the page:


An indication of the potential performance benefits of OPcache CLI 
mode can be seen from a simple benchmark based on 100 executions of 
the MediaWiki runJobs.php maintenance batch script. This compiles some 
44 PHP sources, comprising 45K lines and 1,312 Kbytes. The cached 
version reads a single runJobs.cache file of 1,013 Kbytes.


     Time in mSec            Average   Stdev
     Uncached Execution         179       7
     Cached Execution            77       7
     (Image Load Overhead)       18       3


In other words for this script, the MLC cache is delivering an 
approximate 60% runtime saving.  Of course this is only a point test, 
and benefits will vary -- though I hope that switching to LZ4 
compression will improve these figures further.  But even this one point 
challenges what seems to be a core PHP development dogma: "there's no 
point in using a file cache, because it makes no material performance 
difference."  Even this build *does* deliver material benefits, and I 
suggest that there is merit in moving to include MLC cached modes to 
accelerate CLI and CGI SAPI modes using this or a similar approach.


From an internals -- rather than PECL -- viewpoint what this would mean 
is that non-cached incremental compile-and-go execution modes would now 
be the exception rather than the norm -- largely negating the disadvantages of 
any compile-intensive optimization options.


So thank-you in anticipation for your feedback.  I will do my utmost to 
respond constructively to all comments. :-)

Regards Terry Ellison

PS.  Apologies in advance:  I am up country at my cottage on an Island 
in the north Aegean with the nearest Wifi some walk away, so my Internet 
access is limited at the moment, and I might take some time to respond.



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Continuous Integration Atomic Deploys and PHP 5.5

2013-03-24 Thread Terry Ellison

Rasmus,

[snip]

- Request 1 starts before the deploy and loads script A, B
- Deploy to a separate directory and the docroot symlink now points to here
- Request 2 starts and loads A, B, C
- Request 1 was a bit slow and gets to load C now
The issues that you raise about introducing atomic versioning in the 
script namespace do need to be addressed to avoid material service 
disruption during application version upgrade.  However, surely another 
facet of the O+ architecture also frustrates this deployment model.


My reading is that O+ processes each new (cache-miss) compile 
request by first sizing the memory requirements for the compiled source 
and then allocating a single brick from (one of) the SMA at its high 
water mark.  Stale cache entries are marked as corrupt and their storage 
is then allocated to wasted_shared_memory with no attempt to reuse it.  
SMA exhaustion or the % wastage exceeding a threshold ultimately 
triggers a process shutdown cascade.  This  strategy is lean and fast 
but as far as I understand this, it ultimately uses a process death 
cascade and population rebirth to implement garbage collection.


Wouldn't your non-stop models require a more stable reuse 
architecture which recycles wasted memory stably without the death 
cascade?  Perhaps one of the Zend team could correct my inference if 
I've got it wrong again :-(


Regards
Terry


[PHP-DEV] Re: [PECL-DEV] [Proposal] New Extension Yac (a user data cache base on shared memory without locks)

2013-03-23 Thread Terry Ellison

On 23/03/13 06:29, Laruence wrote:

   since Zend O+ has been bundled into PHP since 5.5, and O+ is really a
bit faster than APC,

   so people may want to migrate to O+, but there is no User Data
Cache in O+ ...
Laruence, you are correct that O+ doesn't provide data caching, but what 
about memcached and the PECL packages that support it? 
http://pecl.php.net/package/memcache and 
http://pecl.php.net/package/memcached
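
For a straightforward key/value cache, the apc_store()/apc_fetch() style of 
use maps fairly directly onto the memcached client API -- a minimal sketch, 
assuming the PECL memcached extension and a daemon on localhost (the key 
name and the compute_value() helper are made up):

    <?php
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $value = $mc->get('some_key');
    if ($value === false && $mc->getResultCode() == Memcached::RES_NOTFOUND) {
        $value = compute_value();           // hypothetical expensive calculation
        $mc->set('some_key', $value, 300);  // cache the result for 300 seconds
    }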


Regards
Terry


Re: [PHP-DEV] Re: [PECL-DEV] [Proposal] New Extension Yac (a user data cache base on shared memory without locks)

2013-03-23 Thread Terry Ellison

On 23/03/13 09:46, Matīss Roberts Treinis wrote:

Memcached is a distributed caching system, whereas APC's user data
cache is not. Memcached requires a separate server instance (memcached)
to operate. APC does not.
Yes, but there is nothing to stop an admin of an application-dedicated 
system or VM configuring and using an in-server memcached.

Also, APC's user cache is 5+ times faster
than memcached. If some extension is to provide this functionality, it
has to be as close as possible in capability and speed to APC's
implementation. Memcached is not and never has been an
alternative for APC; they are meant for two different jobs.
I also agree that memcache is slower because it is out of process and 
that for some usecases the relative speed differences due to these 
context switches will impact application performance.  Yes, they have 
different sweet-spots and operational characteristics, but for many 
usecases the relative impact will be immaterial, and memcached can be a 
perfectly acceptable substitute.


Applications which are closely coupled to high APC data cache usage will 
probably stay with APC for the foreseeable future.


An SMA-based data cache would be a useful adjunct to O+, so I will be 
interested in this, but I just don't see this filling a show-stopper gap 
that must be addressed as a priority.

[snip]
Laruence, you are correct that O+ doesn't provide data caching, but what
about memcached and the PECL packages that support it?
http://pecl.php.net/package/memcache and
http://pecl.php.net/package/memcached




Re: [PHP-DEV] O+ support for PHP 5.x (Was current Status of O+ on Windows)

2013-03-02 Thread Terry Ellison

On 02/03/13 08:39, Zeev Suraski wrote:

The current vote that's going on right now deals with putting the extension
into PHP itself.  If that happens (which seems awfully likely at this
point), why do we need it in PECL?
My response to your Q is that there is probably going to be quite a lot 
of interest in an O+ package that is usable with PHP 5.3 and 5.4.  
Surely a PECL package will have a quicker uptake in terms of getting it out 
into the wider PHP developer community and into production, especially 
if the main Linux distros add a precompiled php5-optimizer-plus package 
(or whatever their naming convention is).


Would you see such O+ support for the existing supported versions best 
done through the PECL route or swept up into a maintenance dot release?


Regards Terry


[PHP-DEV] Optimizer+ bugreps

2013-03-02 Thread Terry Ellison
At what point is O+ bug reporting going to be possible through 
https://bugs.php.net/ ?


I realize that this is a bit of a catch-22, but surely it would be 
better to allow properly tracked open bug reporting sooner rather than later?


Regards Terry


Re: [PHP-DEV] Optimizer+ bugreps

2013-03-02 Thread Terry Ellison

On 02/03/13 09:34, Pierre Joye wrote:


Having it in PECL right now allows that. But as of now it is not a 
PHP.net project so it makes little sense to have it listed there.


On Mar 2, 2013 10:33 AM, Terry Ellison te...@ellisons.org.uk wrote:


At what point is O+ bug reporting going to be possible through
https://bugs.php.net/ ?

I realize that this is a bit of a catch-22, but surely it would be
better to allow properly tracked open bug reporting sooner rather than
later?



Thanks Pierre, I understand and that's why I mentioned catch-22. AFAIK, 
there's no open bug and issue reporting available prior to its formal 
adoption, even though we all realize that it's going to be pretty much 
inevitable -- for compelling reasons -- and by the time it is adopted 
the first release will be a fait accompli.


Re: [PHP-DEV] Optimizer+ bugreps

2013-03-02 Thread Terry Ellison

On 02/03/13 17:42, Christopher Jones wrote:



I realize that this is a bit of a catch-22, but surely it would be
better to allow properly tracked open bug reporting sooner rather than
later?



Bugs can (and have been) reported via 
https://github.com/zend-dev/ZendOptimizerPlus/issues

I'm sure email reports will also do fine in the interim.

I guess this is a case of "Duh, my bad."  Zeev gave the github URI in 
his initial announcement.  I should have done a 1+1 ...


Thanks Chris, sometimes an "=2" is very useful :oops:

//Terry


Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()

2013-02-25 Thread Terry Ellison

On 03/02/13 15:27, Hans-Juergen Petrich wrote:
In this example (using php-5.4.11 on Linux) the memory will grow 
non-stop:


for ( $fp = fopen('/dev/urandom', 'rb'); true;) {
    eval ('$ano_fnc = function() {$x = "'.bin2hex(fread($fp,
        mt_rand(1, 1))).'";};');

    echo "Mem usage: ".memory_get_usage()."\n";
}


But in this example not:

for ( $fp = fopen('/dev/urandom', 'rb'); true;) {
    eval ('$ano_fnc = function() {$x = "'.bin2hex(fread($fp,
        1)).'";};');

    echo "Mem usage: ".memory_get_usage()."\n";
}

Hans-Juergen, I've raised a bugrep https://bugs.php.net/bug.php?id=64291 
which you might want to review and add any appropriate comments.  I had 
to think about this one. It's worthwhile observing that this second 
example is the only occasion, as far as I know, that PHP does any 
garbage collection of code objects before request shutdown.  For example 
create_function() objects are given the name \0lambda_N where N is the 
count of the number of created functions so far in this request.  They 
are registered in the function table and persist until request 
shutdown.  That's the way PHP handles them by design.
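
You can watch that persistence from userland with a trivial sketch (the loop 
body here is arbitrary):

    <?php
    // Each call registers another "\0lambda_N" entry in the function table,
    // so memory climbs for the life of the request.
    for ($i = 0; $i < 3; $i++) {
        $f = create_function('$x', 'return $x * 2;');
        echo $f($i), ' ', memory_get_usage(), "\n";
    }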


As I said in the bugrep, the normal behaviour of persistence is what you 
want because if you think about the normal use of the anonymous 
function, say


while (!feof($fp)) {
    $line = preg_replace_callback(
        '|<p>\s*\w|',
        function($matches) {return strtolower($matches[0]);},
        fgets($fp)
    );
    echo $line;
}

Then the anonymous function is compiled once and rebound to the closure 
object which is passed as the
second argument for the callback each time through the loop.  OK, doing 
the closure CTOR/DTOR once per loop is not the cleverest of ideas and 
this is the sort of thing that would be hoisted out of the loop in a 
language which did such optimization, but PHP doesn't. It's a LOT better 
than compiling a new function each loop (which is how Example #1 on 
http://php.net/manual/en/function.preg-replace-callback.php does it!)   
This is what you want to happen.
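
Hoisted by hand, that would look something like this (just a sketch; $fp is 
assumed to be an open file handle as above):

    <?php
    // The Closure is created once, outside the loop, and reused on each pass.
    $callback = function ($matches) { return strtolower($matches[0]); };

    while (!feof($fp)) {
        echo preg_replace_callback('|<p>\s*\w|', $callback, fgets($fp));
    }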


It's just too complicated for PHP to work out if the function might or 
might not be rebound to.  I suspect the bug here is really the implicit 
assumption that the magic function name generated by the


eval ('$ano_fnc = function() { ... }');

is unique, but as your examples show, thanks to garbage collection and 
reuse of memory, sometimes it isn't.  In these circumstances, thanks to 
the use of a hash update and the table DTOR, the old one is deleted.


So I'd say that what you are doing is exploiting a bug, and my advice is 
not to do this.  It might be fixed in a future release.


Regards Terry


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-22 Thread Terry Ellison

Rasmus, thanks for your detailed response.

NFS is so common for sharing files that ...

This is simply not true. I do have a fair bit of experience in this
field, and I don't know of any major sites that do this and I have
worked with a good chunk of the largest sites out there.

Eh???  Fortune 500 enterprises and governmental departments are pretty
conservative.  NAS and SAN based iSCSI and FCoE based elastic block
storage give great performance for server-specific file-systems, but
Brendon is right: for distributed file systems, NFS and CIFS still
dominate.

By major I meant traffic-wise, not Fortune-500, although there are some
of those on the list too. I mostly work with medium-to-large scale
Internet companies. Think Yahoo, Facebook, Flickr, Digg, Etsy, WePay,
Room77. These types of companies would never consider serving all their
Web traffic from NFS. Yes, Yahoo had a ton of Netapp filers as well, but
this was for shared data storage, they would never consider putting
their application logic on them.
Now I agree with you: for this sector of Internet B2C companies, their 
business is centred around a small number of apps that dominate their 
revenue streams, so of course they are free to design their 
infrastructure architecture to optimize the performance of these apps. I 
also accept that this sector was and is directly or indirectly the major 
funder of PHP development effort.


However, my counterpoint is that this is no longer the only 
infrastructure usecase for PHP.  Now mature, it has entered other 
sectors, and Brendon's and Daniel's posts highlight two of them:


 * Enterprise use as Brendon raises.  Enterprises have moved to use
   internet based technologies to automate internal business
   processes.  These apps work on the company intranet, not on the
   internet. So when you book a car or go to your bank or order a part
   from a manufacturer, the assistant may well be sitting in front of a
   PHP app that never sees the internet but is still core to that
   business.  Thanks partly to the flexibility of cloud resources, CIOs
   and CTOs are increasingly looking at open technologies such as PHP to
   replace MS ones.  Incidentally IMO, it's this sort of business stream
   that will provide hard funding to value-add companies such as Zend.

 * The hosting service providers as Daniel raises.  In terms of sheer
   numbers this is the largest community of PHP users who buy their
   +/- $100pa service from a hosting provider.  They still care about
   performance.  The providers care about the efficiency of their
   infrastructure.  They are (initially) using PHP because Wordpress,
   Mediawiki, ... are written in it. But this is also a major entry
   vehicle for a new generation of PHP developers to get an initial
   internet presence.  If PHP runs 3x slower than language X, and X is
   just as flexible then we are putting up unnecessary barriers to
   their entry and turning away that new cadre.


This is also something that has been like this for 10+ years and nobody
has stepped up to fix it so far. It shouldn't be news to anyone that
stats and opens over NFS are slow. I am not sure why it should suddenly
be an urgent problem for us at this point. But like I said, we may get
to it.
It's not suddenly urgent, but perhaps this is more a question of hitting 
a tipping point where it might now be wise to address this issue.

  If the integrated opcode cache happens it becomes easier to
manage the flow between the compiler, the cache and the executor and we
can probably optimize some things there.

+1

And as I mentioned in another thread, let's see some RFCs proposing how
to fix some of these things rather than simply posting "I wish the PHP
devs would do this..." type of messages. These go over really badly with
most of the longtime contributors here and they even tend to have the
opposite of the desired effect.
As I have posted separately, I forked and then rewrote APC to address 
this sweet spot.  OK, my LPC is very much bug-ridden alpha code that 
fails 10% of the PHP test suite, largely due to extension 
interoperability issues, and I've had other things to do this last month 
-- including deciding whether to switch to a proper O+ delta. However, 
my aim was for me to use this as an evaluation test bed, not a serious 
production contender.  However, now that I've written an opcode cache 
which runs MediaWiki under php-cgi (with ~5% of the NFS getattrs, BTW), 
rolling some key tweaks into the Zend compiler, execution environment 
and APC -- which I understand well, so it should be straightforward -- or 
O+ -- which I don't as yet -- is the obvious next step.


My challenge is deciding (i) do I work on PHP 5.6 / 5.7 and the 
corresponding beta APC version, which at current rates of adoption might 
begin to have an impact in the community sometime in the next 5 
years, or (ii) work on a performance patch to the stable APC version 
which is typically installed with PHP 5.3 which these guys could apply 
within a few months.


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-22 Thread Terry Ellison

On 22/02/13 11:20, Ferenc Kovacs wrote:


My challenge is deciding (i) do I work on PHP 5.6 / 5.7 and the
corresponding beta APC version which at current rates of adoption
might begin to have an impact in the community sometime in
the next 5 years, or (ii) work on a performance patch to the
stable APC version which is typically installed with PHP 5.3 which
these guys could apply within a few months.


or contribute those patches back and integrate them into the vanilla apc?
Humm.   I think that we are sort of saying the same thing, but at cross 
purposes. Of course I should offer up any patches for mainstream APC and 
at best these will go into 3.1.14 or 3.1.15 and may then get adopted 
sometime for production systems whenever -- that's only if the release 
of a core O+ doesn't drop APC into legacy status.


However Ubuntu 12.04-LTS is a good example of a stable production stack 
and this uses PHP 5.3.10 and APC 3.1.7.  Debian Squeeze is even further 
behind and it runs PHP 5.3.3 and APC 3.1.3.


A performance patch could also be made available based on the last 
stable version of APC, say 3.1.9 -- that is before the attempts to 
support the new PHP 5.4 features destabilised it.  With this patch, 
at least individual system admins would have the option to download a 
stable version from PECL, patch it, and use it with their production stacks 
within the next 3-6 months.


Regards
Terry


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-22 Thread Terry Ellison

On 19/02/13 01:30, Kevin Yung wrote:

In our environment, we use NFS for shared storage, we are using APC as well
with stat=0. In our setting, we are also experiencing a high number of stat()
calls on our file system. My initial finding of this problem is we enabled
the open_basedir setting. And there is already a bug report for this,
https://bugs.php.net/bug.php?id=52312

We tested the issue in 5.2.x, 5.3.x and 5.4.x, all of them experiencing
same issue.
Kevin, I've just walked through this in 5.3 and 5.4 and updated this 
bugrep.  In short there is some silly coding here which should be 
addressed.  Even if we accept that PHP should comply with 
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-5178 if 
open_basedir is set, then the cache should only be ignored on the actual 
open itself, as this is the only one that is exploitable, but let's have 
this debate on the bugrep.  Let me think about the security and other 
NFRs and propose a patch.


Re: [PHP-DEV] Give the Language a Rest motion (fwd)

2013-02-21 Thread Terry Ellison
Here is a counterpoint to that expressed by Lars.  Many if not most 
shared hosting providers don't offer PHP 5.4 yet.  Ditto many 
enterprises have yet to adopt it.  The main reason?  I think it's that 
old Backwards Compatibility issue that has been discussed heavily on 
this DL.


When major apps like MediaWiki break with a new release of PHP (see 
http://www.mediawiki.org/wiki/Compatibility, and this is quite typical), 
upgrading PHP versions represents a major headache for both hosting 
providers and larger enterprises that want to maintain standard 
infrastructure build templates, as each non-BC PHP upgrade represents a 
major cost in either loss of customer satisfaction or IT investment for 
little or no tangible business benefit.


New features are often nice for the app developer, so the result is that 
they then get used by apps development teams, and the provider or 
infrastructure team then has to manage the ripple effects on a complex 
matrix of permitted infrastructure configurations across hundreds or 
thousands of apps.


I am not saying that the PHP dev team should freeze PHP, but what I am 
suggesting is that the PHP team should also consider the compatibility 
impacts across versions so that enterprises and hosting providers who 
have adopted PHP can control their through-life maintenance costs.


There are things that the PHP team could do to help mitigate this issue 
-- for example producing standard templates so that, say PHP 5.3, 5.4 
and 5.5 based apps can coexist and perform (e.g. with APC or O+) on the 
*same* Apache2 (or nginx, ...) stack.


Change is good, but too much change too fast without regard to the cost 
consequence will ultimately alienate the CIOs and CTOs who set platform 
policies.



On 20/02/13 23:15, Lars Strojny wrote:

As a general reply: I’d like to disagree, and here is why. Yes, we should not 
let half baked features in but we need to add more and more features, also 
syntax wise. For three reasons:


  - Parity/expectations/me too: so you can do that in PHP as well
  - Expressiveness: allow better ways to express the same idea in more concise 
ways
  - Innovation: bring unexpected features to the language people didn’t even 
expect


Let’s recall a few of the latest additions:


  - 5.3: namespaces. Provided the foundation for awesome stuff like PSR-1, 
which in turn provides the foundation for the even more awesome stuff composer 
is.
  - 5.3: goto. A good thing we can do it. I'm not sure for what exactly but I 
am sure there is somebody out there :)
  - 5.3: Closures, huge thing for us, a matter of parity to other languages. 
Really changes the face of a lot of APIs (see e.g. Doctrine transactional(), 
the whole micro framework movement, React)
  - 5.4: Closures 2.0 with $this binding. Honestly, without it, Closures are a 
little meh. But it was good we waited and got it right.
  - 5.4: Short array syntax. A parity/culture issue.
  - 5.4: Traits, I am happy we got horizontal reuse right
  - 5.4: array dereferencing. Very small but useful. To me it feels more like a 
bugfix
  - 5.4: callable type hint. Small change with a huge impact
  - 5.5: Generators, also a matter of parity and a matter of awesomeness
  - 5.5: ClassName::class syntax. A really good improvement to the overall 
usability of namespaces. Just imagine how much shorter unit test setUp() 
methods will become


What we have on our list that, from my perspective, will sooner or later hit us:


  - Property accessors in some form or another: a lot of people seem to like it.
  - Annotation support: we would have a lot of customers for it.
  - Autoboxing for primitives. Allows us to fix a lot of problems in 
ext/standard.
  - Unicode. Obviously.
  - Named parameters. A recurring topic might be a topic worth digging deeper.
  - I'm positive the Generics discussion will arise at some point again.


… and these are just the changes on a syntax/semantics level, I'm not talking 
about all the awesome technologies (one of which you are working on) we need to 
integrate tighter and eventually bundle with core. I don’t believe we should 
let our users outgrow the language, quite the opposite, we should grow with our 
users and the broader web community, otherwise we will fail. PHP is nowadays 
used for tasks it never was intended to be used but that’s a good thing. We 
need to continuously adapt. What’s true for software projects is true for 
languages: stop improving actually reduces its value over time.

cu,
Lars




Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-21 Thread Terry Ellison

On 21/02/13 23:38, Rasmus Lerdorf wrote:

On 02/21/2013 03:15 PM, Brendon Colby wrote:


NFS is so common for sharing files that saying "Wow, people are still
serving web files over NFS?" is like saying "Wow, people are still
using the ls command to list directory contents on Linux?" I think NFS
is still very widely used, even for sharing web files.

This is simply not true. I do have a fair bit of experience in this
field, and I don't know of any major sites that do this and I have
worked with a good chunk of the largest sites out there.
Eh???  Fortune 500 enterprises and governmental departments are pretty 
conservative.  NAS and SAN based iSCSI and FCoE based elastic block 
storage give great performance for server-specific file-systems, but 
Brendon is right: for distributed file systems, NFS and CIFS still 
dominate.



I don't think the appropriate answer is "don't use NFS" because this
is ridiculous as a long term solution (NFS is common, and people are
going to use it or something similar). I think the appropriate answer
is to update PHP to use stat vs. open+fstat or doing something similar
that would be optimized for both local AND shared file systems (I
would be writing a patch instead of this email if I could).

If it is of such importance to you and you are not able to do it
yourself, then hire someone to do it. We may or may not get around to
it, but like most things in PHP, we work on what we need ourselves and I
don't think anybody here would even consider putting all their PHP files
on an NFS share when performance was important.
Again wrong.  Apps developers don't do this because they want to; they 
do it because the IT services group that runs the production 
infrastructure has mandated standard templates for live deployment to 
keep the through-life cost of providing the application infrastructure 
manageable, with all sorts of bureaucratic exception processes if you 
think that your app is a special case.  If you are lucky then they offer 
a range of EC2-like standard VM templates so that you can deploy an 
EBS-based approach, but most are still in catch-up mode compared to 
Amazon's offerings.


If you are a GM, an American Airlines, or the NIH, for example, then you 
will have 1,000s of applications, customer-facing and internal, and you've 
got to adopt this approach for 90+% of them.


What Brendon is asking for is reasonable, sensible and in relative terms 
easy to implement.  However, I agree that it may be more sensible to use 
community effort to achieve this.


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-20 Thread Terry Ellison

On 20/02/13 08:26, Stas Malyshev wrote:
That depends of what your error handlers do. Some may write to log 
files, etc. if not configured properly (since error_reporting setting 
doesn't have to be considered in it). IIRC, for most of the cases O+ 
should be able to resolve all includes/requires on cached files 
without syscalls - but file_exists is a different matter since caching 
it in generic case can be very dangerous. 
I am not suggesting caching file_exists but rather encouraging coding 
patterns which avoid its use /if/ the application is intended to give 
good cached performance -- e.g. apps like mwiki, wordpress and drupal.



I guess that I should bite the bullet and switch to 5.5.  I've been
working on an evaluation fork of APC optimized for CLI/CGI which tackles a
lot of these issues head on and performs reasonably well, but I realise
that this is a dead-end and will never get deployed, but I am currently
considering regressing some of this technology into 5.5 and O+.  Are you
interested in a version of O+ which supports all SAPIs?

I think right now O+ can support CLI (provided enable_cli is set) but
for most cases it's kind of useless since scripts are rarely re-included
in common CLI scenarios. So if you know how to improve common CLI
scenarios it may be interesting, though I imagine it's not the most
common use case. But if it adds there without problems for anything
else, why not.
Let me get a dev stack for 5.5 and O+ up and then I can comment on 
this.  As far as APC's support for CLI goes -- the cache doesn't connect 
to the apache2 SMA even if you run the CLI process under the same UID, and 
since the SMA gets released when the process count drops to zero, caching 
doesn't really do anything: the cache is discarded between CLI 
executions, so in practice you always run uncached.


As to the frequency of the use case, maybe not in the case of CLI, but 
CGI is still tremendously common, as shared hosting providers still need 
to implement UID-based separation of scripting environments for 
different user accounts.  There are still significant scaling issues 
with php-fpm in this use case, so only a few SHPs offer FastCGI support, 
and then only as a premium option.  BTW, I saw that you recommended in 
one of your blog posts that you should use a bytecode cache if you care 
about performance.  However, a lot of app maintainers who use a shared 
hosting service do care about performance but don't have this as an 
option :-(


For example with my cli-based LPC opcode cache, I can do a series of  
make test runs (1) with the opcache disabled; (2) with it enabled but 
the cache empty and (3) with it enabled and the cache(s) primed.  
Ideally all three (*) should give the same test results, and this gives 
you confidence that the cache is working properly.  I would like to do 
this with APC or O+, but in practice AFAIK, I can't do (3) so how do I 
or the developers know what PHP and extension features aren't being 
supported / working properly when the code is cached?


(*) With the tests as currently written, some fail with case (3) because 
the runs are missing compiler warnings which are only generated if the 
code isn't cached (and in case of my cache at its current build quite a 
few fail because it still doesn't play nicely with other extensions like 
phar.)


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-20 Thread Terry Ellison

On 20/02/13 23:52, Brendon Colby wrote:

Terry,

Thanks for your detailed input. This is essentially what I did in my test setup.

APC is definitely critical to the performance of our site. I'm just
very curious to see how much impact the high number of getattr
operations coming from the Apache/PHP servers is having on the filer.
It looks like we might be able to move our PHP files to local storage
to at least measure the impact this is having.

Brendon


Brendon,

I think that PHP is a great platform; it's just that PHP's sweet spot is 
just a little off the optimum for typical scalable enterprise 
infrastructures.  IMO, the main issue that you should consider and 
discuss with your infrastructure support team is how to mitigate the 
impacts for such high-throughput business critical applications. As you 
(and Rasmus previously, IIRC) mention one key issue is what is on local 
and what is on network attached storage.


On one system that I got hauled in to troubleshoot, it turned out that 
the main problem was that the PHP session data was being written to a 
directory on shared storage, causing write-through overload that ended up 
hitting half a dozen key apps.  It was a trivial configuration change to 
move it to local storage (sketched below), and it transformed 
performance.  (I apologise if I am teaching you how to suck eggs, but 
this is really for others tracking this thread.)  The main point here is 
that the file hierarchies which must keep their integrity across the 
application tier need to be on the NAS; everything else is a matter of 
convenience vs performance.  So, for example, having a cron job to rsync 
the app's PHP hierarchy onto local storage might well transform your app 
performance and give an indirect boost to co-hosted apps.  Happy hunting :)
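
For the session example, the change can be as small as the following sketch 
(the path is illustrative -- any local, non-NFS directory writable by the web 
server will do):

    // keep session files on local disk rather than on the shared NFS mount;
    // this must run before session_start()
    ini_set('session.save_path', '/var/tmp/php_sessions');
    session_start();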


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-19 Thread Terry Ellison
The point that this thread highlights is that apps developers / 
administrators at both ends of the scale -- the enterprise and the 
shared service user -- normally have little say over the infrastructure 
architecture on which their application runs.  In both these cases the 
infrastructure will be hosting 100s if not 1000s of apps, and the 
environment cannot be tailored to any one app.  So issues like storage 
architecture and security architecture (e.g. UID-based enforcement of 
app separation) are a given as NFRs (non-functional requirements). Use 
of NFS with server / storage separation is still a standard 
implementation design pattern, when enterprises require simple-to-manage 
scalability on these tiers.  We aren't going to change this, and neither 
will the O/P.


Surely PHP will achieve better penetration and the greater acceptance if 
it can offer more robust performance in this sort of environment?

Sure, but then you can go with something like Redis.

But, again, if you go back to the original question, this has nothing to
do with often-changing data in a couple of PHP include files:

   We have several Apache 2.2 / PHP 5.4 / APC 3.1.13 servers all serving
   mostly PHP over NFS (we have separate servers for static content).

So they are serving up all their PHP over NFS for some reason.





Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-19 Thread Terry Ellison

On 19/02/13 09:36, Terry Ellison wrote:
The point that this thread highlights is that apps developers / 
administrators at both ends of the scale -- the enterprise and the 
shared service user -- normally have little say over the 
infrastructure architecture on which their application runs.  In both 
these cases the infrastructure will be hosting 100s if not 1000s of 
apps, and the environment cannot be tailored to any one app.  So 
issues like storage architecture and security architecture (e.g. 
UID-based enforcement of app separation) are a given as NFRs 
(non-functional requirements). Use of NFS with server / storage 
separation is still a standard implementation design pattern, when 
enterprises require simple-to-manage scalability on these tiers.  We 
aren't going to change this, and neither will the O/P.


Surely PHP will achieve better penetration and the greater acceptance 
if it can offer more robust performance in this sort of environment?

Sure, but then you can go with something like Redis.

But, again, if you go back to the original question, this has nothing to
do with often-changing data in a couple of PHP include files:

   We have several Apache 2.2 / PHP 5.4 / APC 3.1.13 servers all serving
   mostly PHP over NFS (we have separate servers for static content).

So they are serving up all their PHP over NFS for some reason.




Brendon,

Just to follow up with a bit more detail, apart from the obvious NFS 
tuning with things like the actimeo mount parameters, you can get a 
better idea of what is going on if you use a local copy of one of your 
apps running under a local linux apache server.  This is an example from 
my Ubuntu laptop, but it works just as well on a test VM if you still 
use WinXX on your PC.


   trace_wget () {
      sleep 1
      coproc sudo strace -p $(ps -C apache2 --no-headers -o pid) -tt -o /tmp/strace$1.log
      sleep 1; ST_PID=$(ps -C strace --no-headers -o pid)
      wget "http://localhost/$2" -O /dev/null -o /dev/null
      sleep 2; sudo kill $ST_PID
   }
   # start Apache in debug mode
   sudo service apache2 stop
   sudo bash -c ". /etc/apache2/envvars; coproc apache2 -X"

   # trace three gets
   trace_wget 0 phpinfo.php
   trace_wget 1 mwiki/index.php?title=Main_Page
   trace_wget 2 mwiki/index.php?title=Main_Page

   # restart normally
   sudo service apache2 stop
   sudo service apache2 start
   sudo chmod 777 /tmp/strace?.log

   grep -c "open(" /tmp/strace?.log
   grep -c "stat(" /tmp/strace?.log

The first get is just to load and prime the mod_php5 thread (Ubuntu has 
a crazy localtime implementation), the second loads and compiles the PHP 
MediaWiki modules needed to render a page, and the third repeats this now 
fully cached in APC.  In this case, we have:

                         open()   fstat()/lstat()
    /tmp/strace1.log      108         650          (Priming the APC cache)
    /tmp/strace2.log       27         209          (Using the APC cache)

So APC *does* materially reduce the I/O calls, and (if you look at the 
traces) it removes most of the mmaps and compilation.  The MWiki 
autoloader uses require() to load classes, but in the code as a whole the 
most common method of loading modules is require_once() (414 uses, as 
opposed to 44 in total of the other include or require functions), and as 
I said in my previous post, this I/O is avoidable by recoding the 
ZEND_INCLUDE_OR_EVAL handler to work cooperatively with the opcode cache 
when present.


Note that even when the cache does this, it still can't optimize coding 
patterns like:


if ( file_exists( "$IP/AdminSettings.php" ) ) {
    require( "$IP/AdminSettings.php" );
}

and IIRC mwiki does 4 of these on a page render.  Here a pattern based 
on (@include($file) == 1) is more cache-friendly, as sketched below.
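
Something along these lines (a sketch only: a successful include of a file 
with no explicit return statement returns 1, and a missing file makes @include 
return false, so the separate file_exists() stat can be dropped):

    if ( ( @include "$IP/AdminSettings.php" ) == 1 ) {
        // AdminSettings.php was found and loaded
    }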


Now this isn't a material performance issue when the code tree is 
mounted on local storage (as is invariably the case for a developer test 
configuration), as the stat / open overhead of a pre-(file-)cached file 
is microseconds, but it is material on scalable production 
infrastructure which uses network-attached NFS or CIFS file systems.


So in summary, this is largely avoidable but unfortunately I don't think 
that we can persuade the dev team to address this issue, even if I gave 
them the solution as a patch :-(


Regards
Terry Ellison


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-19 Thread Terry Ellison

On 19/02/13 20:32, Stas Malyshev wrote:

Hi!


and IIRC mwiki does 4 of these on a page render.  Here a pattern based
on (@include($file) == 1) is more cache-friendly.

This goes back to the fact that @ does not really disable errors - it
only disables reporting of the errors, but the whole message is
generated and goes all the cycle up to the actual error reporting before
being suppressed. I've tried to address this problem a number of times
but looks like there's really no way to do it without sacrificing some
parts of current error reporting such as track_errors.
Yup, in this scenario we're trading off two things.  What the pattern is 
saying is "if the source file exists then compile it".  However, if the 
file exists in say 90+% of the cases and the code is usually cached, then 
the file_exists check generates I/O that isn't needed when the file is 
already APC (or O+ / xcache) cached.  Yup, if the file is missing then 
the error is generated and trapped in the reporting cycle, but the final 
result is that the include returns false and the false == 1 test fails.  
Other than the wasted cycles in running up this stack, there aren't any 
other side-effects that I am aware of.  Are there any?


Yes, this is an overhead, but it is small beer compared to doing a 
getattr() RPC across the server room fabric to a NAS server backend.

So in summary, this is largely avoidable but unfortunately I don't think
that we can persuade the dev team to address this issue, even if I gave
them the solution as a patch :-(

Could you explain what you're talking about here?
The fact that you are engaging in this dialogue is great and I really 
appreciate it.  What I am still trying to work out is how to interact 
with you guys in a way that is mutually productive.  I did make the 
mistake of choosing the mainstay stable version that live prod systems 
use -- 5.3 -- to take this whole issue apart, and I guess that the dev 
team regards this as history.


I guess that I should bite the bullet and switch to 5.5.  I've been 
working on an evaluation fork of APC optimized for CLI/CGI which tackles a 
lot of these issues head on and performs reasonably well, but I realise 
that this is a dead-end and will never get deployed, but I am currently 
considering regressing some of this technology into 5.5 and O+.  Are you 
interested in a version of O+ which supports all SAPIs?


In architectural terms I feel that having a universal cache option is 
important.  It changes the mindset from something which is only used in 
niche performance use cases to a standard option.  It also means that 
the cache can be stress tested by the entire php test suite.  However, 
to do some of this you don't start with a patch, but with an RFC 
informed by evidence, and that's my real reason for doing this demonstrator.


//Terry


Re: [PHP-DEV] PHP causing high number of NFS getattr operations?

2013-02-18 Thread Terry Ellison

On 18/02/13 21:47, Brendon Colby wrote:

On Mon, Feb 18, 2013 at 4:32 PM, Damien Tournoud d...@damz.org wrote:

Assuming that those are relative includes, can you try with:

   apc.canonicalize=0
   apc.stat=0

Paths are absolute. stat=0 (and canonicalize=0 just to try it)
produced the same result.

Brendon



Brendon, are your scripts doing a lot of include_once / require_once 
calls?  If you look at ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() in 
Zend/zend_vm_execute.h then you will see that this Zend handler does 
zend_resolve_path() for any xxx_ONCE instructions and zend_stream_open() 
on the first request of the same.  Yes, APC rehooks this handler with a 
wrapper, but this is only for lazy loading.  When it honors the xxx_once 
instructions, it will still open the streams even if the code itself is 
fully cached in APC and the I/O is entirely nugatory.  I suspect that 
this could generate the NFS traffic that you are discussing.


This would be easy to avoid, but it would require replacing this 
handler entirely or doing dynamic code patching, neither of which APC 
currently does, I believe.  Incidentally, because this is a Zend feature 
and nothing directly to do with APC, O+ will also have this runtime 
characteristic.


//Terry


Re: [PHP-DEV] Zend Optimizer+ Source Code now available

2013-02-15 Thread Terry Ellison

On 15/02/13 01:59, Stas Malyshev wrote:

(A) The op-code optimization should be integrated into the core compiler
and enabled through a GC(compiler_option) to be available to *any*
opcode cache -- or to the application designer (by exposing these
options through an INI directive.

Most optimizations would not give perceivable benefit unless the
optimized code is run many times. So enabling it without opcode cache
would not produce very big benefit. But yes, in theory these code parts
can live separately and they don't actually need each other.
My point here is that APC and the other opcode caches would also benefit 
from these optimizations; they really belong to the compiler and not to 
an opcode accelerator.  Also yes, this type of optimisation usually has 
no net benefit unless  #runs / compile is much greater than one, however 
there are scenarios (e.g. some nasty iteration intensive batch process) 
where taking the hit on optimisation still produces a net runtime saving.

that support it).  A Zend opcode cache belongs firmly in the Zend world
and shouldn't be a PHP extension.

I must say I don't understand this conclusion.
Put simply, PHP extensions should only reference the APIs exposed in the 
php headers.  Zend has its own interface and extensions, and since a Zend 
opcode cache is SO intimately coupled with the Zend environment it makes 
sense to use a Zend extension to implement this.  The whole idea of 
opcode caching just isn't relevant to a HipHop environment, or even for 
that matter a CLR or Java one, since these do their own internal caching 
anyway.



I also note some interesting difference in approaches between O+ and
APC, say for example:

1) in the handling of the ZEND_INCLUDE_OR_EVAL instruction where APC
patches the op handler and O+ does peephole opcode examination.  Both
these workarounds could be avoided by tidying up the scope of work in
the instruction code.

Could you explain this a bit more?
As I posted in a separate note to this list, I have been developing my own 
LPC extension, a fork of APC optimized for CLI and CGI use.  The dogma 
here is that there is no point in opcode caching because performance 
isn't a driver.  Well, my experience is that performance is always a 
driver.  For example, just because your application is running on a 
shared service where UID-enforced security must be applied, and therefore 
some form of scalable per-UID execution environment must be used, doesn't 
mean that you don't care about performance.  LPC still gives a ~2x 
throughput improvement on MediaWiki, for example.


I forked and rewrote this cache extension because I couldn't stretch the 
APC code to do what I want in a performant manner without breaking it.  
It's a demonstrator to help me understand the real issues here and to 
get to grips with the minutiae of PHP caching technologies.  I just 
don't see this ever getting to production, but I must admit that 
regressing some of this technologies onto O+ is an option that interests 
me because this does hold out the potential of a standard optimiser 
extension that supports all mainstream SAPIs.


Now to ZEND_INCLUDE_OR_EVAL.  My rationale here is that if the admin has 
specified a stat=0 (or equivalent) option then this is a statement that the 
cache's content and metadata can be trusted, so there is no point in 
examining or reading source files.  However, in the case of the xxx_once 
variants of this instruction, 
ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() does path resolution and, in 
the case of first load, opens the source stream.  Why?  Why not just 
leave examination of the source stream to the compile function?  LPC 
(like APC, though for different reasons) has to rehook the 
ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER() handler with a wrapper that 
does dynamic code modification to prevent this access occurring.  O+ 
adopts a different strategy.


LPC or APC having to walk through and essentially patch over (private) 
static constant op_handler vectors is just plain horrible.  No, IMO the 
Zend architecture should be designed to support caching; it should 
present a proper clean interface that extensions such as O+ and APC 
implement; and the entire PHP test suite should be capable of being 
executed with a cache extension in place, and a good cache should not 
introduce any test failures.


Sorry for the rant and I hope that I answered your Q :)




Re: [PHP-DEV] Zend Optimizer+ Source Code now available

2013-02-14 Thread Terry Ellison

On 14/02/13 18:24, Stas Malyshev wrote:

Are optimizations documented?

Not yet AFAIK.

No, but they are pretty self-explanatory.  O+ is a _Zend_ extension 
rather than a _PHP_ extension, and this enables it to exploit extra 
hooks (see the tail of ZendAccelerator.c); specifically, follow 
through accel_op_array_handler() and the routines in the Optimizer 
subdirectory.  Essentially this hook is invoked as an epilogue to the 
generation of any op_array.  What it does is a number of peephole 
optimisations to simplify and NOP out instruction sequences, and the 
last pass compresses the code, removing dead NOPs to shrink the op_array 
-- this is typical of the sorts of things that the optimization passes 
of a compiler would do.


And this is a segue into one architectural issue that immediately struck 
me on scanning the code: surely there is a natural domain separation 
between (i) compilation, (ii) image startup/rundown, and (iii) execution.  
(i) is optimally done once per S/W version, (ii) once per request, and 
(iii) once per instruction executed.  Surely O+ is currently a hybrid of 
(i) and (ii), and whilst this might have occurred for understandable 
historical reasons, I question this rationale going forward.


(A) The op-code optimization should be integrated into the core compiler 
and enabled through a GC(compiler_option) to be available to *any* 
opcode cache -- or to the application designer (by exposing these 
options through an INI directive).


(B) The O+ opcode cache itself is logically quite separate.  It makes 
great sense to keep this as a Zend extension (given the desire from 
some of the dev team to maintain a clear logical separation between the 
upper PHP environment and the Zend, HipHop, ... execution environments 
that support it).  A Zend opcode cache belongs firmly in the Zend world 
and shouldn't be a PHP extension.


I also note some interesting difference in approaches between O+ and 
APC, say for example:


1) in the handling of the ZEND_INCLUDE_OR_EVAL instruction where APC 
patches the op handler and O+ does peephole opcode examination. Both 
these workarounds could be avoided by tidying up the scope of work in 
the instruction code.


2) in the treatment of early-binding class inheritance: APC includes some 
reasonably complex logic to back this out; O+ sets a compiler option to 
disable this inheritance occurring in the first place, an approach that 
APC might want to copy.
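
A minimal contrast of the two cases (class names invented for illustration) -- 
it is the compile-time binding in the first form that an opcode cache has to 
back out or suppress:

    class Base {}
    class EarlyBound extends Base {}          // parent already declared, so the
                                              // inheritance can be bound at compile time

    if (!class_exists('OptionalBase', false)) {
        class OptionalBase {}                 // declared only conditionally
    }
    class LateBound extends OptionalBase {}   // parent unknown when this declaration
                                              // is compiled, so binding happens at runtime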


Re: [PHP-DEV] double val to long val conversion issue

2013-02-10 Thread Terry Ellison

On 10/02/13 06:50, Stas Malyshev wrote:

isn't the case with visualC, and PHP internal data structures compiled
with visualC and gcc are significantly different; for example hash keys
are 32 bits long on Windows and 64bits on *nix.  Why aren't they 32bits,

Yes, they are different, because long size is different,
This is my point: programmers from the *nix world tend to assume that longs 
are longer than ints.  In the MS world they are synonymous.  This is one 
reason why code that has been developed in the *nix world can be difficult 
to get working reliably in the MS world.

and PHP uses
long (more specific, ulong) to store hash values. This is because
numeric values are long,
I don't follow this reasoning.  Numeric value in this context is a PHP 
(application-space) concept.  Hashes are internal to the Zend EE and are 
never exposed to the PHP programmer, so this is a case of comparing 
apples and pears.  All that having a 64-bit hash does (when used % 
nTableSize, which is smaller than maxint) is to add 8 bytes to every Bucket 
entry stored by the EE for no practical benefit.


(It's 8 bytes because ulong h; uint nKeyLength; padded to a long boundary 
takes 16 bytes, but uint32 h; uint nKeyLength; takes 8, because of 
data alignment.)



and it's easier to use the same type for both
than bother with converting back and forth. As you noted, the difference
for hashing is minimal since the value is anyway brought to hash size,
and hashes of size more than 32-bit LONG_MAX aren't very practical.
However, it matters in other parts of the code.





Re: [PHP-DEV] [RFC] Integrating Zend Optimizer+ into the PHP distribution

2013-02-09 Thread Terry Ellison

On 29/01/13 08:03, Zeev Suraski wrote:

Following the discussion at the end of last week, I prepared a draft RFC
for the inclusion of Optimizer+ in PHP.

In parallel we’re in the process of prepping the source code for
independent public consumption, which I hope we can be done with by the end
of next week, hopefully sooner.


It's great news that Zend Technologies has decided to open-source 
Optimizer+, and given that it is now "the end of next week" I look 
forward to seeing this code any day.  So thanks to you for this 
decision.  But now to specific comments on your RFC.


1.  Scope of the RFC.

IMO, the RFC covers four separate issues that would be easier to review, 
refine and agree if they were kept separate:


a.  Zend's decision to open-source O+.  This is entirely within Zend 
Technology Inc. and outside the scope of any RFC.


b.  The establishment and proper architecture and support of an 
opcode-cache interface within the Zend Execution Engine (EE).  I will  
discuss this below.


c.  The decision to include Optimizer+ as a core extension within the 
PHP project.  However as at the time of this draft only Zend employees 
-- and selected Zend-approved 2nd parties who have signed the 
appropriate NDAs -- have access to the Optimizer+ source and are 
therefore able to review its content.   Surely such open access is a 
precondition, and it makes no sense to issue an RFC to inform this 
decision until at least a few months after the source has been made 
widely available for review.


d.  The project decision to give any specific opcode-cache extension a 
preferred status over the alternative opcode-caches.  Such a decision 
is going to be contentious and -- unless carefully, transparently and 
fairly managed -- could lead to conflict within the project.  Not good.  
So I would suggest that the RFC limit itself to non-contentious claims 
relating to one optimizer performance over another.


2.  The Detailed Content

The Introduction will need redrafting depending on the proposed / 
revised scope of this RFC.


Some form of definition / description of both a PHP opcode-cache and PHP 
data-cache needs including in the PHP wiki, but this would sit better 
under the https://wiki.php.net/internals hierarchy.  This RFC should 
simply wiki-link to this page on the first use of [[opcode cache]].


The "Interaction with other extensions and plugins" section is surely a 
general statement of requirement that should apply to _any_ opcode 
cache and not just Optimizer+, so again this content belongs in a separate 
Wiki document with a wiki-link here.


The "Alternatives" section is really a comparison of APC and Optimizer+, 
and I suggest that some points are contentious.  The same point applies 
to the remaining sections.  Surely this sort of comparison only becomes 
necessary when we've reached a stage where we are asking voters to choose 
a preferred cache, and in that case wouldn't it be more appropriate to 
agree the selection / assessment criteria first before carrying out a 
selection exercise?


3.  Why do I suggest an Opcode-Cache interface RFC?

The current Zend 2.x engines provide some hooks which enable the main 
opcode caches -- including Optimizer+ and APC -- to deliver accelerated 
performance for many application use cases.  However, some aspects of 
hooking an opcode cache into the Zend EE remain somewhat of a 
compromise.  These include:


a.   The management of early vs. late binding and the work-arounds that 
opcode caches must do to back-out unwanted early binding.


b.   Some essential functions that the caches must hook into are not 
exposed as hooks (like zend_compile_file) and are sometimes implemented 
using static functions, leading to the cache needing to reimplement 
chunks of Zend code.


c.   There should be a clear scoping separation of what the (cached) 
compile does and what the EE does.  An example of where this is mixed is 
in the ZEND_INCLUDE_OR_EVAL_xxx_HANDLER functions which resolve paths 
and open source files in the case of the xxx_once functions.  This file 
access is usually unnecessary in the case of cached files as the op-code 
cache has already cached the relevant information.


Given that opcode caches are now core to PHP performance, it should be 
possible to implement a cache using hooks and interfaces exported 
through a Zend header file and without recoding bits of the engine. 
Optimizer+ should be an exemplar of such an approach.


Regards Terry Ellison


Re: [PHP-DEV] double val to long val conversion issue

2013-02-09 Thread Terry Ellison

On 09/02/13 15:47, Pierre Joye wrote:

hi Remi

On Sat, Feb 9, 2013 at 4:10 PM, Remi Collet r...@fedoraproject.org wrote:

About
http://git.php.net/?p=php-src.git;a=commitdiff;h=79956330fe17cfd5f60de456497541b21a89bddf
(For now, I have reverted this fix)

Here some explanations.

LONG_MAX is 9223372036854775807 (0x7fff)
double representation of LONG_MAX is 9223372036854775808

(d > LONG_MAX) is evaluated in double space.
So is false for double which have the same value than (double)LONG_MAX.

So, for (double)LONG_MAX the cast used is
 (long)d

9223372036854775807 on ppc64
9223372036854775808 on x86_64 (gcc without optimization)
9223372036854775807 on x86_64 (gcc -O2)

PHP expected value is 9223372036854775808
(Btw, I don't understand why PHP, build on x86_64, with -O2, gives the
good result, some environment mystery)

Obviously, we could have different result on different platform,
compiler, architecture.

I will be very interested by result on other platform (mac, windows),
compiler (Visual C), architecture.

If we switch to the unsigned cast:
 (long)(unsigned long)d;
Any comments ?

IIRC, on windows/visualC, no matter if it is x86 or x64, long is
always 32bits, so it won't change the size of long.
See http://en.wikipedia.org/wiki/LLP64#64-bit_data_models for a good 
description of this mess.  AFAIK many packages that target both 32- and 
64-bit environments, MS and *nix, explicitly define XXX_int32, 
XXX_uint32, XXX_int64, XXX_uint64, ... datatypes and use wrappers to map 
these onto the appropriate visualC / gcc types.  As far as I can see, 
PHP doesn't, and seems to use long and int almost interchangeably, which 
causes problems as LP64/I32LP64 and LLP64/IL32P64 are very different.  
This is one reason for 64-bit support on Windows being problematic.
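
The difference is visible even from userland on the PHP 5.x builds being 
discussed; a trivial check using only the standard constants:

    // an LP64 Linux x86_64 build of PHP 5.x reports int size 8 and
    // PHP_INT_MAX = 9223372036854775807; an LLP64 Windows x64 build reports
    // 4 and 2147483647, because the engine's integers are C longs
    var_dump(PHP_INT_SIZE, PHP_INT_MAX);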


It would be good for PHP to have a road map to remove data 
model-specific potholes, say by 5.6 or 5.7.


//Terry


Re: [PHP-DEV] double val to long val conversion issue

2013-02-09 Thread Terry Ellison

On 10/02/13 03:25, Stas Malyshev wrote:



these onto the appropriate visualC / gcc types.  As far as I can see,
PHP doesn't and seems to use long and int almost interchangeably which

PHP indeed does not use fixed-size types in zvals, etc. but it
definitely does not use long and int almost interchangeably. In almost
any place where int is used instead of long or vice versa (unless it is
a specific small value that is nowhere near limits of either int or long
and used in very restricted context) - it is a bug and should be fixed.
If you know of such places, please name them or even better, submit a
bug report pointing them out.
Stas, you are right to correct me.  Sorry.  However, I still feel that 
the implicit assumption is that sizeof(long) == 2*sizeof(int), and this 
isn't the case with visualC, and PHP internal data structures compiled 
with visualC and gcc are significantly different; for example, hash keys 
are 32 bits long on Windows and 64 bits on *nix.  Why aren't they 32 bits, 
say, on both? (as there is no performance benefit in having 64-bit hash 
keys when the maximum size of a hash table is an int).


[PHP-DEV] A quick intro for Terry Ellison and my extension work

2013-02-06 Thread Terry Ellison
I've posted a few times to this list and may do so some more, so as a 
courtesy I thought that I should give a short intro on me and what I am 
doing here.  By way of personal background, I retired early from HP 
five years ago due to ill-health, though I have since recovered.  I was a 
Distinguished SE, for what that was worth, and have been programming 
C/C++ and PHP for a LONG time.  Now I am a gentleman of leisure (AKA old 
fart) and do it for pleasure only.


One thing about PHP that has always puzzled me was that we've never 
developed a good opcode caching solution for CGI and CLI use as the 
various opcode caches are add-on extensions rather than PHP core and 
really targeted for high-volume single UID environments.  I see two main 
problems with this:


1.  Multi-account service providers must employ UID based mandatory 
separation of processes, and no other shared approach meets even minimal 
security requirements.


2.  Whilst opcode caching is only an option for some PHP 
runtime environments and not all, it is not properly architected into 
the PHP compile and run environment, and hence the opcode caches -- by 
necessity -- seem to include quite a few clumsy workarounds to be able 
to run.  Opcode caching, or at least a clear interface to such caches, 
should be a PHP core feature for all execution modes.


What I've been doing for the last six months is to develop a Lightweight 
Program Cache (LPC) extension optimised for CGI and CLI use to get a 
better understanding of the problems and possible approaches to address 
these.  LPC started life as a fork of APC, but has turned into a 
stripped-down rewrite.  I see this primarily as a demonstrator and a 
vehicle for my really getting to grips with the internals of the PHP 
compiler and execution engine.  If anyone is interested or wants to get an 
overview of how PHP code caches work, have a read of:


https://github.com/TerryE/php-extensions/blob/master/lpc/TECHNOTES.txt

I doubt that it will ever be anything more than beta code, so I don't 
need a VCS account or any other karmas.  Nonetheless it does essentially 
work -- it roughly doubles the throughput of MediaWiki with no memory 
leaks.  However, it is a long way away from being at a level where I 
would suggest anyone pull the repository and play with it.  I am still 
at Zend 2.3 and I still need to roll in the extra 2.4 functional 
changes.  It fails the PHP test suite for some extensions, and I am 
currently tracking down a bunch of issues from the php5/Zend/test 
failures (yup -- one advantage of running in CLI mode is that I can test 
the extension against the entire PHP test suite in two passes -- one to 
build the per-script caches and a repeat to run against the cached version).


Anyway, regards to you all, Terry Ellison


Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()

2013-02-05 Thread Terry Ellison

On 04/02/13 10:57, Ángel González wrote:
snip

The memory will stop growing (on my machine) at ~2491584 bytes and the
loop is able to run forever,
creating each eval() furthermore uniqe ano-function's but not
endless-filling Zend-internal tables.


but this still leaves the function record itself in the
function_table hash so with a non-zero reference count and this
doesn't get DTORed until request shutdown

Not familar with the Zend-internals but just about so i was imaging
and expecting it.

That why i [still] also confused/wondering why in the 2nd example the
memory will not grow *endless*.
It seems that the function records in the function_table will be
DTORed (or similar cleaned up) before request-shutdown at some point...

Could this be the case?

As you are reassigning $ano_fnc, the old closure is being destructed.
Had you used create_function(), it wouldn't happen.
Now the question is, if it is correctly freeing the functions (and it is
good that it does so), why is it not doing it when they have different
lengths?
It's a bug.  The Closure class DTOR does not delete the dereferenced 
function from the CG(function_table).


If you did the eval at line 20 in say /tmp/xx.php then the 
INCLUDE_OR_EVAL instruction calls the Zend compiler with the args:

  (1) the source to be compiled and
  (2) the title /tmp/x.php(20) : eval()'d code

The compiler then gives the closure function a magic name:

\0{closure}/tmp/x.php(20) : eval()'d code0x

where 0x is the hex address of the function substring in the 
evaluated string.  The compiler uses a zend_hash_update to insert this 
into the CG(function_table).


What happens if you use a fixed-length string replacing another string 
of the same length (dropping its refcount to 0) is that the allocator is 
clever and will tend to reallocate the old one; hence the address of the 
string is the same, the address of the function substring at its offset 
is the same, and so it regenerates the same magic name -- pretty much as 
an accidental side-effect.  When this happens, it's this hash update 
function that calls the DTOR on any pre-existing function with this name.


I simply put  a breakpoint on the relevant line in the 
zend_do_begin_function_declaration() code and if you used a fixed offset 
into the same string you only got one {closure} entry.  If the 
allocation ended up randomizing the address, then the {closure} 
entries grew  until memory exhaustion.


As I said -- interesting.  Need to think about the consequences before I 
submit a bugrep.

Regards
Terry


Re: [PHP-DEV] (non)growing memory while creating anoymous functions via eval()

2013-02-04 Thread Terry Ellison

Hi Terry and all
thank you very much for your response.

The only thing that confused me about what you say that the second 
*doesn't* grow
Yes, about that i was [and am still :-)] also confused... why the 2nd 
one won't grow *non-stop*



so I checked and it does -- just the same as the first.

Right, it grows, but not non-stop as in the 1st one.

The memory will stop growing (on my machine) at ~2491584 bytes and the 
loop is able to run forever,
creating each eval() furthermore uniqe ano-function's but not 
endless-filling Zend-internal tables.


but this still leaves the function record itself in the 
function_table hash so with a non-zero reference count and this 
doesn't get DTORed until request shutdown
Not familar with the Zend-internals but just about so i was imaging 
and expecting it.


That why i [still] also confused/wondering why in the 2nd example the 
memory will not grow *endless*.
It seems that the function records in the function_table will be 
DTORed (or similar cleaned up) before request-shutdown at some point...


Could this be the case?

OK, Hans-Jürgen, this one has got me interested. I am developing a fork 
of APC optimized for cgi and cli use -- but that's a different topic -- 
though understanding the DTOR processes for compiler objects interests 
me because of this.  I'll go through the code, and especially the Closure 
handling, to understand why.  However, thinking through this logically:


1.  The fact that the second does stop growing means that the 
reassignment of the global $ano sets the RC of the previous closure 
object to zero, triggering the DTOR of the lambda function.


2.  There is something pathological about the first case which is 
frustrating garbage collection on the lambda function DTOR.


I replaced your inner loop by:

$len = $argv[1] & 1 ? $argv[2] : mt_rand(1, $argv[2]);
$str = '"' . bin2hex(fread($fp, $len)) . '"';
if ($argv[1] & 2) $str = "function() {\$y = $str; };";
eval ("\$x = $str;");
echo "Mem usage: " . memory_get_usage() . "\n";

to allow me to use arg1 to select one of the four test cases 0..3, with 
arg2 being the (max) string size, n say.  This clearly shows that the 
memory explosion only occurs if the string is allocated *inside* the 
lambda function.


4.  If you substitute n > 15 then memory growth rapidly stabilises for PHP 
5.3.17, but it still explodes for n < 14.


5.  In the case of PHP 5.4.6 a similar effect occurs, except that the 
explosion occurs at n < 11.


6.  The fact that 5.3 and 5.4 are different is notable -- however, the 
fact that 5.4 is still (eventually) stable for n > 12 means that this 
isn't a string interning issue.


Interesting.  Merits more research :-)


Re: [PHP-DEV] [RFC] Integrating Zend Optimizer+ into the PHP distribution

2013-01-31 Thread Terry Ellison

On 30/01/13 00:54, Rasmus Lerdorf wrote:

On 01/29/2013 04:47 PM, Stas Malyshev wrote:

Hi!


which shows the dreaded zend_optimizerplus.inherited_hack which mimics
APC's autofilter hack. I'd love to get rid of this particular bit of
confusion/code complexity on the integration.

Ohh, this one. IIRC that has to do with conditional definition of
classes and the fact that script may be compiled in one environment but
loaded in another, which may create difference in class tables,
especially combined with early binding for inherited classes. Getting
rid of it is not that easy until people stop writing code like:
if($foo) return;
class Foo extends Bar {}
which would work differently depending on if Bar is defined or not.

Yes, I am quite familiar with it since we had to handle this in APC too.
But I don't think getting rid of it is that hard. It obviously can't be
done in the opcode cache because by the time the compiler hands us the
op_array we have already lost the FETCH_CLASS opcode which we may or may
not need. We need to look at whether that MAKE_NOP() call in the
compiler is actually a worthwhile optimization in a future where most
people will be running an opcode cache by default.

This is one of the prime examples of making the compiler more opcode
cache friendly. Yes, it may be at the slight expense of non-opcode cache
performance, but with a bundled opcode cache implementation that should
be less of a worry.
+1.  This one makes no sense to me, as it simply hoists the 
zend_do_inheritance() from runtime binding to compile time, and this 
binding has to be backed out by any opcode cache to work properly.  It 
might save a few microseconds per class declaration in the non-cached 
case, but it costs factors more in the cached case.  Why do this?