Hi Richard,

Richard Quadling wrote:
On 13 April 2010 14:53, Dmitry Stogov <dmi...@zend.com> wrote:
Hi,

I've published all the patches, their description and performance evaluation
at http://wiki.php.net/rfc/performanceimprovements

In short, the patches give a 0-20% improvement even on real-life
applications.

I'm going to commit them into trunk in a week in case of no objections.

Of course, they are binary incompatible. Some extensions (especially
VM-dependent ones, e.g. APC, xdebug, etc.) will have to be modified to
support the changes.

Thanks. Dmitry.

Zeev Suraski wrote:
Hi,

Over the last few weeks we've been working on several ideas we had for
performance enhancements. We've managed to make some good progress.  Our
initial tests show roughly 10% speed improvement on real world apps.  On
pure OO code we're seeing as much as 25% improvement (!)

While this still is a work in progress (and not production quality code
yet) we want to get feedback sooner rather than later. The diff (available
at http://bit.ly/aDPTmv) applies cleanly to trunk.  We'd be happy for people
to try it out and send comments.

What does it contain?

1) Constant operands have been moved from being embedded within the
opcodes into a separate literal table. In addition to the zval, it contains
pre-calculated hash values for string literals. As a result PHP uses less
memory and doesn't have to recalculate hash values for constants at
run-time.
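To make the idea concrete, here is a minimal C sketch of what a literal-table entry might look like. The names (`literal`, `make_literal`) are hypothetical, not the engine's; only the hash function matches what PHP's HashTable actually uses (DJBX33A):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical literal-table entry. In the real engine the zval itself
 * would live here too; this sketch keeps only the string and its hash. */
typedef struct {
    const char   *str;
    size_t        len;
    unsigned long hash;   /* pre-calculated once, at compile time */
} literal;

/* DJBX33A - the "times 33" hash PHP uses for string keys. */
static unsigned long inline_hash_func(const char *str, size_t len)
{
    unsigned long hash = 5381;
    while (len--) {
        hash = ((hash << 5) + hash) + (unsigned char)*str++;
    }
    return hash;
}

static literal make_literal(const char *s)
{
    literal l;
    l.str  = s;
    l.len  = strlen(s);
    l.hash = inline_hash_func(s, l.len);  /* computed once, reused at run-time */
    return l;
}
```

Every opcode that references the constant then carries only an index into this table, so hash-table lookups on the constant reuse the stored hash instead of recomputing it.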

2) Lazy HashTable buckets allocation – we now only allocate the buckets
array when we actually insert data into the hash for the first time.  This
saves both memory and time as many hash tables do not have any data in them.
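A minimal C sketch of the lazy-allocation pattern described above (the struct and function names are illustrative, not the engine's): the bucket array stays NULL until the first insert, and every insert pays the readiness check Dmitry's wiki note mentions.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified hash table: arBuckets is allocated lazily. */
typedef struct {
    void  **arBuckets;       /* NULL until the first insertion */
    size_t  nTableSize;
    size_t  nNumOfElements;
} HashTableSketch;

static void ht_init(HashTableSketch *ht, size_t size)
{
    ht->arBuckets      = NULL;   /* no allocation yet - saves memory and time */
    ht->nTableSize     = size;
    ht->nNumOfElements = 0;
}

static int ht_insert(HashTableSketch *ht, size_t slot, void *data)
{
    /* The trade-off: each insertion must first check whether the
     * bucket array has been allocated. */
    if (ht->arBuckets == NULL) {
        ht->arBuckets = calloc(ht->nTableSize, sizeof(void *));
        if (ht->arBuckets == NULL) {
            return 0;
        }
    }
    ht->arBuckets[slot % ht->nTableSize] = data;
    ht->nNumOfElements++;
    return 1;
}
```

Tables that are created but never written to (a common case in the engine) never pay for the bucket array at all.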

3) Interned strings (see
http://en.wikipedia.org/wiki/String_interning).
Most strings known at compile-time are allocated in a single copy with
some additional information (pre-calculated hash value, etc.).  We try to
make most incarnations of a given string point to that same single version,
allowing us to save memory, but more importantly - run comparisons by
comparing pointers instead of comparing strings and avoid redundant hash
value calculations.

A couple of notes:
a.  Not all of the strings are interned - which means that if a pointer
comparison fails, we still fall back to a string comparison; but if it
succeeds, that's conclusive.
b.  We'd need to add support for this in the bytecode caches. We'd be
happy to work with the various bytecode cache teams to guide how to
implement support so that you do not have to intern on each request.
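The pointer-first comparison from note (a) can be sketched in a few lines of C. This is an illustration of the general technique, not the engine's actual comparison routine:

```c
#include <assert.h>
#include <string.h>

/* If both strings are interned, equal pointers imply equal strings.
 * If the pointers differ we cannot conclude anything (one side may not
 * be interned), so fall back to an ordinary byte comparison. */
static int str_equal(const char *a, size_t alen,
                     const char *b, size_t blen)
{
    if (a == b) {
        return 1;   /* interned fast path: no hashing, no memcmp */
    }
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

The fast path costs one pointer comparison; the slow path is exactly what the engine did before, so interning can only help, never hurt correctness.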

To get a better feel for what interning actually does, consider the
following examples:

// Lookup for $arr will not calculate a hash value, and will only require
a pointer comparison in most cases
// Lookup for "foo" in $arr will not calculate a hash value, and will only
require a pointer comparison
// The string "foo" will not have to be allocated as a key in the Bucket
// "blah" when assigned doesn't have to be duplicated
$arr["foo"] = "blah";

$a = "b";
if ($a == "b") { // pointer comparison only
 ...
}

Comments welcome!

Zeev

Patch available at: http://bit.ly/aDPTmv


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Firstly, I'm a total novice in terms of C code.

My comment relates to the following line in the wiki ...

"On the other hand it'll have to check if the HashTable was
initialized on each new element insertion."

So, in PHP code, I assume it looks something like ...

<?php
$Hash = null;

function addToHash(&$Hash, $Value) {
    // Is the hash ready?
    if (is_null($Hash)) {
        $Hash = array();
    }

    // Add the value.
    $Hash[] = $Value;
}

addToHash($Hash, 'First');
addToHash($Hash, 'Second');
?>

Each time a value is added, the hash is checked to see if it is ready
to receive the value.

That makes sense.

Rather than testing to see if the hash is ready, is it
possible/appropriate/feasible to have something like this ...

A function reference/pointer which points to function2 at compile time
- I suspect each hash would need this.
function1, which adds to the hash.
function2, which initializes the hash, calls function1 to add the
value, and then replaces the reference to function2 with a reference to
function1.

That way, there is no additional testing. Essentially the start point
is a wrapped function1 which gets unwrapped on the first use.
Thereafter the unwrapped version is always used.
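In C, Richard's self-unwrapping idea might look like the sketch below. All names here are hypothetical; the point is just the mechanism: the first call goes through an initializing wrapper, which swaps the function pointer so every later call takes the direct path with no readiness check.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct table table;
typedef void (*insert_fn)(table *, int);

struct table {
    insert_fn insert;   /* starts as the initializing wrapper */
    int      *data;
    size_t    count;
};

/* The "unwrapped" fast path: no readiness check at all. */
static void insert_fast(table *t, int v)
{
    t->data[t->count++] = v;
}

/* The wrapper: runs once, initializes, then replaces itself. */
static void insert_first(table *t, int v)
{
    t->data   = malloc(16 * sizeof(int));  /* one-time initialization */
    t->insert = insert_fast;               /* unwrap for all later calls */
    insert_fast(t, v);
}
```

Every insertion then goes through `t->insert(t, v)`, i.e. an indirect call - which is precisely the cost Dmitry's reply below weighs against a plain branch.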

I use this sort of thing in PHP with closures and in JavaScript. The
common use is to run some code based upon a value (say like a switch
statement). Once known, running the switch statement each time is
redundant.

Is something like this feasible?


As I already answered in a different email, on most modern CPUs an indirect call is more expensive than a comparison and conditional jump. Even in PHP, an indirect call would be slower than if (isset($arr)) {} else {}.

Thanks. Dmitry.

