Hi Nuno,

Late reply, but I'm glad the idea was able to be used. :-)


To the other hash people: While checking how Ilia made implode() faster in
5.2, I think I found another optimization that can be done.  In
zend_hash_copy/merge, it seems the zend_hash_quick_* functions can be used
instead for associative entries, thus saving the key from being hashed
again, right?  I just checked, and it's definitely faster, how much so being
dependent on the key length of course...

Patches for this simple additional optimization (hopefully):
http://realplain.com/php/hash_quick_copy.diff
http://realplain.com/php/hash_quick_copy_5_2.diff


Thanks,
Matt


----- Original Message -----
From: "Nuno Lopes"
Sent: Wednesday, September 27, 2006

> Aaahh nice catch ;)
>
> >From a quick lookup in PHP sources I also found these functions were the
> same optimization can be applied:
> zend_default_exception_new_ex() - trivial
> convert_to_array() - not as easy as others
> zend_object_std_init() - possibly
> clone_wrapper_hash() - trivial (php-src/main/streams/streams.c)
> compiler_globals_ctor()
> ...
>
>
> Nuno
>
>
> ----- Original Message ----- 
> > Hi all,
> >
> > I noticed awhile ago how most every use of zend[_u]_hash_init has nSize
as
> > 0.  Of course it isn't always known how many elements will be added to
the
> > array (nTableSize), but there are places where an accurate value could
be
> > used instead of getting the minimum default (8).  For anyone who doesn't
> > know, HashTable sizes are a power of 2, and when there are more than
that
> > many elements, zend_hash_do_resize realloc's more memory to double the
> > table
> > size, and then zend_hash_rehash basically goes through and "reinserts"
all
> > the elements again.
> >
> > I ran some tests to see the speed improvement from specifying the
*actual*
> > number of elements in hash_init, thus saving that extra work (_rehash
> > mostly).  The "n+1" is with one more element (9, 17, 33...) that
triggers
> > another rehash when the actual number wasn't specified (most extreme).
(I
> > used add_next_index_zval, so the direct zend_hash* functions would give
> > slightly higher % I guess, being less overhead, right?)  Results with
> > 5.2.0-RC4:
> >
> >          Elements
> >  n   |   n      n+1
> > ---------------------
> > 8     |    0%   14.2%
> > 16    | 11.0%   20.9%
> > 32    | 13.5%   22.1%
> > 64    | 12.6%   21.3%
> > 128   | 11.7%   18.5%
> > 256   |  9.3%   16.4%
> > 512   |  8.6%   17.0%
> > 1024  |  7.9%   15.8%
> > 2048  |  4.8%   33.3%
> > 4096  |  7.8%   28.3%
> > 8192  | 10.2%   58.4%
> > 16384 | 24.1%   70.5%
> > 32768 | 34.5%   80.4%
> > 65536 | 34.8%   68.6%
> >
> > I haven't looked thoroughly, but the only place I've seen that uses an
> > non-zero nSize in hash_init (besides some in the core engine) is the
> > unserialize() function. :-/  It seems the most simple and obvious place
> > that
> > could be changed is in Zend/zend_variables.c:_zval_copy_ctor_func?  Just
> > replace
> >
> > zend[_u]_hash_init(tmp_ht, 0, ...
> > with
> > zend[_u]_hash_init(tmp_ht, original_ht->nNumOfElements, ...
> >
> > Other than that, some of PHP's array functions and array-returning
> > functions
> > (database fetch row functions, etc.) are ones I thought of that could be
> > optimized like this.  Maybe create an array_init_size() macro to be used
> > in
> > places where the number of elements can be easily determined?  Thoughts?
> > I'd volunteer to look for places that could be updated and modify them.
> > :-)
> >
> >
> > Thanks,
> > Matt
> >
> > P.S.  I guess the array stuff applies for objects also?  But I don't
know
> > much about their internals...

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to