REMOVE

----- Original Message -----
From: Marcin Babij [mailto:marcin.ba...@nasza-klasa.pl]
To: internals@lists.php.net,apc-...@lists.php.net
Sent: Mon, 31 Jan 2011 12:35:53 +0100
Subject: Re: [PHP-DEV] Re: Zend engine's hashtable performance tweaks

Hello,
I've managed to change my patch to merge against trunk, there were some
problems with interned strings optimization. I've created bug report with
new patch:
http://bugs.php.net/bug.php?id=53866

I've changed hashing function again, it looks like working better than
previous, but nothings proven.

What's to be done (this is listed also in bug report):
-
http://lxr.php.net/xref/PHP_TRUNK/ext/standard/html_tables/html_table_gen.php#722
should be changed and html_tables.h regenerated, but this will need to
rewrite hashtable engine from C to PHP,
- APC should be fixed.


> Hi Marcin,
>
> Can you mail the PHP Internals list and also the apc-...@lists.php.net
> list when you have logged the bug?  I mentioned your patches to Gopal,
> the APC maintainer, and he was interested to see what was needed.
>
> Regards,
>
> Chris
>
>
> On 01/11/2011 04:28 PM, Marcin Babij wrote:
>> Hello,
>> I merge patch with trunk, do some things already proposed, and I'll
>> create bug report with patch for trunk in 1, 2 weeks most.
>>
>>>
>>>
>>> On 12/31/2010 10:15 AM, Pierre Joye wrote:
>>>> hi,
>>>>
>>>> Thanks for the patches :)
>>>>
>>>> Can you open a bug report please (and attach the patches to it)? I'm
>>>> sure this patch will be updated a couple of times before it reaches
>>>> the repository.
>>>>
>>>> Cheers,
>>>
>>> Hi Marcin,
>>>
>>> Did you log a bug for this?
>>>
>>> Regards,
>>>
>>> Chris
>>>
>>>>
>>>> On Fri, Dec 31, 2010 at 4:58 PM, Marcin Babij
>>>> <marcin.ba...@nasza-klasa.pl> wrote:
>>>>> Sorry for no attachments in previous message, I think my attachments
>>>>> weren't
>>>>> redirected with message by lists.php.net email confirmation system.
>>>>> I send
>>>>> them again, and for sure I attach links to public copy of them over
>>>>> HTTP:
>>>>> https://gist.github.com/761094 -
>>>>> php-5.3.4-hashtable-optimization.patch
>>>>> https://gist.github.com/761096 -
>>>>> apc-3.1.6-hashtable-optimization.patch
>>>>>
>>>>>> Hello,
>>>>>> I work for social network company, where we were running
>>>>>> optimization
>>>>>> project. One of it's results is patch to Zend engine's Hashtable,
>>>>>> which we
>>>>>> want to share and ask you for comments and improvements.
>>>>>>
>>>>>> Why we do this?
>>>>>> We run profiling on our production servers and found out that
>>>>>> zend_hash_*
>>>>>> functions take 10-20% CPU time of request. So there is some room
>>>>>> for easy
>>>>>> improvements.
>>>>>>
>>>>>> What was done?
>>>>>> - Hash function in zend_hash.h was rebuilt and became much faster,
>>>>>> without
>>>>>> losing the most important properties.
>>>>>> - Hashtable implementation was changed from Simple chaining to Open
>>>>>> addressing with linear probing, but with linked bucket, not
>>>>>> included in
>>>>>> hash array, which causes:
>>>>>> -- Bucket structure to lose 2 pointers.
>>>>>> -- Searching works similar, but don't have to jump with pointers
>>>>>> stored in
>>>>>> different memory locations, inserting, deleting and rehashing don't
>>>>>> need
>>>>>> to update linked list, but must search for first empty bucket,
>>>>>> which is
>>>>>> fast, because it scans continuous memory.
>>>>>> -- Load factor decreases from 1.0 to 0.5-0.75 to make less
>>>>>> collisions and
>>>>>> faster hashtable, which in turn increases memory footprint a little.
>>>>>> - Open addressing doesn't change significantly performance, but
>>>>>> next thing
>>>>>> was to create new array (arEmpty), which is of size nTableSize
>>>>>> bytes,
>>>>>> which keeps track of used/empty buckets and makes inserting and
>>>>>> rehashing
>>>>>> much faster. In future it can be tested as bit-array with size of
>>>>>> nTableSize/8 bytes.
>>>>>> - More macros were added to replace repetitive constructs.
>>>>>> - New constants were added to allow:
>>>>>> -- Creating new hashtables of size at least X (where 4 and 8 are
>>>>>> reasonable), which makes no rehashing and reallocing memory while
>>>>>> changing
>>>>>> size to 2 and then to 4.
>>>>>> -- For small tables it's better to extend them by a factor of 4
>>>>>> times, not
>>>>>> 2, to make rehashing cost smaller for most hashtables, of cost of
>>>>>> little
>>>>>> higher memory consumption.
>>>>>> -- For large tables it's better to have other load factor, closer
>>>>>> to 1,
>>>>>> while for small tables it's better to use load factor closer to 0.5.
>>>>>> - APC was patched to take changes in Bucket structure into account.
>>>>>>
>>>>>> How was it tested?
>>>>>> It was tested with make test, where one more (comparing to original
>>>>>> sources) test fails, but it's most probably because
>>>>>> http://bugs.php.net/bug.php?id=48858 - IMO test is badly
>>>>>> constructed (is
>>>>>> too simple) and any change of hashing function makes it fail. Also
>>>>>> it was
>>>>>> tested on our testing environment and production servers
>>>>>> against>30mln
>>>>>> requests to our site, with 120requests/s at peak on Xeon @ 2.50GHz
>>>>>> with
>>>>>> 8GB RAM running Debian Linux.
>>>>>>
>>>>>> What is the gain?
>>>>>> After tests CPU usage dropped by about 4% -6%.
>>>>>> Memory footprint goes up just by few percent.
>>>>>>
>>>>>> What can be done in future?
>>>>>> - Make new constants configurable by php.ini.
>>>>>> - Test if changing arEmpty from byte-array to bit-array helps on
>>>>>> performance.
>>>>>> - Tweak default constants' values using some real-live benchmarks.
>>>>>> - Prove (or modify and prove) hash function to have property, that
>>>>>> it has
>>>>>> no collisions if two keys don't differ on no more than 6 bytes,
>>>>>> which will
>>>>>> lead to memcmp omit first (or last) 6 bytes of key. Also simpler
>>>>>> thing may
>>>>>> be proven, that is it has no collisions if two keys are not longer
>>>>>> than 6
>>>>>> bytes, which will make most string keys omit memcpy at all.
>>>>>>
>>>>>> The patch was created and tested against php-5.3.0, apc-3.1.3p1,
>>>>>> then
>>>>>> merged with php-5.3.4, apc-3.1.6 without conflicts, and for these
>>>>>> last
>>>>>> versions patches are attached. Also, it shouldn't conflict with
>>>>>> http://wiki.php.net/rfc/performanceimprovements .
>>>>>
>>>>> --
>>>>> PHP Internals - PHP Runtime Development Mailing List
>>>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>
> --
> Email: christopher.jo...@oracle.com
> Tel:  +1 650 506 8630
> Blog:  http://blogs.oracle.com/opal/

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php


-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to