I've updated the patch with few use cases 
https://gist.github.com/dstogov/429fcc2ba051fdcf774a310c5d6db00d

The patch doesn't show any visible speed difference, but in term of "CPU 
instructions" (measured by callgrind) it makes 0.3% regression on 100 requests 
to Wordpress home page.

This is caused by keeping HASH_FLAG_STRING_KEYS/HASH_FLAG_LONG_KEYS consistency.


So, I'm not so optimistic about this now...


Please, think if we need this, and if it's possible to make this better.


Thanks. Dmitry.

________________________________
From: Benjamin Coutu <ben.co...@zeyos.com>
Sent: Wednesday, October 19, 2016 5:31:01 PM
To: Dmitry Stogov; Xinchen Hui; Nikita Popov; Joe Watkins
Cc: PHP Internals
Subject: Re: [PHP-DEV] Exploit fully packed array/hash property

Hi Dmitry,

there is a typo in line 78 of your patch 
(https://gist.github.com/dstogov/429fcc2ba051fdcf774a310c5d6db00d#file-ht_flags-01-diff-L78).
 It should be HT_IS_PACKED(ht) instead of HT_IS_PACKED(HT).

Tahnks,
Ben

========== Original ==========
From: Dmitry Stogov <dmi...@zend.com>
To: Benjamin Coutu <ben.co...@zeyos.com>, Xinchen Hui <xinche...@zend.com>, 
Nikita Popov <nikita....@gmail.com>, Joe Watkins <pthre...@pthreads.org>
Date: Wed, 19 Oct 2016 14:53:31 +0200
Subject: Re: [PHP-DEV] Exploit fully packed array/hash property


The main API/BC changes implementation: 
https://gist.github.com/dstogov/429fcc2ba051fdcf774a310c5d6db00d

All tests passed. Performance is not affected (+1 CPU instruction on each *new* 
element insertion)

If it's OK and allowed, after committing this, I'll add few usages of these new 
defines for optimization.

All of them are going to be self-containing changes to particular functions 
implementation.


Thanks. Dmitry.

________________________________
From: Dmitry Stogov <dmi...@zend.com>
Sent: Wednesday, October 19, 2016 2:20:12 PM
To: Benjamin Coutu; Xinchen Hui; Nikita Popov; Joe Watkins
Cc: PHP Internals
Subject: Re: [PHP-DEV] Exploit fully packed array/hash property

Hi Benjamin,


I think this is great idea!

Let me check that can we get from this, and if we may add this into PHP-7.1 (it 
may be to late).


Thanks. Dmitry.

________________________________
From: Benjamin Coutu <ben.co...@zeyos.com>
Sent: Wednesday, October 19, 2016 1:45:00 PM
To: Xinchen Hui; Dmitry Stogov; Nikita Popov
Cc: PHP Internals
Subject: [PHP-DEV] Exploit fully packed array/hash property

Hello everyone,

I've identified a few more array/hash use cases where it might make sense to 
introduce special short circuit logic for packed arrays.

Specifically, there is an additional property of certain packed arrays (apart 
from being packed obviously) that we can utilize: A packed array with nNumUsed 
equal to nNumOfElements must be consecutively indexed from zero onward without 
any gaps (no IS_UNDEF). Let's call this special pure form of a packed array a 
"fully packed array".

Earlier I have posted on how this convenient property could speed things up for 
array_slice, even if preserved_keys=true (for the offset=0 case), see 
https://marc.info/?l=php-internals&m=147048569215717&w=2

I therefore propose to introduce the following two new macros in 
/Zend/zend_types.h in order to make it easier for developers to introduce 
special packed (or fully packed) array logic:

#define HT_IS_PACKED(ht) ((ht)->u.flags & HASH_FLAG_PACKED)
#define HT_IS_FULLY_PACKED(ht) (HT_IS_PACKED(HT) && (ht)->nNumUsed == 
(ht)->nNumOfElements)

With this we can easily speed things up (and more importantly preserve the 
packed array characteristics) for the common case of array_slice($packed_array, 
$offset=0, $length, $preserved_keys=true) by using the follwing snippet for 
https://github.com/php/php-src/blob/master/ext/standard/array.c#L2906 :

if (preserve_keys ? (offset == 0 && HT_IS_FULLY_PACKED(Z_ARRVAL_P(input))) : 
HT_IS_PACKED(Z_ARRVAL_P(input)))
  ... ZEND_HASH_FILL_PACKED ...

Another example where this could come in handy involves the encoding of JSON. 
For every array that it encodes, json_encode must first detemine wheter the 
array contains string keys or is not consecutively indexed, so that it can 
decide wheter to use a JSON object or a plain JSON array. That is accomplished 
through php_json_determine_array_type in /ext/json/json_encoder.c. That code 
currently has to iterate through the array until it finds a string key or a 
non-consecutive index (which in the worst case would mean iterating through the 
entire array).

Now, for fully packed arrays we can short circuit things and jump directly to 
the conclusion of it being a real plain old array, if it is packed and we can 
prove it has no gaps. That can of course easily be done via the new 
HT_IS_FULLY_PACKED macro. We can improve this code for a large amount of arrays 
with the following simple snippet for 
https://github.com/php/php-src/blob/master/ext/json/json_encoder.c#L38 :

if (HT_IS_FULLY_PACKED(myht))
  return PHP_JSON_OUTPUT_ARRAY;

I'll be happy to make an effort to screen the entire code base in search for 
more cases where this could be useful once this is picked up by a lead core 
developer (maybe Xinchen?) who is willing to commit something on the lines of 
the above.

Any thoughts?

Benjamin

--

Bejamin Coutu
ben.co...@zeyos.com

ZeyOS, Inc.
http://www.zeyos.com

Reply via email to