On Nov 10, 2008, at 5:24 PM, shire wrote:
It sounds like this would only work if the array contents were
static though, as you're mapping a constant string to the
contents of the hash (or did I misunderstand and you'd be
mapping string const. values to hash IDs?).
My point is, replacing this process:
$a['foo'] or $a->foo -> compute hash of 'foo' -> find item for
hash 'foo' -> many items? -> resolve conflict -> return item
With this process:
$a[% string_literal_id_5 %] -> lookup item key 5 for array ->
return item
Notice we skipped hash generation and conflict resolution
altogether. We only have the lookup for the integer id.
If some additional work is done, even this lookup can be
eliminated, making this an O(1) process.
If instead the coder used a variable:
$a[$bar] or $a->$foo (var array lookup and var var object lookup),
then this optimization can't kick in, and the existing algorithm
will be used.
However "static" access is the predominant usage, especially for
objects, but also for arrays, so this should have significant impact.
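To make the two paths concrete, here is a rough toy model in Python
(not engine code). Everything in it is an assumption made up for
illustration: LiteralTable stands in for a compile step that hands
out integer ids to string literals, and ToyArray keeps a per-array
side table from literal id to the stored cell. The "static" path
goes through the id; the "dynamic" path uses the ordinary keyed
lookup, and both see the same value.

```python
class LiteralTable:
    """Hypothetical compile-time table: one small integer id per string literal."""

    def __init__(self):
        self._ids = {}
        self._names = []

    def intern(self, literal):
        # Hand out the next free id the first time a literal is seen.
        if literal not in self._ids:
            self._ids[literal] = len(self._names)
            self._names.append(literal)
        return self._ids[literal]

    def name(self, literal_id):
        return self._names[literal_id]


class ToyArray:
    """Toy array with a keyed path and a literal-id path to the same cells."""

    def __init__(self, literals):
        self._literals = literals
        self._cells = {}   # string key -> one-element list holding the value
        self._by_id = {}   # literal id -> the very same cell object

    def set(self, key, value):
        # Reuse the existing cell so cached id -> cell links stay valid.
        self._cells.setdefault(key, [None])[0] = value

    def get_dynamic(self, key):
        # $a[$bar]: the existing algorithm, keyed by the string itself.
        return self._cells[key][0]

    def get_static(self, literal_id):
        # $a['foo']: try the id first; fall back to the keyed path once
        # and remember the cell for later accesses.
        cell = self._by_id.get(literal_id)
        if cell is None:
            cell = self._cells[self._literals.name(literal_id)]
            self._by_id[literal_id] = cell
        return cell[0]


literals = LiteralTable()
FOO = literals.intern("foo")   # done once, at "compile time"

a = ToyArray(literals)
a.set("foo", 42)
static_result = a.get_static(FOO)
dynamic_result = a.get_dynamic("foo")
```

Note that this sketch has no unset(); removing a key would also have
to drop the matching entry in the id table, or the static path would
keep returning a stale cell.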
Thanks for the clarification, this is pretty much the same idea as
what I've been interested in working on next. I think I was more
inclined to store an extra hash value within the zvals themselves,
with the hope that this could be expanded to non-constant values.
I believe Ruby implements its lookups this way (noted just for
reference, not as an argument to copy another language ;-) ). Any
thoughts on reasons not to do this (other than increasing the size
of the zval struct)? It's pretty simple to implement this for static
values, I believe; dynamic values are a lot more difficult, obviously...
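A minimal sketch of the "extra hash value stored with the value"
idea, again as a Python toy rather than real zval code. HashedValue,
toy_hash and the hash_computations counter are hypothetical names
used only here, and the multiply-by-33 hash is just a stand-in. The
point it tries to show is why dynamic values are harder: the cached
hash has to be thrown away whenever the string changes.

```python
def toy_hash(text):
    """Simple multiply-by-33 string hash, used purely as a stand-in."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


class HashedValue:
    """String value carrying one extra field: its lazily cached hash."""

    def __init__(self, text):
        self.text = text
        self._hash = None            # the extra field under discussion
        self.hash_computations = 0   # bookkeeping for this demo only

    def hash(self):
        # Compute on first use, then reuse until the value changes.
        if self._hash is None:
            self._hash = toy_hash(self.text)
            self.hash_computations += 1
        return self._hash

    def assign(self, text):
        # A dynamic value changed: the cached hash is no longer valid.
        self.text = text
        self._hash = None


key = HashedValue("foo")
first = key.hash()
second = key.hash()                      # served from the cached field
count_before_assign = key.hash_computations

key.assign("bar")                        # forces one recomputation
third = key.hash()
count_after_assign = key.hash_computations
```

For a constant string the hash is computed once and never again; for
a value that keeps changing, the cache buys nothing and the field is
pure overhead.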
Since nobody else has chimed in with the obvious (to me, anyways):
I've worked with some code that uses disgustingly huge (>512 MB)
arrays; the largest implementation was a single 2.5 GB array (before
we took the offending programmer into a room and had a... chat).
I'd be interested in seeing some metrics on the needed extra CPU
ticks for determining if an array (or array sub-element) is static or
dynamic under the scheme, as well as the extra memory for storing an
(many?) extra value(s). It sounds like it might be totally livable,
if done right... done wrong, we could be looking at millions of CPU
hits for checking millions of single element static arrays.... (and
yes, storing millions of values as single element arrays is "doing it
wrong", but I've learned not to underestimate the creativity of
people who write software).
Oh, and while we're at it, what about "re-assigning" "static" arrays?
The idea sounds good, the corner-cases on mis-implementations are
where it always becomes amusing.
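As a back-of-the-envelope for the memory question, the extra cost is
just the number of values times the size of the added field. The
figures below (10 million values, an 8-byte field) are assumptions
picked for illustration, not measurements of anything.

```python
def extra_bytes(value_count, field_size=8):
    """Extra memory if every value carries one more field of field_size bytes."""
    return value_count * field_size


assumed_values = 10_000_000      # assumed element count, not a measurement
overhead = extra_bytes(assumed_values)
overhead_mib = overhead / (1024 * 1024)
```

Real numbers would depend on struct padding and alignment, so this
only bounds the order of magnitude.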
-Bop
--
PHP Internals - PHP Runtime Development Mailing List