This mail is quite long as I copied it from a conversation with Toon Verwaest.
Toon and Camillo re-implemented hashed collections for Pinocchio
(http://scg.unibe.ch/research/pinocchio). Their benchmarks have shown that the
new implementation is significantly faster than the one in Pharo 1.1. The
source code is available from www.squeaksource.com/p.html (they haven't set a
license, but I'm pretty sure they'll release their code as MIT).
I'm posting this here to find somebody knowledgeable in the area who could take
a look at the code and figure out whether this implementation would be
interesting for Pharo...
Cheers,
Adrian
Quoting Toon:
We ran some benchmarks on our data structures (in comparison with, and
benchmarked in, standard Pharo 1.1). We just add / remove integers within a
certain range, so the hashes are always perfectly distributed, which makes the
setup ideal for Pharo's dictionaries.
The numbers show how long each benchmark took on average over 20 runs, together
with the standard deviation over those runs. A single benchmark run generally
consists of many individual operations on the collection in question.
For example:
benchIncludes
	1 to: set size * 2 do: [ :i |
		set includes: i ]
where the set has size 10000. Evaluating this once counts as a single run. The
number at the top of each group below is the sum of the individual averages,
together with the corresponding standard deviation.
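For reference, one such benchmark can be timed roughly along these lines (a
minimal sketch, not the actual Pinocchio harness; the use of milliseconds here
is just for illustration):

	| set times mean stdev |
	set := Set new.
	1 to: 10000 do: [ :i | set add: i ].
	"time 20 runs of the benchIncludes loop"
	times := (1 to: 20) collect: [ :run |
		Time millisecondsToRun: [
			1 to: set size * 2 do: [ :i | set includes: i ] ] ].
	mean := (times inject: 0 into: [ :a :b | a + b ]) / times size.
	stdev := ((times inject: 0 into: [ :a :t | a + (t - mean) squared ])
		/ (times size - 1)) sqrt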
The results:
PDictionary: 15.8 +/- 1.7
    Do                0.472 +/- 0.079
    RemoveKey         0.010 +/- 0.021
    AtPut             0.947 +/- 0.02
    AtIfAbsentPut     1.64  +/- 0.96
    AtPutExisting     0.198 +/- 0.011
    KeysAndValuesDo   0.502 +/- 0.09
    IncludesKey       0.365 +/- 0.024
    AtPutNew          4.2   +/- 1.3
    Includes          7.40  +/- 0.33

Dictionary: 138.2 +/- 4.7
    Do                0.830 +/- 0.098
    RemoveKey        10.292 +/- 0.077
    AtPut             0.957 +/- 0.044
    AtIfAbsentPut     1.52  +/- 0.95
    AtPutExisting     0.203 +/- 0.011
    KeysAndValuesDo   0.850 +/- 0.096
    IncludesKey      80.25  +/- 0.25
    AtPutNew          3.5   +/- 1.3
    Includes         39.8   +/- 4.4

PSet: 1.767 +/- 0.057
    Remove            0.008 +/- 0.018
    Add               0.845 +/- 0.039
    AddExisting       0.310 +/- 0.021
    AddNew            0.240 +/- 0.021
    Includes          0.365 +/- 0.024

Set: 70.9 +/- 1.2
    Remove            7.737 +/- 0.051
    Add               0.755 +/- 0.022
    AddExisting       0.305 +/- 0.015
    AddNew            2.473 +/- 0.03
    Includes         59.6   +/- 1.2
Obviously the very slight differences have to be taken with a grain of salt,
but wherever something is more than 10x faster it is quite unlikely to be due
to random runtime variation that might change completely in another run.
An important property of our data structures is that they don't degrade, since
we don't suffer from colliding hashes; but we didn't really test that yet
(although it becomes obvious in some of the benchmark results, such as removing
elements and testing for the presence of elements).
> Is there anything special that one would need to consider when replacing the
> old implementation in Pharo with yours?
What doesn't happen at the moment is shrinking dictionaries after they have
grown significantly. I wouldn't know at what point to do so either; nor do I
think Dictionary does that?
There is a parameter indicating how many elements can be added before it
switches from a SmallDictionary version to a dictionary with buckets. This is
set to 20 by default, and I have no idea if it makes sense to change it; I
didn't really profile what the optimal number is. Maybe it makes sense to make
that a constant in the code rather than an instance variable to save a bit of
space and time ...
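Roughly, the check amounts to something like the following sketch (the
selectors and the 'threshold' variable name are invented here for illustration;
this is not the actual PDictionary code):

	checkSwitch
		"Once more than 'threshold' (20 by default) elements have been added,
		 abandon the flat, SmallDictionary-style layout and rehash the contents
		 into buckets. The threshold is currently an instance variable, but
		 could be a constant instead."
		self size > threshold
			ifTrue: [ self switchToBuckets ]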
We have our own PHashedCollection subclassing from Collection, so the biggest
part of the hierarchy is compatible with what Pharo has. While porting, the
Pinocchio-specific stuff will have to be removed, mostly annotations related to
Pinocchio primitives and export directives. This is all straightforward though.
The current "ratio" of hashes to elements is set to 500%. So basically when you
have 32 buckets it will only grow when it contains 160 elements (or key-value
pairs), so 5 elements per bucket on average. This seems to not make it slower
which is understandable since SmallDictionary is pretty fast for small amounts
of elements; but this ratio might have to be tweaked depending on the kind of
objects and is currently a parameter. The advantage of using 500% is that you
have A LOT less garbage.
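In code the grow rule boils down to something like this (again a sketch with
invented names, not the actual source):

	checkGrow
		"Grow only once there are 'ratio' elements per bucket on average.
		 With ratio = 5 (i.e. 500%), a table with 32 buckets grows when it
		 holds 160 elements."
		self size >= (buckets size * ratio)
			ifTrue: [ self grow ]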
Oh, and our dictionaries generate a lot less garbage, since Pharo's
dictionaries use an Association object for each key-value pair. We don't, so we
never generate garbage when you remove an object.
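To illustrate the difference (the slot layout here is just an example, not
necessarily the one PDictionary uses): Pharo's Dictionary allocates an
Association per entry, whereas an association-free bucket can keep keys and
values directly in the slots of a plain Array:

	| pharoStyle bucket |
	"Pharo style: each entry is an Association object (key -> value)"
	pharoStyle := #foo -> 42.
	"association-free style: key and value live in adjacent slots of the
	 bucket array; removing the entry just nils the slots, so no extra
	 object ever becomes garbage"
	bucket := Array new: 8.
	bucket at: 1 put: #foo.
	bucket at: 2 put: 42.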
Our Sets, on the other hand, do use a bit more memory because of the bucket
scheme. Pharo's Sets don't use associations either, so we don't have a memory
edge there :) only a performance one in that case.
The hashes are masked with the number of buckets - 1, since the number of
buckets is always a power of 2. This implies that the lower bits of the hash
are the ones that actually matter.
This does not work together with the current identityHash / IdentityDictionary,
since the 12-bit hash is shifted to the left by 18 (or so). That makes all the
lower 18 bits 0, so all objects end up in the first bucket.
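A quick example of the problem (the shift of 18 is taken from the paragraph
above; the bucket count of 32 is arbitrary):

	| mask rawHash shiftedHash |
	mask := 32 - 1.                  "32 buckets, always a power of two"
	rawHash := 2741.                 "some 12-bit identity hash"
	shiftedHash := rawHash bitShift: 18.
	rawHash bitAnd: mask.            "=> 21, spreads nicely over the buckets"
	shiftedHash bitAnd: mask.        "=> 0, every object lands in the first bucket"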
That is a very important thing to note. However, I also think it's a bit
strange to just bit-shift the identityHash by 18. You don't win anything
significant with it ... you have 12-bit hashes, and there is no way to improve
that (without changing the image format). Since we overfill our dictionaries,
however, this works quite OK even with those bad hashes; we only start wasting
a bit of space once you have added 20480 elements to the dictionary (you have
4096 different hashes with 12 bits, times 5 elements per bucket).