This mail is quite long as I copied it from a conversation with Toon Verwaest. 
Toon and Camillo re-implemented hashed collections for Pinocchio 
(http://scg.unibe.ch/research/pinocchio). Their benchmarks have shown that the 
new implementation is significantly faster than the one in Pharo 1.1. The 
source code is available from www.squeaksource.com/p.html (they haven't set a 
license, but I'm pretty sure they'll release their code as MIT).

I'm posting this here hoping to find somebody knowledgeable in the area who can
take a look at the code and figure out whether this implementation would be
interesting for Pharo...

Cheers,
Adrian


quoting Toon:

We ran some benchmarks on our data structures (measured in, and compared
against, standard Pharo 1.1). We just add/remove integers within a certain
range, so the keys are always perfectly distributed, which makes this the ideal
case for Pharo's dictionaries.

The numbers show how long each benchmark took on average over 20 runs, together
with the standard deviation across those runs. A single benchmark run generally
consists of many individual operations on the collection in question. For
example:

benchIncludes
    1 to: set size * 2 do: [ :i |
        set includes: i ]

where the set has size 10000. Evaluating this once counts as a single run. The
number at the top of each group is the sum of the individual averages; the
group's deviation is the individual deviations combined (in quadrature, which
is what the stated totals work out to).
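The aggregation scheme can be sketched like this (a Python sketch of my own,
not the actual harness; checking against the tables below, the group totals
match summing the per-benchmark means and combining the deviations in
quadrature):

```python
import math
import statistics

def run_benchmark(bench, runs=20):
    """Run `bench` (a zero-argument callable returning elapsed
    milliseconds) `runs` times; return (mean, stdev)."""
    times = [bench() for _ in range(runs)]
    return statistics.mean(times), statistics.stdev(times)

def summarize(results):
    """Group total: sum of per-benchmark means; the combined deviation
    is the per-benchmark deviations added in quadrature, which is what
    the group headers in the tables below work out to."""
    total = sum(mean for mean, _ in results)
    sd = math.sqrt(sum(s * s for _, s in results))
    return total, sd
```

For example, feeding in the nine PDictionary rows below yields 15.7 +/-1.65,
matching the 15.8 +/-1.7 group header up to rounding.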

The results:

PDictionary: 15.8 +/-1.7
   Do                0.472 +/-0.079
   RemoveKey         0.010 +/-0.021
   AtPut             0.947 +/-0.02
   AtIfAbsentPut     1.64 +/-0.96
   AtPutExisting     0.198 +/-0.011
   KeysAndValuesDo   0.502 +/-0.09
   IncludesKey       0.365 +/-0.024
   AtPutNew          4.2 +/-1.3
   Includes          7.40 +/-0.33

Dictionary: 138.2 +/-4.7
   Do                0.830 +/-0.098
   RemoveKey         10.292 +/-0.077
   AtPut             0.957 +/-0.044
   AtIfAbsentPut     1.52 +/-0.95
   AtPutExisting     0.203 +/-0.011
   KeysAndValuesDo   0.850 +/-0.096
   IncludesKey       80.25 +/-0.25
   AtPutNew          3.5 +/-1.3
   Includes          39.8 +/-4.4

PSet: 1.767 +/-0.057
   Remove            0.008 +/-0.018
   Add               0.845 +/-0.039
   AddExisting       0.310 +/-0.021
   AddNew            0.240 +/-0.021
   Includes          0.365 +/-0.024

Set: 70.9 +/-1.2
   Remove            7.737 +/-0.051
   Add               0.755 +/-0.022
   AddExisting       0.305 +/-0.015
   AddNew            2.473 +/-0.03
   Includes          59.6 +/-1.2


Obviously the very slight differences have to be taken with a grain of salt,
but when something is more than 10x faster it's quite unlikely to be random
runtime variation that would change completely in another run.

An important property of our data structures is that they shouldn't degrade
under colliding hashes; but since the benchmark keys never collide we didn't
really test that yet (although the benefit already shows in some of the
results, such as removing elements and testing for presence).

> Is there anything special that one would need to consider when replacing the 
> old implementation in Pharo with yours?

What doesn't happen at the moment is shrinking dictionaries after they have 
grown significantly. I wouldn't know at what point to do so either; nor do I 
think Dictionary does that?

There is a parameter indicating how many elements can be added before it
switches from a SmallDictionary version to a dictionary with buckets. This is
set to 20 by default, and I have no idea if it makes sense to change it; I
didn't really profile what the optimal number is. Maybe it makes sense to make
that a constant in the code rather than an instance variable, to save a bit of
space and time ...
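The switch-over could look roughly like this (a hypothetical Python sketch; the
class names, the promotion mechanics, and returning the possibly-promoted
dictionary from `at_put` are my own illustration, not the Pinocchio code):

```python
THRESHOLD = 20  # elements addable before promotion (the default above)

class SmallDict:
    """Linear-scan dictionary: fast for small sizes, no hashing at all."""
    def __init__(self):
        self.keys, self.values = [], []

    def at_put(self, key, value):
        for i, k in enumerate(self.keys):
            if k == key:
                self.values[i] = value
                return self
        self.keys.append(key)
        self.values.append(value)
        if len(self.keys) > THRESHOLD:
            return BucketedDict.from_small(self)  # promote
        return self

class BucketedDict:
    """Bucketed dictionary for larger sizes."""
    def __init__(self, n_buckets=32):
        self.buckets = [[] for _ in range(n_buckets)]

    @classmethod
    def from_small(cls, small):
        d = cls()
        for k, v in zip(small.keys, small.values):
            d.at_put(k, v)
        return d

    def at_put(self, key, value):
        # n_buckets is a power of two, so masking selects the bucket.
        b = self.buckets[hash(key) & (len(self.buckets) - 1)]
        for i, (k, _) in enumerate(b):
            if k == key:
                b[i] = (key, value)
                return self
        b.append((key, value))
        return self
```

Callers would use the returned dictionary (`d = d.at_put(k, v)`), since an
insert may hand back the promoted bucketed version.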

We have our own PHashedCollection subclassing from Collection, so the biggest
part of the hierarchy is compatible with what Pharo has. While porting, the
Pinocchio-specific parts will have to be removed: mostly annotations related to
Pinocchio primitives and export directives. This is all straightforward though.

The current "ratio" of hashes to elements is set to 500%. So basically when you
have 32 buckets it will only grow when it contains 160 elements (or key-value
pairs), i.e. 5 elements per bucket on average. This doesn't seem to make it
slower, which is understandable since SmallDictionary is pretty fast for small
numbers of elements; but this ratio might have to be tweaked depending on the
kind of objects, and is currently a parameter. The advantage of using 500% is
that you have A LOT less garbage.
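As a sketch (hypothetical Python; the name `should_grow` and the exact boundary
are my guesses from the description above):

```python
RATIO = 5  # 500%: five elements per bucket on average before growing

def should_grow(n_elements, n_buckets):
    # Grow only once the dictionary averages RATIO elements per bucket,
    # e.g. 32 buckets hold up to 160 elements before growing. Whether
    # the boundary is inclusive is my assumption.
    return n_elements >= RATIO * n_buckets
```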

Oh, and our dictionaries generate a lot less garbage, since Pharo's
dictionaries use an association object for each key-value pair. We don't, so we
never generate garbage when you remove an object.
Our Sets on the other hand use a bit more memory because of the bucket scheme.
Sets don't use associations, so we don't have an edge there :) Only
performance-wise in that case.
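The difference can be illustrated like this (hypothetical Python; the names are
mine): Pharo keeps one Association object per entry, so removing an entry
orphans that object, while an association-free scheme stores keys and values
side by side and removal produces no dead objects:

```python
class Association:
    """Pharo-style entry: each key-value pair is a separate heap
    object, which becomes garbage when the entry is removed."""
    def __init__(self, key, value):
        self.key, self.value = key, value

# Association-based bucket: removal orphans an Association object.
assoc_bucket = [Association('a', 1), Association('b', 2)]

# Association-free bucket: keys and values in parallel arrays;
# removal just compacts the arrays, creating no garbage objects.
def flat_remove(keys, values, key):
    i = keys.index(key)
    del keys[i], values[i]
```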

The hashes are masked with the number of buckets - 1, since the number of
buckets is always a power of 2.
This implies that the lower bits of the hash are the significant ones.
This does not work together with the current identityHash / IdentityDictionary,
since the 12-bit hash is shifted to the left by 18 (or so). That makes all the
lower 18 bits 0, so all objects would end up in the first bucket.
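Taking the 18-bit shift at face value, the effect is easy to demonstrate
(hypothetical Python; `N_BUCKETS` and the function names are mine):

```python
N_BUCKETS = 32  # always a power of two

def bucket_index(hash_value):
    # Masking with n - 1 keeps only the low bits of the hash.
    return hash_value & (N_BUCKETS - 1)

def pharo_identity_hash(raw12):
    # A 12-bit hash (0..4095) shifted left by 18: the low 18 bits
    # are always zero, so masking always yields bucket 0.
    return raw12 << 18
```

Every one of the 4096 possible identityHash values lands in bucket 0 under
this masking scheme.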

That is a very important thing to note. However, I also think it's a bit
strange to just bitshift the identityHash by 18. You don't win anything
significant with it ... you have 12-bit hashes; no way to improve that (without
changing the image format). Since we overfill our dictionaries, however, this
works quite ok with those bad hashes; we'll only start wasting a bit of space
after you have added 20480 elements to the dictionary (you have 4096 different
hashes with 12 bits, * 5).



_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
