Watch out for cancellation effects... if the collections are 
sequenceable and the hash values of their contents are not well 
distributed, you may end up with performance issues because the hash 
values cancel each other because the hash function commutes.  Consider 
for example the hash value of the following collections:

#(1 2 3)
#(1 2 3 4 4)
#(1 2 3 5 5)
#(1 4 2 3 4)
#(5 4 3 2 1 4 5)

#(6)
#(1 2 3 4 5 6 5 4 3 2 1)

I am not familiar with your application, though, so perhaps the data 
does not cause these problems.

Andres.


Ralph Boland wrote:
> Food for thought:
>
> In my finite state machine generator I do a lot of
> hashing of sometimes large collections.
> For performance reasons I created hash value storing versions
> of the relevant collection classes.  Getting the hash value
> of the hashed version of a collection was then very fast
> (accessing an instance value) and the hash values were good.
>
> For example, for Array I created HashArray  The hash
> value of a HashArray was the XOR of the hash values of
> the elements of the HashArray.  Note that this meant
> that each time you changed the entry in a HashArray
> you had to update the hash value.
> So for HashArray    at: index put: object    would look something
> like:
>
>      hash := (hash xor: object hash) xor: (self at: index).
>      super at: index put: object.
>
> If you like living on the edge you could instead leave out updating
> the hash value until you put the HashArray object where its hash
> value would be needed (like a Set or Dictionary (key)).
> This is of course error prone but then
> putting collections in Sets or Dictionaries (keys)
> is inherently error prone anyway because modifying the collection
> changes its hash value thus breaking every Set or Dictionary
> containing the collection.
>
> I would be temped to have all real collection classes (e.g. not Interval)
> not be storable in Sets or Dictionaries (key) at all and instead require
> hash versions of the collection be used but of course this would
> surely break a lot of things.
>
> Ralph Boland
>
>
>   

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Reply via email to