On 7/25/2013 1:00 PM, bearophile wrote:
It's not the hashing that's slow. It's the lookup that is.
By "different hashing scheme" I meant different strategies in resolving hash
collisions, likes double hashing, internal hashing, cuckoo hashing, and so on
and on. Maybe one of such alternative strategies is more fit for the needs of
dmd compilation. (I think that currently the Python dicts are using a hashing
strategy different from the built-in dictionaries of D. The Python style of
hashing was implemented in D some months ago, but I don't remember what happened
to that project later).
Hash collisions are not the problem - I sized the hash bucket array to make it
fairly sparse. Neither is the hash algorithm.
The slowness was in the frackin' "convert the hash to an index in the bucket",
which is a modulus operation.
Also, computing the hash is done exactly once, in the lexer. Thereafter, all
identifiers are known only by their handles, which are (not coincidentally) the
pointer to the identifier, and by its very nature is unique.