Greetings! This is just a note for posterity on some of the recent learnings regarding hashing in GCL. Except for symbol lookups in packages, gcl uses 'open addressing' hash tables, as is customary for a one word stored 'value', i.e. a lisp object pointer. With this strategy, collisions result in a linear probe of the table until either an empty slot or the desired key is found. The negative performance effects of the collision therefore occur not only for keys hashing to the same index, but for those hashing to *adjacent* positions. Using a cons address directly as an index will thus suffer, as conses are frequently allocated in bulk in a contiguous region of memory.
The current solution relies on our 'universal' hash function, i.e. a 256 wide table of random 64bit numbers indirected by bytes of the key and xored together, to not only distribute the hashed items uniformly throughout the table, but also to minimize the possibility of consecutive indices. I think these criteria should be fulfilled at present, though there are explicit algorithmic solutions to the consecutive index problem which do not impinge on the hash function. These require more memory and complexity, and seem unwarranted at present. What we were seeing with the recent performance degradation was a strong dependence on the data layout as represented by the addresses in use. As an historical aside we should note that GCL did randomize these addresses in the past, but it was found to be 'too good' in the sense that there were apparently ideal memory layouts which could be indexed faster if the addresses were used directly and the randomization skipped. I believe this discussion was in an ACL2 context some years ago. This is just to note that we are now 'reverting that reversion', and opting for a good average performance which (should be) independent of data layout. Take care, -- Camm Maguire address@hidden ========================================================================== "The earth is but one country, and mankind its citizens." -- Baha'u'llah _______________________________________________ Gcl-devel mailing list Gcl-devel@gnu.org https://lists.gnu.org/mailman/listinfo/gcl-devel