- Renames KEY/KEY_PAIR to KEY/KEY_ATOM where KEY is now a linked list of KEY_ATOM
- Reimplements hashes (based heavily on the implementation in key.c, but with the fancy multilevel stuff that I didn't understand removed.)
- Creates a UnionVal type that is used in place of the PMC cache union and the KEY_PAIR's union.
- Adds a string_from_num() function to string.c.
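For readers without the patch open, here is a rough sketch of what a KEY built as a linked list of KEY_ATOMs sharing one UnionVal might look like. Only the names KEY, KEY_ATOM and UnionVal come from the description above. The field names, the ValType tag and the key_append()/key_clear() helpers are assumptions made up for illustration, and the real patch may be laid out quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tag for which UnionVal member is live (not from the patch). */
typedef enum { VAL_INT, VAL_NUM, VAL_STR } ValType;

/* One shared union instead of separate per-structure unions.
 * Member names are illustrative assumptions. */
typedef union UnionVal {
    long        int_val;
    double      num_val;
    const char *str_val;
} UnionVal;

/* A single component of a key. */
typedef struct KEY_ATOM {
    ValType          type;
    UnionVal         val;
    struct KEY_ATOM *next;   /* next atom in the key, or NULL */
} KEY_ATOM;

/* A key is a singly linked list of atoms. */
typedef struct KEY {
    KEY_ATOM *head;
    KEY_ATOM *tail;
    size_t    length;
} KEY;

/* Append one atom to the end of the key; returns NULL if allocation fails. */
static KEY_ATOM *key_append(KEY *key, ValType type, UnionVal val)
{
    KEY_ATOM *atom;

    assert(key != NULL);
    atom = malloc(sizeof *atom);
    if (atom == NULL)
        return NULL;

    atom->type = type;
    atom->val  = val;
    atom->next = NULL;

    if (key->tail != NULL)
        key->tail->next = atom;
    else
        key->head = atom;
    key->tail = atom;
    key->length++;
    return atom;
}

/* Free every atom and reset the key to empty. */
static void key_clear(KEY *key)
{
    KEY_ATOM *atom, *next;

    assert(key != NULL);
    for (atom = key->head; atom != NULL; atom = next) {
        next = atom->next;
        free(atom);
    }
    key->head = key->tail = NULL;
    key->length = 0;
}
```

With a layout like this, a two-part key would simply be two atoms appended in order, each carrying its own type tag.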
The patch is large (58KB), so I'm not going to include it here. In fact, after I finish it up (still need to update the key PDD and fix one bit of memory management), I'm just going to commit it unless someone objects. The current patch is available at <http://foxglove.dnsalias.org/~sfink/hash-patch.txt>, if anyone wants to look at it in detail.

It passes the existing t/pmc/perlhash.t test. It is purely a string-based hash, which matches perl5 but will be insufficient for perl6. Should be good enough for things like symbol tables, though.

Things I'd like comments on:

- UnionVal name. Is that ok, and is that the correct cAPitaLiZAtion?
- do hashes need to support the sort of multilevel lookups that seemed to be in the original implementation? (I couldn't really follow it, so I'm not sure what it did. I finally understand why the KEY structure was so bizarre -- it seems to have been used for two very different purposes, and it was well-suited to hashtable use.)
- how are references going to work? Like, say I have a reference to a hash's element. How does that work, and what restrictions does that place on doing things like resizing the hashtable? Will there be a particular reference type that is really just stored FETCH() parameters?
- Damn, I can't remember the rest.

Known problems:

- Hashes are never resized, even though they know how loaded they are. They just keep chaining on overflow buckets.
- An empty hash takes up quite a bit of space.
- Lightly tested
- The interface exposes a little too much (eg the HASHBUCKET type). I'm leaving it for now, because I'm still pondering how to support 'each %hash' and lvaluable 'values %hash'.
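To make the first known problem concrete, here is a minimal sketch of a string-keyed hash with a fixed bucket count that tracks its load but never resizes, so collisions just extend the chains. This is not code from the patch: the HashBucket and Hash layouts, NUM_BUCKETS, the hash function and all of the hash_*() helpers are assumptions for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 16   /* fixed for the life of the table: no resizing */

/* Hypothetical chain node; the patch's HASHBUCKET may look different. */
typedef struct HashBucket {
    char              *key;
    int                value;
    struct HashBucket *next;   /* overflow chain */
} HashBucket;

typedef struct Hash {
    HashBucket *buckets[NUM_BUCKETS];
    size_t      entries;       /* lets the table know how loaded it is */
} Hash;

/* Simple multiplicative string hash, chosen only for the sketch. */
static unsigned long hash_string(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert or overwrite; returns 1 on success, 0 if allocation fails. */
static int hash_put(Hash *h, const char *key, int value)
{
    unsigned long i;
    HashBucket *b;
    size_t len;

    assert(h != NULL && key != NULL);
    i = hash_string(key) % NUM_BUCKETS;

    for (b = h->buckets[i]; b != NULL; b = b->next) {
        if (strcmp(b->key, key) == 0) {
            b->value = value;
            return 1;
        }
    }

    b = malloc(sizeof *b);
    if (b == NULL)
        return 0;
    len = strlen(key) + 1;
    b->key = malloc(len);
    if (b->key == NULL) {
        free(b);
        return 0;
    }
    memcpy(b->key, key, len);
    b->value = value;
    b->next = h->buckets[i];   /* chain onto the front of the bucket */
    h->buckets[i] = b;
    h->entries++;
    return 1;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static const int *hash_get(const Hash *h, const char *key)
{
    const HashBucket *b;

    assert(h != NULL && key != NULL);
    for (b = h->buckets[hash_string(key) % NUM_BUCKETS]; b != NULL; b = b->next)
        if (strcmp(b->key, key) == 0)
            return &b->value;
    return NULL;
}

/* Average chain length; grows without bound because nothing resizes. */
static double hash_load(const Hash *h)
{
    assert(h != NULL);
    return (double)h->entries / NUM_BUCKETS;
}

/* Release every chain node and its copied key. */
static void hash_free(Hash *h)
{
    size_t i;

    assert(h != NULL);
    for (i = 0; i < NUM_BUCKETS; i++) {
        HashBucket *b = h->buckets[i];
        while (b != NULL) {
            HashBucket *next = b->next;
            free(b->key);
            free(b);
            b = next;
        }
        h->buckets[i] = NULL;
    }
    h->entries = 0;
}
```

Note that hash_get() hands back a pointer into a bucket. In a table like this one that pointer stays valid across inserts only because nothing is ever moved, which is one way to see why the reference question above interacts with resizing.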