[PATCHREF] Preliminary patch for hashtables

Steve Fink Thu, 25 Apr 2002 10:37:31 -0700

 - Renames KEY/KEY_PAIR to KEY/KEY_ATOM where KEY is now a linked list
   of KEY_ATOM
 - Reimplements hashes (based heavily on the implementation in key.c, but
   with the fancy multilevel stuff that I didn't understand removed).
 - Creates a UnionVal type that is used in place of the PMC cache union
   and the KEY_PAIR's union.
 - Adds a string_from_num() function to string.c.
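
To make the KEY/KEY_ATOM/UnionVal layout above concrete, here is a rough
sketch of how such types could fit together. This is not the code from the
patch: the field names, the KeyAtomType enum, and the key_append() and
key_free() helpers are all invented for illustration, and plain C types
stand in for the real string and PMC types.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for UnionVal: one union shared by every place
 * that needs "an integer, a number, or a pointer". */
typedef union {
    long        int_val;
    double      num_val;
    const char *string_val;   /* stand-in for a real string type */
    void       *pmc_val;      /* stand-in for a PMC pointer */
} UnionVal;

/* Hypothetical tag saying which member of the union is live. */
typedef enum { KEY_INT, KEY_NUM, KEY_STRING } KeyAtomType;

/* One component of a key. */
typedef struct KeyAtom {
    KeyAtomType     type;
    UnionVal        val;
    struct KeyAtom *next;
} KeyAtom;

/* A key is a singly linked list of atoms. */
typedef struct {
    KeyAtom *head;
    KeyAtom *tail;
    size_t   length;
} Key;

/* Append one atom to the end of the key; returns NULL if out of memory. */
static KeyAtom *key_append(Key *key, KeyAtomType type, UnionVal val)
{
    KeyAtom *atom;

    assert(key != NULL);
    atom = malloc(sizeof *atom);
    if (atom == NULL)
        return NULL;
    atom->type = type;
    atom->val  = val;
    atom->next = NULL;
    if (key->tail)
        key->tail->next = atom;
    else
        key->head = atom;
    key->tail = atom;
    key->length++;
    return atom;
}

/* Free every atom and reset the key to empty. */
static void key_free(Key *key)
{
    KeyAtom *atom = key->head;

    while (atom) {
        KeyAtom *next = atom->next;
        free(atom);
        atom = next;
    }
    key->head = key->tail = NULL;
    key->length = 0;
}
```

A two-part key would then be built by calling key_append() twice on a
zero-initialized Key, once per atom.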


The patch is large (58KB), so I'm not going to include it here. In
fact, after I finish it up (still need to update the key PDD and fix
one bit of memory management), I'm just going to commit it unless
someone objects. The current patch is available at
<http://foxglove.dnsalias.org/~sfink/hash-patch.txt>, if anyone wants
to look at it in detail.

It passes the existing t/pmc/perlhash.t test. It is purely a
string-based hash, which matches perl5 but will be insufficient for
perl6. Should be good enough for things like symbol tables, though.
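
With string-only keys, as in perl5, a numeric key has to be turned into a
string before it can be looked up. That may be one use for a function like
string_from_num(), though that link is an assumption. The sketch below uses
a made-up helper, num_to_key(), built on snprintf(); it is not the
string_from_num() from the patch, and the "%.15g" format is only one
plausible choice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a number as a hash key string.
 * Returns the key length, or -1 if the buffer is too small. */
static int num_to_key(double num, char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL && buflen > 0);
    /* %.15g drops trailing zeros, so 1.0 becomes "1". */
    n = snprintf(buf, buflen, "%.15g", num);
    return (n >= 0 && (size_t)n < buflen) ? n : -1;
}
```

With this formatting, the integer 1 and the number 1.0 end up as the same
key, which is the behaviour a string-only hash would want.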

Things I'd like comments on:

 - UnionVal name. Is that ok, and is that the correct cAPitaLiZAtion?
 - Do hashes need to support the sort of multilevel lookups that
   seemed to be in the original implementation? (I couldn't really
   follow it, so I'm not sure what it did. I finally understand why
   the KEY structure was so bizarre -- it seems to have been used for
   two very different purposes, and it was well-suited to hashtable use.)
 - How are references going to work? Like, say I have a reference to a
   hash's element. How does that work, and what restrictions does that
   place on doing things like resizing the hashtable? Will there be a
   particular reference type that is really just stored FETCH()
   parameters?
 - Damn, I can't remember the rest.
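
To illustrate the "stored FETCH() parameters" idea from the references
question, here is a small sketch. Everything in it is hypothetical: Table
is a growable array standing in for the real hash (growing it relocates
entries much as a hashtable resize would), and ElemRef, ref_get(), and
ref_set() are invented names. The point is only that a reference holding
(container, key) and redoing the lookup each time stays valid across a
resize, where a raw pointer to the element would not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *key;    /* borrowed pointer; the caller keeps it alive */
    int         value;
} Entry;

/* Stand-in container: entries move whenever the array grows. */
typedef struct {
    Entry  *entries;
    size_t  used;
    size_t  cap;
} Table;

/* Return a pointer to the value for key, or NULL if absent.
 * The pointer is only valid until the next table_store(). */
static int *table_fetch(Table *t, const char *key)
{
    size_t i;

    for (i = 0; i < t->used; i++)
        if (strcmp(t->entries[i].key, key) == 0)
            return &t->entries[i].value;
    return NULL;
}

static void table_store(Table *t, const char *key, int value)
{
    int *slot = table_fetch(t, key);

    if (slot) {
        *slot = value;
        return;
    }
    if (t->used == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 2;
        Entry *grown   = realloc(t->entries, new_cap * sizeof *grown);
        assert(grown != NULL);   /* sketch only: no real error handling */
        t->entries = grown;
        t->cap     = new_cap;
    }
    t->entries[t->used].key   = key;
    t->entries[t->used].value = value;
    t->used++;
}

/* A reference is just the saved parameters of a fetch. */
typedef struct {
    Table      *table;
    const char *key;
} ElemRef;

static ElemRef ref_make(Table *t, const char *key)
{
    ElemRef r;
    r.table = t;
    r.key   = key;
    return r;
}

/* Redo the lookup on every access; fall back if the key is gone. */
static int ref_get(ElemRef r, int fallback)
{
    int *p = table_fetch(r.table, r.key);
    return p ? *p : fallback;
}

static void ref_set(ElemRef r, int value)
{
    table_store(r.table, r.key, value);
}
```

The cost of this design is a full lookup on every dereference; the benefit
is that resizing places no restrictions on outstanding references.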

Known problems:
 - Hashes are never resized, even though they know how loaded they
   are. They just keep chaining on overflow buckets.
 - An empty hash takes up quite a bit of space.
 - Lightly tested.
 - The interface exposes a little too much (e.g. the HASHBUCKET type).
   I'm leaving it for now, because I'm still pondering how to support
   'each %hash' and lvaluable 'values %hash'.
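
For the first known problem, one common way to add resizing to a chained
hashtable is to double the slot array once the load passes a threshold and
relink the existing buckets. The sketch below shows that technique in
isolation. It is not the patch's code: the Hash and Bucket types, the
hash function, the int values, and the 0.75 threshold are all assumptions
made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Bucket {
    char          *key;
    int            value;
    struct Bucket *next;
} Bucket;

typedef struct {
    Bucket **slots;
    size_t   nslots;    /* must be a power of two in this sketch */
    size_t   entries;
} Hash;

/* Simple multiplicative string hash, chosen only for brevity. */
static size_t hash_str(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

static Hash *hash_new(size_t nslots)
{
    Hash *h = malloc(sizeof *h);
    assert(h != NULL);           /* sketch only: no real error handling */
    h->slots = calloc(nslots, sizeof *h->slots);
    assert(h->slots != NULL);
    h->nslots  = nslots;
    h->entries = 0;
    return h;
}

/* Double the slot array and relink every bucket into its new chain.
 * Buckets are moved between chains, never copied or reallocated. */
static void hash_grow(Hash *h)
{
    size_t   new_n     = h->nslots * 2;
    Bucket **new_slots = calloc(new_n, sizeof *new_slots);
    size_t   i;

    assert(new_slots != NULL);
    for (i = 0; i < h->nslots; i++) {
        Bucket *b = h->slots[i];
        while (b) {
            Bucket *next = b->next;
            size_t  idx  = hash_str(b->key) & (new_n - 1);
            b->next        = new_slots[idx];
            new_slots[idx] = b;
            b = next;
        }
    }
    free(h->slots);
    h->slots  = new_slots;
    h->nslots = new_n;
}

static int *hash_get(Hash *h, const char *key)
{
    Bucket *b = h->slots[hash_str(key) & (h->nslots - 1)];
    for (; b; b = b->next)
        if (strcmp(b->key, key) == 0)
            return &b->value;
    return NULL;
}

static void hash_put(Hash *h, const char *key, int value)
{
    int    *slot = hash_get(h, key);
    Bucket *b;
    size_t  idx;

    if (slot) {
        *slot = value;
        return;
    }
    if (h->entries + 1 > h->nslots * 3 / 4)   /* load factor above 0.75 */
        hash_grow(h);
    idx = hash_str(key) & (h->nslots - 1);
    b = malloc(sizeof *b);
    assert(b != NULL);
    b->key = malloc(strlen(key) + 1);
    assert(b->key != NULL);
    strcpy(b->key, key);
    b->value = value;
    b->next  = h->slots[idx];
    h->slots[idx] = b;
    h->entries++;
}
```

Because each bucket is allocated separately and only relinked on growth, a
pointer to a bucket's value stays valid across a resize in this particular
layout. That would not hold if buckets were stored inline in the slot
array.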
