Michael Rynn wrote:
On Mon, 22 Mar 2010 00:52:24 -0700, Walter Bright wrote:

bearophile wrote:
A way to know the usage patterns of D AAs is to add instrumentation in
nonrelease mode, they can save few statistical data once in a while in
the install directory of D, and then the user can give this little txt
file to Walter to tune the language :-)
There's no need to tune the language. The implementation and data
structure of the AAs is completely opaque to the language. The
implementation is in aaA.d. Feel free to try different implementations
in it!

Well, I have.

See the code here : http://www.dsource.org/projects/aa/browser/trunk/
druntime/aaA.d

See the benchmarks and comments here : http://www.dsource.org/projects/aa/
wiki/DrunTimeAA. The result squeezes some extra performance for integer or less sized keys (about 20% faster for lookups).

Insertions and cleanups are faster too.

It's a nice implementation. There are some problems with it, though:

1. Deleted entries cannot be returned to the garbage collector pool unless the entire hash table is removed.

2. All the keys are stored in a single array, as are all the values. This requires that a contiguous chunk of memory can be found for the arrays when rehashing to a larger hashmap. This could make it very difficult for this to work when hash tables grow to be of significant size relative to all available memory. Note that the current implementation has a bucket array that has as many entries as the number of elements in the hashtable, but the element size is only a pointer, not the arbitrary size of a key or value.

3. Rehashing (required as the hashmap grows) means that the key/value arrays must be reallocated and copied. That's fine for the keys, but for the values this is a huge problem because the user may have dangling pointers into the values. (This is why Andrei started the thread entitled "An important potential change to the language: transitory ref".) I think this is currently the killer problem for randAA's approach.

Reply via email to