On Saturday, 24 June 2023 at 12:05:26 UTC, Jonathan M Davis wrote:
On Saturday, June 24, 2023 1:43:53 AM MDT Cecil Ward via
Digitalmars-d-learn wrote:
On Saturday, 24 June 2023 at 07:36:26 UTC, Cecil Ward wrote:
> [...]
I just realised something, your point about altering the table
and having to rehash, is well taken. I hadn’t considered that.
The reason for my foolishness in failing to realise that I’m
asking the impractical is my pattern of usage. I add all the
entries into the mapping table and have no interest in any
lookups until it is fully built. Then a second function starts
to do lookups while the data remains unchanging and that usage
pattern can be guaranteed. I could even idup it if that would
help, as copying < 32 uints wouldn’t take forever. A typical
value would be a mere 5 or less. I only picked 32 to be
completely safely ott.
Well, if the key were a struct or a class, the hashing function
would be opHash. For built-in types, the runtime has hashing
functions that it uses. Either way, with AAs, you really don't
worry about managing the memory, because it's completely
outside of your control. You just put the elements in there
using their associated keys, and if you want to try to speed it
up after you've populated it, you use rehash so that the
runtime can try to move the elements around within the
container so that lookup speeds will be closer to optimal.
As such, for the most part, when dealing with AAs and worrying
about efficiency, the question really becomes whether AAs are
the correct solution rather than much of anything having to do
with how you manage their memory.
With so few elements, it's also possible that using
std.algorithm.searching.find would be faster - e.g. having a
dynamic array of strings where the matching int is at the same
index in a dynamic array of ints - or you could use
std.typecons.Tuple!(string, int)[] with something like
arr.find!(a => a[0] == key)() to find the tuple with the int
you want.
Simply comparing a small number of strings like that might be
faster than what goes on with hashing the string and then
finding the corresponding element within the AA - or it might
not be. You'd have to test that to know. The AA would
definitely be faster with a large number of elements, but with
a small number of elements, the algorithmic complexity doesn't
really matter, and the extra overhad with the AA lookups could
actually mean that the search through the dynamic array is
faster even though it's O(n). But you can only know which is
faster by testing it out with the actual data that you're
dealing with.
Regardless, you need to remember that associative arrays are
not arrays in the C sense. Rather, they're hash tables, so they
function very differently from dynamic arrays, and the rehash
function is the closest that you're going to get to affecting
how the elements are laid out internally or how much memory the
AA is using.
- Jonathan M Davis
I started out looking into a number of runtime library routines,
but in the end it seemed quicker to roll my own code for a crude
recursive descent parser/lexer that parses part of D’s grammar
for expressions, and (again partial grammar) parser for string
literal expressions and so on. I find certain special elements
and execute actions which involve doing the AA lookup and
replacing variable names with ordinal numbers in decimal in the
output stream. Admission: The parsing is the thing that has to be
fast, even though again the size of the D language text is not
likely to be huge at all. But 40 years ago, I came from a world
with 2k RAM and 0.9 MHz clock rates so I have developed a habit
of always thinking about speed before I do anything, needful or
not, to be honest. I once wrote a program that took 35 mins to
evaluate 2+2 and print out the answer, so I’m now ashamed of
writing slow code. Those were bad days, to be honest. 4 GHz+ and
ILP is nicer.