On Mar 15, 2007, at 19:24 , Mike C. Fletcher wrote:
Dan Williams wrote:
Hmm; right. I was thinking that would reduce the spread/effectiveness
of the hash algorithm, rather than allowing the algorithm to understand
the size required and possibly incorporate that in a better manner.
I appear to be overthinking. Bad Dan.
You already got here, but just to amplify your conclusion:
Python's dictionaries are extremely well optimised for most
reasonable hash table sizes (i.e. below millions, and they're even
reasonable at those sizes in my experience). Python uses
dictionaries *everywhere* internally, so they are given a lot of
attention by python-dev. Trying to code around them, e.g. by
computing hashes yourself, is almost always going to slow you down
compared to the highly optimised C code implementing the
dictionaries.
That is, if you have strings and want to index by them, just use the
strings as keys and move on to the next task. That isn't to say the
dictionary implementation is perfect, but it's pretty hard to beat
*in Python* without some rare and significant application-specific
restrictions.
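To make the advice concrete, here is a minimal sketch (the word list is
made up for illustration): just hand the strings to the dict and let the
C implementation do the hashing and caching for you.

```python
# Direct string keys: the dict computes (and the str object caches)
# the hash internally, so no manual hashing is needed or helpful.
counts = {}
for word in ["sugar", "olpc", "sugar"]:
    counts[word] = counts.get(word, 0) + 1
# counts == {"sugar": 2, "olpc": 1}
```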
The same largely holds true for lists. There are cases where you
might want to use a different implementation, but most of the time
you want to assume "use a list" and only replace with a different
implementation if you discover a problem with the list-based
implementation.
Well, what I read between Dan's lines was that he does not actually
have a dictionary, but wants to select one of 500 values based on a
string hash. Another use would be, for example, making a seed for a
random generator that would then produce a repeatable sequence,
perhaps to assign a color to a nickname.
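Both uses could be sketched like this (the function names, the bucket
count of 500, and the RGB color scheme are my assumptions, not anything
from Dan's code). A hashlib digest is used rather than the built-in
hash(), since the latter can differ across runs and platforms:

```python
import hashlib
import random

NUM_BUCKETS = 500  # hypothetical bucket count from the discussion

def string_bucket(s, n=NUM_BUCKETS):
    # A stable, well-spread hash: reduce a SHA-1 digest modulo n.
    digest = hashlib.sha1(s.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n

def nick_color(nick):
    # Seed a private Random instance with the stable hash, so every
    # machine derives the same repeatable sequence -- here, an RGB triple.
    rng = random.Random(string_bucket(nick, 2 ** 64))
    return tuple(rng.randint(0, 255) for _ in range(3))
```

Seeding a private random.Random (rather than the module-level generator)
keeps the repeatable sequence from disturbing other users of the random
module.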
- Bert -
_______________________________________________
Sugar mailing list
[email protected]
http://mailman.laptop.org/mailman/listinfo/sugar