Robert Roessler:
I probably misunderstood your response a couple back in this thread:
"You are probably thinking of PropSet which should probably be
turned into a real hash table as the data sets it is used on are much
larger than it was originally written for."
Improving the hashing aspect of PropSet may be worthwhile but it
hasn't been a significant issue to me. Perhaps it is for others with
slower machines or more complex configurations. The small fixed size of
PropSet::props is more likely to be the limiting factor in its hashing
performance rather than the particular hash calculation performed.
The original HashString function body is 7 lines (too? is Python where
this came from?)...
No, I think the current one came from an '80s text book. Python's
is, as paraphrased in SinkWorld with UTF32 strings:
static int HashString(const int *s, int len) {
// Use the Python string hash algorithm
int ret = s[0] << 7;
for (int i=0; i<len; i++) {
ret *= 1000003;
ret ^= s[i];
}
ret ^= len;
return ret;
}
> 3 obvious responses:
1) is line-count a good determinant in evaluating hash algos?
2) source code size/layout may not be a good indicator of the length of
runtime code paths
Complexity for the code reader should be avoided. My reaction to the
Hsieh code was:
1) What's this preprocessor stuff? Is it alignment trickery?
2) It is reading 16 bit blocks - no, it is 32 bits at a time.
3) Is it reading past the end due to the blocking?
4) I hope those typedefs aren't duplicating stuff from system headers
leading to warnings when this is combined with other code.
After a while you can work the issues out but its not immediately
obvious. Extra complexity can be justified by performance and it would
be much more convincing if there was a measurement of either an overall
performance improvement (PropSet::Get 20% faster by changing hash
function) or something less direct (fewer empty buckets or shorter
chains on the distributed properties files or someone's particularly
large configuration).
Neil
_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest