I want to equip my rsounds with a gen:equal+hash implementation, so that I can 
compare them using equal? and thereby use them in check-expect test cases.  I 
have to roll this myself, since s16vectors don't do non-eq? equal?. My 
question: is there an accepted hash function for an s16vector, or more 
generally, for a big block of memory? These vectors are relatively unlikely to 
be sparse--that is, have lots of zeros--so it seems like it would be reasonable 
just to take three or four values (in this case s16s) and just XOR them 
together.  

Taking a look at the behavior of vectors, though, it looks like *every* element 
is considered in computing the hash. This code:

#lang racket

(equal-hash-code (make-vector 100000 0))
(define v (make-vector 100000 0))
(vector-set! v 7982 1)
(equal-hash-code v)
(vector-set! v 7182 1)
(equal-hash-code v)
(vector-set! v 17982 1)
(equal-hash-code v)
(vector-set! v 55089 1)
(equal-hash-code v)

produces this output:

-1113903107084523788
369193263530696121
555229980507179346
2232258794360948312
-1281429753928160859

Which suggests that every change to the vector changes the result of the hash 
function. This seems... really expensive!  My current guess is that Racket uses 
a highly optimized (a.k.a. no safety checks) hash function that works over 
arbitrary blocks of data, but that the hash computation is still linear in the 
size of the data. I haven't tried it, but I would guess that if I wrote my own 
looks-at-every-value hash function, it would be a lot slower. Questions:

1) Am I guessing right?
2) Is this documented somewhere?
3) Is there a generic memory-hash function in the unsafe interface somewhere?
4) Does the hash function affect the time taken by 'equal?' -- i.e., the hash 
value is cached for faster equal? checking ?

Thanks!

John


_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev

Reply via email to