I actually forgot to attach the patch, so here it is. On 29 May 2016 at 22:27, lemonboy <[email protected]> wrote: > Hello, > While investigating the ticket #1293 [1] I noticed that the hashing > procedures would stop hashing all the vectors > at the 4th element and while that's a valuable way to speed up the > hashing process it causes some problems in > the wild. > > If you use a record with N > 4 slots as a key and then set! one of the > last slots after the 4th one the equal?-hash > procedure would return the same hash where equal? definitely fails. > > The attached patch fixes this problem by lifting the limit when > hashing tagged vectors aka records, the rationale > behind this decision is that the number of slots a record has is > usually very small and it's more likely to use a > record as a key instead of a huge vector. > > While inspecting this I also came across another issue, eq?-hash > crunches some plain values by just xor-ing them > with a random seed, while it calls hash-equal? for complex objects > like records/vectors, that's IMO quite a convoluted > way to implement it (and it might also be wrong as eq? performs a > shallow comparison only) that could be replaced by > stirring [2] the raw C_Word value as that's faster (and incidentally > also what guile does [3]). > (Rinse and repeat for the eqv? version) > > Cheers, > LemonBoy > > [3] > http://git.savannah.gnu.org/gitweb/?p=guile.git;a=blob;f=libguile/hash.c;h=d6ddb6b3b631f8f694171fe18eaa7dcb5325df1f;hb=HEAD > [2] https://gist.github.com/badboy/6267743 > [1] http://bugs.call-cc.org/ticket/1293
From b9e1b45232a7b8c73e8b726a8263225173626e05 Mon Sep 17 00:00:00 2001 From: LemonBoy <[email protected]> Date: Sun, 29 May 2016 12:03:11 +0200 Subject: [PATCH] Hash all the record slots in vector-hash To: [email protected]
Records are represented with tagged vectors and as such they are hashed by vector-hash that has a cutoff limit that's configurable using the *recursive-hash-max-length* parameter. Lift the restriction when the vector being hashed is a record so that all the slots are effectively hashed. --- srfi-69.scm | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/srfi-69.scm b/srfi-69.scm index bdc7f72..3594920 100644 --- a/srfi-69.scm +++ b/srfi-69.scm @@ -306,17 +306,20 @@ ; Recurse into some portion of the vector's slots (define (vector-hash obj seed depth start rnd) - (let ([len (##sys#size obj)]) - (let loop ([hsh (fx+ len (fxxor seed rnd))] - [i start] - [len (fx- (fxmax start (fxmin *recursive-hash-max-length* len)) start)] ) + (let* ((len (##sys#size obj)) + (struct? (##sys#generic-structure? obj)) + ; Make sure to hash all the slots when obj is a record + (max-length (if struct? len *recursive-hash-max-length*))) + (let loop ((hsh (fx+ len (fxxor seed rnd))) + (i start) + (len (fx- (fxmax start (fxmin max-length len)) start))) (if (fx= len 0) hsh (loop (fx+ hsh (fx+ (fxshl hsh 4) (recursive-hash (##sys#slot obj i) (fx+ depth 1) rnd))) (fx+ i 1) - (fx- len 1) ) ) ) ) ) + (fx- len 1)))))) ; Recurse into structured objects (define (recursive-hash obj depth rnd) -- 2.8.3
_______________________________________________ Chicken-hackers mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/chicken-hackers
